Friday, April 23, 2010

CGAL: It's All About the Mantissa

In a past post I described CGAL as having no rounding errors. It does this by using number types of variable size (using dynamically allocated memory per number!) so that it never runs out of digits. (It also maintains the numerator and denominator of fractions separately to avoid problems with repeating decimals.)

The advantage of this is that geometric algorithms that rely on precise calculations never go haywire due to rounding errors. For example, when using fixed-precision math (e.g. IEEE floats) the intersection of two near-parallel lines will be calculated inaccurately - sometimes with the intersection showing up miles from the original lines. CGAL always has more precision, so it avoids this problem.

But there is one down-side: when you perform a series of intersections, the result is exact numbers whose mantissas (the number of actual digits) have grown very long. And CGAL won't blink about making them even longer as you do more calculations.

Instead CGAL will become insanely slow.

I hit this case the other day. The first piece of processing I do is to combine a whole pile of vector data from OSM into one integrated map. While OSM is not particularly high precision (from a bits standpoint) the resulting intersecting points are calculated "perfectly", sometimes with very large mantissas.

I then wrote a piece of code to take a city block from that OSM map and perform some calculations to find the sidewalk calculation. The problem: the four corners of the city block were already very long numbers since they were the result of a CGAL calculation. Thus a long calculation on a long calculation becomes very slow.

The original algorithm took about 36 minutes for a fully optimized build to find all sidewalks in San Diego. That is way too slow, and unusable for our project.

I the put a rounding stage in: fore each corner of the block, I would convert it to a regular 64-bit IEEE float and then back to CGAL, throwing out any "extra" precision that CGAL was saving. Note that the 64-bit float already gives me better than 1 millimeter precision, which is more than overkill for a road. The algorithm run on the "simplified" data ran in 67 seconds.

Now there is one danger: if, due to mismatched road locations in OSM or conflicting edits, some of the "blocks" were really tiny (less than 1 mm) CGAL would have correctly built that block using infinite precision, and my "rounding" would have incorrectly reshaped those blocks, perhaps turning them inside out or in some other way damaging them.

So a necessary step to productizing this 'resolution reduction' is to do a sanity check on each resulting block. Fortunately most of the time if the block contains too-small-to-use data, we don't need the data in the first place.

Wednesday, April 21, 2010

Constitutional Opposition

One part of this post by Daring Fireball on the iPhone SDK licensing agreement made me chuckle:
If you are constitutionally opposed to developing for a platform where you’re expected to follow the advice of the platform vendor, the iPhone OS is not the platform for you. It never was. It never will be.
It inspired me to come up with a new quotable:
If you are constitutionally opposed to developing for a platform where you’re expected to follow the advice of the platform vendor, you should not be a computer programmer.
See also basically every post by Raymond Chen: "just because, in Win98SE2, you could call SomeRandomWin32API with a combination of NULL, -1, and Bill Gate's IQ and get an undocumented behavior that violates all of Microsoft's guidelines for applications development doesn't mean it will continue to work in Windows 7."

Thank You Jeeves, That Will Be All

The other day I went in to discover why a new piece of scenery code had mysteriously stopped working. Eventually I came to this:
Ah! Now it all makes sense. The code should have read:
After having done a global search, clearly I had hit the space bar by accident, nuking my function call. The charming thing is that C++ doesn't question why I have a giant list of paranthetical "stuff", it just blissfully compiles it into an expression that does...well, pretty much nothing.

Some of my other favorite C++ isms:
case a: do_it(); break;
b: do_x(); break; // no case, not illegal - now "b'' is a label!
defaultl: do_more(); break; // typo in default? That's a label too!
Of course we are all familiar with the fun that emerges from swapping = and ==. And having a stray semi-colon never hurt anything.

Propsman had an apt characterization: C++ is like an overly polite butler. "A...hamburger on the rocks, Sir? Certainly, Sir, I'll bring you one directly..."