Basically, the errors in shadow maps that cause "acne" (that is, surfaces shadowing themselves, but only sometimes) come from two sources:
- Numeric (e.g. you ran out of bits).
- Geometric (e.g. the shadow map only stores one depth per pixel).
If you pick a bias that only covers the numeric error, you'll find that it's not nearly enough offset. Why? Well, the geometric error is the answer.
When you do a shadow map compare, the depth you read is the depth at the center of the shadow map pixel. But because you are projecting the texture, the point you want usually isn't at the center of that pixel. The shadow map treats each pixel as a small flat surface perpendicular to the sun, so if the real surface is sloped, the stored depth gets more wrong as the distance from the shadow map pixel center to the point you wanted increases, and you need more bias.
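To put rough numbers on that, here is a minimal Python sketch of the geometric error. The function name, the world-space units and the flat-surface model are my assumptions for illustration, not something taken from a real renderer: the error is modeled as the sideways distance from the pixel center times the tangent of the surface's slope relative to the "perpendicular to sun" plane.

```python
import math

def geometric_depth_error(texel_size, offset_in_texels, slope_angle_deg):
    """Depth gap between a sloped surface and the flat texel that stores it.

    texel_size:       world-space width of one shadow map pixel (assumed units)
    offset_in_texels: distance from the pixel center to the sampled point
                      (0.5 is the worst case along one axis)
    slope_angle_deg:  angle between the real surface and the plane
                      perpendicular to the sun (0 = facing the sun head-on)
    """
    lateral_distance = texel_size * offset_in_texels
    return lateral_distance * math.tan(math.radians(slope_angle_deg))

# The same half-pixel miss costs more and more depth as the surface tilts.
for angle in (0, 30, 60, 80):
    print(angle, geometric_depth_error(0.1, 0.5, angle))
```

A surface facing the sun head-on has no geometric error at all, while a steeply sloped one can be off by several pixel widths' worth of depth.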
(Note that this means that the lower the horizontal/vertical resolution of your shadow map, the larger this depth-bias problem becomes. Also note that as the Z-buffer distance of your occluder increases - that is, as you shrink the near and far clip planes of the sun to "hug" the occluder - the depth slope per pixel increases and this problem becomes worse too! This is counter-intuitive: you would not expect to need a larger constant bias because of a more precise shadow map.)
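Both effects fall out of a little arithmetic. The sketch below is hypothetical (the function and its parameters are mine) and assumes an orthographic sun projection where the depth buffer runs linearly from 0 at the near plane to 1 at the far plane. It converts the worst-case half-pixel geometric error into depth-buffer units.

```python
import math

def bias_in_depth_units(world_extent, resolution, near, far, slope_angle_deg):
    """Worst-case geometric error in depth-buffer units (0..1).

    Assumes an orthographic light whose depth is linear between near and far.
    world_extent is the width of the area the shadow map covers.
    """
    texel_size = world_extent / resolution
    world_error = 0.5 * texel_size * math.tan(math.radians(slope_angle_deg))
    return world_error / (far - near)

base = bias_in_depth_units(100.0, 2048, 0.0, 1000.0, 60.0)
low_res = bias_in_depth_units(100.0, 1024, 0.0, 1000.0, 60.0)
tight = bias_in_depth_units(100.0, 2048, 450.0, 550.0, 60.0)

print("half the resolution:", low_res / base)      # bias needed roughly doubles
print("10x tighter clip planes:", tight / base)    # bias needed grows roughly 10x
```

The world-space error doesn't change when you tighten the clip planes, but it now covers a bigger share of the depth range, so a constant bias expressed in depth-buffer units has to grow.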
And that's why polygon offset is useful: the bias you need goes up as your occluder starts to slope away from the sun. That is the case where (when self-shadowing) the occluder slopes away from the sun in your real model, but the shadow map has a "perpendicular to sun" surface over the span of a single pixel. The more sloped, the more bias - perfect for polygon offset.
The big win of polygon offset over a constant bias is that it lets you reduce the amount of bias when the occluder is perpendicular to the sun. This is the case where the geometric error is almost zero, and a very small constant bias will be adequate.
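For reference, OpenGL defines the polygon offset as factor times the polygon's maximum depth slope, plus units times r, where r is the smallest depth difference the implementation can resolve. Here is a small Python model of that formula. The value of r (a 24-bit fixed-point depth buffer) and the factor/units settings are assumptions picked for illustration only.

```python
R_24BIT = 2.0 ** -24  # assumed smallest resolvable depth step (implementation-specific)

def polygon_offset(max_depth_slope, factor, units, r=R_24BIT):
    """Depth offset as OpenGL's glPolygonOffset defines it.

    max_depth_slope: the polygon's depth change per pixel, roughly
                     max(|dz/dx|, |dz/dy|)
    factor, units:   the two arguments passed to glPolygonOffset
    """
    return factor * max_depth_slope + units * r

# A polygon facing the sun gets only the tiny constant term;
# a steep one gets a bias that scales with its slope.
for slope in (0.0, 0.001, 0.01):
    print(slope, polygon_offset(slope, factor=2.0, units=4.0))
```

The units term plays the role of the small constant bias for the numeric error, and the factor term tracks the geometric error.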
As a final note, a number of web pages and discussion forums refer to the "non-linear" nature of the Z-buffer. For integer Z-buffers, this is only true when you use a frustum projection! The non-linearity comes from the matrix transform, not the Z-buffer itself; what you're seeing is Z precision getting smaller for far away objects, just like pixel size gets smaller for far away objects.
What this means is that while you're very likely to have problems with shadow maps based on the depth buffer (that is, post-perspective-divide Z) when the shadow map uses a frustum, it is not an issue at all for orthographic projection. In other words, there shouldn't be a problem with the sun, but there will be with spot lights.
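You can see the difference by computing the depth values directly. This sketch assumes the standard OpenGL-style projection matrices and a 0..1 depth range; the function names and the near/far values are mine, chosen for illustration.

```python
def perspective_depth(d, near, far):
    """Depth-buffer value (0..1) at eye distance d under a frustum projection."""
    z_ndc = (far + near) / (far - near) - (2.0 * far * near) / ((far - near) * d)
    return 0.5 * z_ndc + 0.5

def orthographic_depth(d, near, far):
    """Depth-buffer value (0..1) at eye distance d under an orthographic projection."""
    return (d - near) / (far - near)

near, far = 1.0, 1000.0

# How much of the depth range does one unit of distance use up,
# right at the near plane versus right at the far plane?
for name, depth in (("frustum", perspective_depth), ("ortho", orthographic_depth)):
    near_step = depth(near + 1.0, near, far) - depth(near, near, far)
    far_step = depth(far, near, far) - depth(far - 1.0, near, far)
    print(name, near_step, far_step)
```

With the frustum, the first unit of distance eats about half of the depth range and the last unit gets almost nothing; with the orthographic projection, every unit gets the same share.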