Monday, November 30, 2009

Per-Pixel Tangent Space Normal Mapping

This post assumes you know what tangent space normal maps are in general, and roughly how they work geometrically. If you try to code this, you'll hit one tricky problem: how do you convert your normal map from tangent to eye space? You need the normals in eye space to do lighting calculations.

The short answer is: a few lines of GLSL goo do the trick. Before explaining why those lines do what they do, we need to understand basis vectors.

Basis Vectors

Basically you can convert a vector from a source to a destination coordinate system like this:
new = old.x * X + old.y * Y + old.z * Z
where X, Y and Z are basis vectors - that is, X, Y and Z are the axes of the old coordinate system expressed in the units of the new coordinate system. If you know where the old coordinate system's axes lie in terms of the new coordinate system, you can convert.

(A side note: the basis vectors form the columns of a matrix that performs this transformation, e.g.
[ X_x Y_x Z_x ]
[ X_y Y_y Z_y ]
[ X_z Y_z Z_z ]
What does X_y mean? That is the y component of the X axis of the old coordinate system, measured in the new coordinate system. Because the coordinate systems are not aligned, the old X axis may be a "mix" of the new coordinate system's axes.)
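Here is how that looks in GLSL, as a minimal sketch. The function name change_basis is made up for illustration; the only built-in it relies on is the mat3 constructor, which takes three vec3s as its columns.

```glsl
// Hypothetical helper (not from any library): converts "old" into the new
// coordinate system, given the old system's axes X, Y, Z expressed in the
// new coordinate system.
vec3 change_basis(vec3 old, vec3 X, vec3 Y, vec3 Z)
{
    // mat3(X, Y, Z) uses X, Y and Z as the matrix columns, so this
    // multiply expands to old.x * X + old.y * Y + old.z * Z.
    return mat3(X, Y, Z) * old;
}
```

If X, Y and Z are (1,0,0), (0,1,0) and (0,0,1), the two coordinate systems are aligned and the vector comes back unchanged.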

I strongly recommend taking the time to deeply understand basis vectors. Once you understand them, OpenGL matrices start looking like a geometric shape instead of a blob of 16 random numbers.

So if we want to convert our normal map from tangent space to eye space, we need basis vectors - that is, we need to know where the S, T and N axes are in terms of eye space.

(A side notation note: typically these are known as the Tangent, Bitangent and Normal vectors for a TBN matrix. When I say "S" and "T" I mean the direction on your model that matches the horizontal axis of your texture and vertical axis of your texture.)

We already know "N" - it's our normal vector, and we usually have that per vertex. Where do we get S & T?
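Getting N (and the eye space position we will need shortly) to the fragment shader is the easy part. Here is a sketch of a vertex shader that does it, assuming old-style fixed-function inputs like the rest of this post. The varying names position_eye and normal_eye are my own choices; use whatever your shaders already call them.

```glsl
#version 120

// Assumed names: these varyings are read by the fragment shader.
varying vec4 position_eye;  // vertex position in eye space
varying vec3 normal_eye;    // per-vertex normal in eye space

void main()
{
    // The model-view matrix takes object space to eye space.
    position_eye = gl_ModelViewMatrix * gl_Vertex;

    // gl_NormalMatrix is the inverse transpose of the upper 3x3 of the
    // model-view matrix, which is what normals need.
    normal_eye = gl_NormalMatrix * gl_Normal;

    // Pass the UV coordinates through untouched.
    gl_TexCoord[0] = gl_MultiTexCoord0;

    gl_Position = gl_ProjectionMatrix * position_eye;
}
```

The interpolated normal is generally no longer unit length by the time it reaches the fragment shader, so normalize it there before using it.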

The Derivative Functions

GLSL provides derivative functions dFdx and dFdy. These functions return the change in value of an expression over one pixel on the screen horizontally and vertically. You can pass anything into them. If you pass in a texture coordinate, you can find out how much your UV texture will change as you move right or up one pixel. If you pass in a location, you can find out how much your location will change per pixel. If we pass in a vector, the "derivative" is performed on each component.

My understanding is that on some hardware, pixel shaders are run on a block of 2x2 pixels. The same instruction is run 4 times for 4 sets of input data. When the shader hits the "derivative" function, it simply takes the difference between the values computed for neighboring pixels in the block to find the deltas.

The key points about the derivative functions: we can apply them to anything, and they take derivatives along the X and Y axis in screen space.
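A quick way to get a feel for these functions is to draw their output. This is a debugging sketch only: the scale factor of 100.0 is an arbitrary number I picked so the result is visible, and you will probably need to tune it for your own mesh and resolution.

```glsl
#version 120

void main()
{
    // How much the UV coordinates change per pixel, going right and going up.
    vec2 duv_dx = dFdx(gl_TexCoord[0].st);
    vec2 duv_dy = dFdy(gl_TexCoord[0].st);

    // Brighter means the texture is being traversed faster on screen
    // (far away or steeply angled surfaces).
    float rate = length(duv_dx) + length(duv_dy);

    // 100.0 is an arbitrary visualization scale, not a meaningful constant.
    gl_FragColor = vec4(vec3(rate * 100.0), 1.0);
}
```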

Solving Simultaneous Equations

From our discussion of basis vectors we know that the S and T vectors basically do this:
dx = ds * Sx + dt * Tx
dy = ds * Sy + dt * Ty
dz = ds * Sz + dt * Tz
In other words, they map from UV coordinates to eye space.

Don't we need three basis vectors? Well, yes. The third component would be dn * Nx, etc. But our UV map is effectively flat - that is, its "N" coordinate is always 0. So we will drop out the third basis for now. (This should be expected - the normal vector is defined as perpendicular to our mesh, and the UV map lies entirely on our mesh.)

So...we have 3 equations, each with 2 unknowns. For each one, if only we had 2 sets of values (e.g. 2 sets of dx, ds, dt), we could solve for our basis vectors.

Well, we can! If we take the X derivative (dFdx) of our texture coordinates and eye space mesh location, we get: Q.x (the change in the eye space X position), ST.s (the change in the S coordinate of our UV map) and ST.t (the change in the T coordinate of our UV map). And we can do this twice, using dFdx and dFdy, to get two separate sets of values.
vec3 q0 = dFdx(position_eye.xyz);
vec3 q1 = dFdy(position_eye.xyz);
vec2 st0 = dFdx(gl_TexCoord[0].st);
vec2 st1 = dFdy(gl_TexCoord[0].st);
Now we have a set of 6 constants to plug in to solve our two unknowns:
dx = ds * Sx + dt * Tx
With the substitution (q0 and q1 are our position derivatives from dFdx and dFdy, st0 and st1 our UV derivatives):
q0.x = st0.s * Sx + st0.t * Tx
q1.x = st1.s * Sx + st1.t * Tx
We can solve this to come up with:
Sx = ( q0.x * st1.t - q1.x * st0.t) / (st1.t * st0.s - st0.t * st1.s)
Tx = (-q0.x * st1.s + q1.x * st0.s) / (st1.t * st0.s - st0.t * st1.s)
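If you want to check the algebra, here is the elimination written out for the X components. The shorthand is mine, not standard notation: s_0, t_0 stand for st0.s, st0.t; s_1, t_1 stand for st1.s, st1.t; and D is the shared denominator.

```latex
\begin{aligned}
q_{0x} &= s_0 S_x + t_0 T_x \\
q_{1x} &= s_1 S_x + t_1 T_x \\[4pt]
% Multiply the first equation by t_1, the second by t_0, and subtract:
q_{0x} t_1 - q_{1x} t_0 &= (s_0 t_1 - t_0 s_1)\, S_x \\
% Multiply the second equation by s_0, the first by s_1, and subtract:
q_{1x} s_0 - q_{0x} s_1 &= (s_0 t_1 - t_0 s_1)\, T_x \\[4pt]
D &= t_1 s_0 - t_0 s_1 \\
S_x &= \frac{q_{0x} t_1 - q_{1x} t_0}{D}, \qquad
T_x = \frac{-q_{0x} s_1 + q_{1x} s_0}{D}
\end{aligned}
```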
The same trick can be pulled for the Y and Z axes - we'll use the same input st0 and st1 derivatives but the y or z components of our eye space position derivatives. The resulting six values Sx Sy Sz, Tx Ty Tz are the tangent space basis vectors S and T, expressed in eye space.
vec3 S = ( q0 * st1.t - q1 * st0.t) / (st1.t * st0.s - st0.t * st1.s);
vec3 T = (-q0 * st1.s + q1 * st0.s) / (st1.t * st0.s - st0.t * st1.s);
Cleaning It Up

What units are our newly formed basis vectors in? Well, they're in the right units to rescale from UV map units (1 unit = one repetition of the texture) to our eye space units (whatever the hell they are). The main thing to note here is that S and T are almost certainly not unit vectors, but if we are going to convert a normal map from tangent to eye space, we need unit vectors. So we are going to normalize S and T.

Now we can make an observation: the denominator of S and T (st1.t * st0.s - st0.t * st1.s) is the same for S and T and more importantly the same for all three axes. In other words, that divisor is really a constant scaling term.

Well, if we are going to normalize anyway, we don't really need a constant scaling term at all. Nuke it!
vec3 S = normalize( q0 * st1.t - q1 * st0.t);
vec3 T = normalize(-q0 * st1.s + q1 * st0.s);
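To show where these two lines end up, here is a sketch of a complete fragment shader built around them. Treat it as an illustration under assumptions rather than production code: the sampler name normal_map and the varyings position_eye and normal_eye are names I picked, the normal map is assumed to store normals packed into the 0..1 range, and the lighting is a bare diffuse term from light 0, treated as a directional light.

```glsl
#version 120

uniform sampler2D normal_map;   // assumed name: tangent space normal map
varying vec4 position_eye;      // assumed name: eye space position
varying vec3 normal_eye;        // assumed name: eye space vertex normal

void main()
{
    // Screen-space derivatives of eye space position and UV coordinates.
    vec3 q0  = dFdx(position_eye.xyz);
    vec3 q1  = dFdy(position_eye.xyz);
    vec2 st0 = dFdx(gl_TexCoord[0].st);
    vec2 st1 = dFdy(gl_TexCoord[0].st);

    // Tangent space axes expressed in eye space.
    vec3 S = normalize( q0 * st1.t - q1 * st0.t);
    vec3 T = normalize(-q0 * st1.s + q1 * st0.s);
    vec3 N = normalize(normal_eye);

    // Unpack the texel from 0..1 into a -1..1 tangent space vector.
    vec3 n_tangent = texture2D(normal_map, gl_TexCoord[0].st).xyz * 2.0 - 1.0;

    // mat3(S, T, N) has S, T, N as columns, so this computes
    // n_tangent.x * S + n_tangent.y * T + n_tangent.z * N.
    vec3 n_eye = normalize(mat3(S, T, N) * n_tangent);

    // Minimal diffuse term; assumes light 0 is directional (w == 0),
    // in which case its position.xyz is a direction in eye space.
    vec3 L = normalize(gl_LightSource[0].position.xyz);
    float diffuse = max(dot(n_eye, L), 0.0);

    gl_FragColor = vec4(vec3(diffuse), 1.0);
}
```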
When Does This Break Down

You may know from experience that tangent space normal mapping will fail if the author's UV map is zero or one dimensional (that is, the author mapped a triangle in the mesh to a line or point on the UV map). From this equation we can at least start to see why: our denominator was
st1.t * st0.s - st0.t * st1.s
When is this zero? It is zero when:
st0.t / st0.s = st1.t / st1.s
Which is to say: if the direction of the UV map derivative is the same whether we go up the screen or to the right on the screen, our UV map is degenerate and we're not going to get a sane answer.
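If you would rather detect this case than light with garbage, one option is to compute the denominator anyway and test it. The sketch below is my own suggestion, not something from a library: the function name tangent_basis is made up, and EPSILON is a placeholder you would have to tune, since per-pixel UV derivatives are tiny to begin with. It also keeps the sign of the denominator, because dividing by a negative number flips a vector and normalize() alone will not reproduce that flip.

```glsl
// Placeholder threshold, an assumption to tune for your content.
const float EPSILON = 1e-12;

// Hypothetical helper: returns false if the UV derivatives are degenerate,
// otherwise writes unit-length S and T (in eye space) and returns true.
bool tangent_basis(vec3 q0, vec3 q1, vec2 st0, vec2 st1,
                   out vec3 S, out vec3 T)
{
    S = vec3(0.0);
    T = vec3(0.0);

    // The denominator shared by S and T.
    float det = st1.t * st0.s - st0.t * st1.s;
    if (abs(det) < EPSILON)
        return false;   // UV derivatives are (nearly) parallel or zero

    // Only the sign of the denominator survives normalization.
    float s = sign(det);
    S = normalize(( q0 * st1.t - q1 * st0.t) * s);
    T = normalize((-q0 * st1.s + q1 * st0.s) * s);
    return true;
}
```

When this returns false, a reasonable fallback is to skip the normal map for that pixel and light with the plain interpolated normal.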

1 comment:

  1. Thanks, it's really the fastest and easiest way to get correct normal maps :) !
