Where CSM excels is with very large models or very large terrains that have no good scene-graph-based decomposition. For example, building ad-hoc shadows on terrain by decomposing sub-parts of the mesh works rather poorly - the decomposition really needs to be based on the view frustum (which is exactly what CSM does), not on world coordinates.
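To make "based on the view frustum" a bit more concrete, here is a rough sketch of how the frustum might be cut into depth slices, one per shadow map. Blending logarithmic and uniform spacing is a commonly used way to pick the split distances, but it is an assumption here, not necessarily what my tests use; the names cascade_splits and blend are made up for illustration.

```python
def cascade_splits(near, far, count, blend=0.5):
    """Return count + 1 view-space distances that bound `count` cascades.

    blend = 1.0 gives purely logarithmic spacing, blend = 0.0 purely uniform.
    (Hypothetical helper; the blend scheme is an assumption, not the post's.)
    """
    if near <= 0 or far <= near or count < 1:
        raise ValueError("need 0 < near < far and count >= 1")
    splits = [near]
    for i in range(1, count):
        t = i / count
        log_d = near * (far / near) ** t   # logarithmic spacing
        lin_d = near + (far - near) * t    # uniform spacing
        splits.append(blend * log_d + (1.0 - blend) * lin_d)
    splits.append(far)
    return splits


# Four cascades between 1 m and 1000 m from the camera.
print(cascade_splits(1.0, 1000.0, 4))
```

The point is that the slices follow the camera, so the near slices stay small (and sharp) no matter where in the world you are.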
One thing I haven't been able to quantify yet is what CSM costs in triangle count. If you look carefully at the CSM scheme, you'll see that at certain camera vs. sun angles, there can be significant overlap between the CSM volumes, and that means multiple iterations over the scene graph content that sits in the "shared" location. If the model in that location is expensive, this can be a performance problem.
(For example, if you work on, oh I don't know, a flight simulator, there is a chance that the user's airplane is significantly more expensive than anything else in the universe... if it spans several CSM volumes, you're going to feel the pain.)
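A back-of-the-envelope way to see that cost: count how many CSM volumes an object's footprint touches, and multiply by its triangle count. The sketch below treats each volume's light-space footprint as a plain 2-d rectangle, which is a simplification, and Rect, shadow_pass_triangles and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Axis-aligned footprint in light space (hypothetical simplification)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def overlaps(self, other):
        return (self.min_x <= other.max_x and other.min_x <= self.max_x and
                self.min_y <= other.max_y and other.min_y <= self.max_y)


def shadow_pass_triangles(obj_rect, obj_tris, cascade_rects):
    """Triangles submitted for one object across all shadow map passes."""
    hits = sum(1 for c in cascade_rects if c.overlaps(obj_rect))
    return hits * obj_tris


# Three overlapping volumes; the airplane straddles the first two.
cascades = [Rect(0, 0, 10, 10), Rect(8, 0, 30, 10), Rect(25, 0, 100, 10)]
airplane = Rect(7, 4, 12, 6)
print(shadow_pass_triangles(airplane, 100_000, cascades))  # 200000
```

So a 100,000 triangle airplane sitting in a shared region gets drawn twice, once into each shadow map that covers it.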
Finally, from what I can tell, while it is more efficient (fill-rate wise) to apply all CSM volumes at once, it is not strictly necessary from a quality standpoint - I'm not seeing a ton of artifacting from applying CSM volumes separately via stenciling.
(The artifacts would come from the overlap of lower and higher res shadow maps. What I have found is that if the CSM scheme uses enough splits to really look good, the overlap regions are small, and the lower and higher res shadow maps that overlap don't differ that much in quality.)
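For comparison, when all of the volumes are applied at once, each pixel typically picks a single shadow map, usually the highest-res one that covers it, so the overlap never shows up. A minimal sketch of that selection by view depth, assuming the maps are ordered nearest to farthest; pick_cascade and the split distances are made up for illustration.

```python
from bisect import bisect_left


def pick_cascade(view_depth, far_splits):
    """Index of the nearest (highest-res) cascade covering view_depth.

    far_splits holds each cascade's far distance, in ascending order.
    Returns None when the depth lies beyond the last cascade.
    """
    i = bisect_left(far_splits, view_depth)
    return i if i < len(far_splits) else None


far_splits = [10.0, 50.0, 200.0, 1000.0]   # hypothetical split distances
for depth in (5.0, 60.0, 2000.0):
    print(depth, pick_cascade(depth, far_splits))
```

With stenciling, the same decision is effectively made by which pass gets to a pixel first, which is why the two approaches can end up looking so similar.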
nVidia's CSM demo uses four shadow maps -- my tests required six at first, but upon examination, it looks like the closest and farthest maps are too close and too far to be useful, and could be dropped.