I tested the 3.33GHz 6-core with two different memory configs, running a 64-bit memory stress test on each.
4x8G = 32G = 10.5 GB/s
3x8G = 24G = 14.3 GB/s (or 36% faster)
However, when I ran an After Effects CS5 project render using 12 cores and 1.5G per core, the 32G config took 72 sec and the 24G config took 71 sec. In other words, if there is a difference in real-world performance, you would need a stopwatch to detect it.
Why? Because few if any real-world apps use the full memory bandwidth. I'm still testing. If I find any app that benefits from the 3-stick "triple channel" config, I'll report it here.
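For anyone who wants to check the arithmetic, here is a quick Python sketch using only the four numbers quoted above. The function and variable names are made up for illustration; nothing here is a new measurement.

```python
def percent_faster(new, old):
    """Relative gain of `new` over `old` in percent, where higher is better."""
    return (new - old) / old * 100.0

# Stress test bandwidth in GB/s, as posted above
bw_4x8, bw_3x8 = 10.5, 14.3
# After Effects render times in seconds, as posted above
t_4x8, t_3x8 = 72.0, 71.0

bandwidth_gain = percent_faster(bw_3x8, bw_4x8)
# For render times lower is better, so measure how much shorter the 24G run was
render_gain = (t_4x8 - t_3x8) / t_4x8 * 100.0

print(f"stress test: 3x8G is {bandwidth_gain:.1f}% faster")
print(f"render: 3x8G is {render_gain:.1f}% shorter")
```

The stress test gap rounds to the 36% quoted above, while the render gap works out to under 2%, which is the "stopwatch" difference.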
I'm interested in seeing some memory speeds for six and eight 8GB DIMMs on a dual-processor system, if you could do that. Eight 8GB DIMMs should not be able to run at 1333MHz, as they are registered rather than unbuffered. While it won't much matter to anyone who needs that much memory, it would be nice to have it recorded from a respected source.
edit: Actually, I'm wondering if four 8GB RDIMMs knock the speed down to 1066MHz too on the single-processor system.
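To put a rough number on what a drop from 1333 to 1066 would cost on paper, here is a small Python sketch of theoretical peak bandwidth. It assumes the 1066 and 1333 figures are mega-transfers per second (the usual convention for DDR3 speed ratings) and that each channel is 64 bits wide. The helper name is made up for illustration, and these are paper ceilings, not what a stress test will report.

```python
BUS_BYTES = 8  # a DDR3 channel is 64 bits wide, so 8 bytes per transfer

def peak_gb_per_s(mega_transfers, channels=1):
    """Theoretical peak bandwidth in GB/s for a given DDR3 speed rating."""
    return mega_transfers * 1e6 * BUS_BYTES * channels / 1e9

for speed in (1066, 1333):
    single = peak_gb_per_s(speed)
    triple = peak_gb_per_s(speed, channels=3)
    print(f"DDR3-{speed}: {single:.2f} GB/s per channel, {triple:.2f} GB/s over three channels")

# Fraction of the 1333 ceiling lost if the DIMMs fall back to 1066
drop = (1 - peak_gb_per_s(1066) / peak_gb_per_s(1333)) * 100.0
print(f"falling back to 1066 lowers the ceiling by about {drop:.0f}%")
```

Whether a given set of RDIMMs actually falls back to 1066 is still something only a test on the real hardware can confirm.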