I think there's no doubt about that. But in no way does that mean that everyone should conclude that 3 sticks are better than 4 in their own case.
Let's say that populating the 4th slot reduces the bandwidth by around 25% because it drops from 3-way to 2-way interleave when your main-memory subsystem is running flat-out. So?
How often is your memory subsystem running flat out, given the unique, new, and relatively enormous cache system of the Nehalem/Westmere architecture? How much is that drop reflected in your real overall system performance? How much is it offset in your particular system by having an extra 4GB of RAM? I assure you that for every synthetic benchmark you can come up with showing a slowdown of something that basically never happens in practice (your super-high-speed memory system getting saturated), Kingston and Crucial can come up with 5 more showing that the extra 4GB in the fourth slot improves your performance on XYZ applications even though it cuts down your interleave factor.
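To put a rough number on "how much is that drop reflected in overall performance", here's a back-of-the-envelope sketch. It's my own simplification, not a measurement: it assumes run time splits cleanly into a part limited by main-memory bandwidth and a part that isn't, and the function name and the sample fractions are made up for illustration.

```python
def overall_slowdown(mem_bound_fraction, bandwidth_drop):
    """Estimate the relative increase in total run time.

    mem_bound_fraction: share of run time limited by main-memory
        bandwidth (hypothetical input; measure it for your workload).
    bandwidth_drop: fractional loss of peak bandwidth, e.g. 0.25.
    """
    if not 0.0 <= mem_bound_fraction <= 1.0:
        raise ValueError("mem_bound_fraction must be between 0 and 1")
    if not 0.0 <= bandwidth_drop < 1.0:
        raise ValueError("bandwidth_drop must be in [0, 1)")
    # Only the bandwidth-limited share stretches, by 1 / (1 - drop);
    # the rest of the run time is assumed unaffected.
    new_time = (1.0 - mem_bound_fraction) + mem_bound_fraction / (1.0 - bandwidth_drop)
    return new_time - 1.0


if __name__ == "__main__":
    # Illustrative fractions only, not benchmark results.
    for fraction in (0.01, 0.05, 0.20):
        slowdown = overall_slowdown(fraction, 0.25)
        print(f"{fraction:.0%} memory-bound -> {slowdown:.2%} slower overall")
```

Under these assumptions, a workload that's saturating main memory only a few percent of the time sees a small slice of that 25% in its total run time. The model ignores the upside of the extra 4GB entirely, so if anything it overstates the cost of the fourth stick.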
It is true that your maxed-out memory scenario runs best with a multiple of 3. And IF it is also true that you're fine with 12GB instead of 16 with a single CPU, then maybe you should only populate 3. That's what I'm doing.
But the more I read about this stuff (give this a long, slow glance if you want some detail), the more I realize what the truth is:
For 99.98% of users, the 3-sticks vs. 4-sticks controversy simply doesn't matter in real-life practice. It's a non-issue that's easy to mistake for an issue. 'Do what thou wilt' shall be the whole of the law. If you want 3, get 3. If you want 4, get 4. Ignore the many over-hyped benchmarks aimed at arousing the paranoia wired into our lizard brains. The difference in memory access is a few percent either way in real-life situations, and you'll be helped by more memory more often than you'll be hurt by less interleave.
Bottom line: we are indeed getting all over ourselves for no reason at all. The right thing to do: forget the whole issue.
EDIT: Nothing about the above mitigates the indisputable fact that it's kind of a shame to design a motherboard for these CPUs that has anything other than a multiple of 3 memory sockets.