We tested the different combinations with a special 64-bit parallel multi-threaded version of STREAM. We averaged the results of Copy, Scale, Add, and Triad to produce an overall speed rating in gigabytes per second:
4 x 2GB = 8GB = 6.5GB/s
8 x 1GB = 8GB = 7.5GB/s
Conclusion: Any combo of matching pairs that fills all 8 slots = fastest.
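To put the gap between the two quoted figures in perspective, here is a quick sketch that works out the relative difference. It uses only the two overall ratings quoted above; the function name percent_faster and the results dictionary are made up for illustration and are not part of STREAM or of the original test.

```python
def percent_faster(baseline_gbps, candidate_gbps):
    """Return how much faster candidate is than baseline, as a percentage."""
    return (candidate_gbps - baseline_gbps) / baseline_gbps * 100.0

# Overall ratings quoted above (average of Copy, Scale, Add, and Triad), in GB/s.
results = {"4 x 2GB": 6.5, "8 x 1GB": 7.5}

gain = percent_faster(results["4 x 2GB"], results["8 x 1GB"])
print(f"8 x 1GB is about {gain:.1f}% faster than 4 x 2GB")
```

So for the same 8GB total, filling all 8 slots comes out roughly 15% ahead in this test.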
Thank you, Barefeats, you've answered my question. It is an interesting result because, to me, it is counter-intuitive given that there are only 4 serial channels. Then again, since those channels operate at a much higher frequency than the memory itself, I guess what happens on the modules themselves matters more than being first in line.