Then why go through the trouble of S & D variants at all? Higher production of fewer parts would allow them to lower per-unit manufacturing costs. I think it is possible... you could run an SP and a 36D connected to another 36D, thus giving you a full 4 x16 PCIe slots... From what I recall from the literature, the QPI links are all negotiated/set up at startup and devices all identify themselves, etc. Thus an SP on a 36D chipset should run just fine... The unused link on the 36D will just not initiate.
I can't help but think that there's a specific reason the QPI channels required separate P/Ns, and it seems the chipset can't disable its own QPI if an unused channel exists. As I understand it, negotiation control lies in the CPUs, not the chipsets. The SP version (CPU) doesn't have the second QPI to disable any that may exist on the chipset, if it's a D version. Full connections are still dictated for the chipsets (negotiation of channel B can't go through channel A).
If I had to guess, it was to make the design simpler and use less wafer area per part, giving higher part-per-wafer counts.
The '09 MP only uses a single 36S or 36D, BTW.