Many motherboard manufacturers have better wiring for PCIe lanes, Apple is just being sloppy. I am not talking about Dell or HP or the likes. They are even worse than Apple. Take Asus for example. They do a nice job on motherboards, so does Gigabyte. No need for switches.
Chuckle.....
ASUS Z8PE-D12X board
http://www.asus.com/product.aspx?P_ID=gGozRAk0YWQCQtSA
"...
Total PCI/PCI-X/PCI-E Slots: 6
Slot Location 1: 1 * PCI-X 100/133 MHz
Slot Location 2: 1 * PCI-X 100/133 MHz
Slot Location 3: 1 * PCI-E x16 (Gen2 x8 Link)
Slot Location 4: 1 * PCI-E x16 (Gen2 x8 Link)
Slot Location 5: 1 * PCI-E x16 (Gen2 x8 Link)
Slot Location 6: 1 * PCI-E x16 (Gen2 x16 Link) (Auto switch to x8 Link if slot 5 is occupied)
Slot Location 7: 1 * MIO Slot for Audio card (PCI-E x1 is not supported)
..."
Ooooooo looky-looky ..... a switch.
Toss in the standard Apple design constraint of not adding "old legacy" tech to the newest models (so lose the PCI-X slots ..... they are only going to hold old legacy cards with old legacy drivers anyway) and you have exactly 4 PCI-e slots. As I said, all the other vendors are operating under the same 36-lane constraint. It takes only some straightforward arithmetic to see that they too are using switches if their PCI-e slot count is substantially higher and/or their superficial bandwidth numbers are higher. Apple is just not advertising more bandwidth than is actually in the box.
If you want a workstation with a 16x graphics card and an 8x 10Gb SAN card that you want to run full blast, then the Mac Pro is a better design tradeoff than that Asus board. If you want to add three 8x cards to your box to hook up to gobs of very fast direct attached storage (DAS), then the Asus board is better. One isn't necessarily better than the other. It depends on what you are doing. In the Mac Pro market I suspect there are going to be far more folks using two 16x cards, or one 16x and one 8x card, than folks who need three 8x cards.
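To make that arithmetic concrete, here is a rough sketch using only the link widths from the Asus spec list quoted above and the 36-lane figure from this thread. The function name and the `auto_switch` flag are made up for illustration; this is not anything from Asus or Intel documentation.

```python
# Lane budget for the slot-facing PCIe lanes, as stated in this thread.
LANE_BUDGET = 36


def lanes_needed(slot5_occupied, auto_switch=True):
    """Sum the link widths of slots 3-6 on the quoted Z8PE-D12X spec list.

    auto_switch=False models a hypothetical board with no lane switching,
    where slot 6 keeps its full x16 link regardless of slot 5.
    """
    slot3 = 8  # x16 connector, Gen2 x8 link
    slot4 = 8  # x16 connector, Gen2 x8 link
    slot5 = 8 if slot5_occupied else 0  # an empty slot uses no lanes
    # Slot 6 falls back from x16 to x8 when slot 5 is occupied.
    slot6 = 8 if (slot5_occupied and auto_switch) else 16
    return slot3 + slot4 + slot5 + slot6


if __name__ == "__main__":
    for occupied in (False, True):
        for switching in (True, False):
            total = lanes_needed(occupied, auto_switch=switching)
            verdict = "fits" if total <= LANE_BUDGET else "over budget"
            print(f"slot5_occupied={occupied} auto_switch={switching}: "
                  f"{total} lanes ({verdict})")
```

With the auto switch, the board stays at 32 lanes either way. Without it, all four slots populated would need 40 lanes, which is more than the 36 available.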
It is a reasonable design tradeoff.
The ICH10R has more than one x1 PCIe lane available. In fact it has six x1 lanes, which you can configure as an x4 link plus two x1 links. Also, allowing 1GB/s of data thru the x4 is better than having to share it thru some latency-ridden switch.
I just didn't explicitly quote a long list of consumers because I apparently suffered from delusions that folks would go take a hard look at the specs.
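For reference, the 1GB/s figure in the quote above is what you get if you assume first-generation PCIe signalling (2.5 GT/s per lane with 8b/10b encoding). A minimal sketch of that calculation follows; the function names are illustrative, and the result is raw per-direction link bandwidth before any packet or protocol overhead.

```python
def lane_mb_per_s(gt_per_s=2.5, payload_bits=8, coded_bits=10):
    """Raw per-direction bandwidth of one PCIe lane in MB/s.

    Defaults assume Gen1 signalling: 2.5 GT/s with 8b/10b encoding,
    so 8 of every 10 bits on the wire carry data.
    """
    data_bits_per_s = gt_per_s * 1e9 * payload_bits / coded_bits
    return data_bits_per_s / 8 / 1e6  # bits -> bytes -> megabytes


def link_mb_per_s(width, **lane_kwargs):
    """Raw per-direction bandwidth of a link with `width` lanes."""
    return width * lane_mb_per_s(**lane_kwargs)


if __name__ == "__main__":
    for width in (1, 4):
        print(f"x{width}: {link_mb_per_s(width):.0f} MB/s per direction")
```

Under those assumptions a single lane carries about 250 MB/s each way, so an x4 link lands at roughly 1 GB/s.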
Typically the PCI-X and PCI slots are consumers of that 6. Firewire is. USB 3.0 is. An extra onboard SATA/RAID controller often is (or increases the switch usage in the "upper" x36 lanes *** ) , etc. If you look for any "value added" feature on the motherboard that is not part of the core chipset, then a large percentage of the time it is a consumer of some number of that 6.
Sure, Apple probably has 2-3 lanes not hooked up. However, there is also no clear quantitative evidence that there isn't a bottleneck in the overall PCI-e switch. The chipsets are not generally designed to run everything on all possible channels at full blast.
The Mac Pro design doesn't lend itself to carrying forward "old" cards. It lacks backwards flexibility. However, unless you are using a single Mac Pro to drive a multimillion-dollar visual simulator or some large DBMS workload (with high DAS bandwidth requirements), it fits a broad spectrum of workstation users.
*** "upper" relative to this diagram http://en.wikipedia.org/wiki/File:X58_Block_Diagram.png