Just a little math check here:
A single Thunderbolt channel provides 10 Gbps of bandwidth to the protocol layer (we'll ignore the full-duplex aspect, since only marketeers would use that to make the numbers look bigger).
A single lane of PCIe 2.0 has a nominal data rate of 5 Gbps, but 8b/10b encoding reduces the actual bandwidth available to the protocol layer to 4 Gbps.
So 1 Thunderbolt channel = 2.5 PCIe 2.0 lanes.
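That ratio is easy to verify in a couple of lines (numbers straight from the figures above):

```python
# Sanity check: Thunderbolt channel vs. PCIe 2.0 lane bandwidth.
tb_channel_gbps = 10.0                 # one TB channel, protocol-layer bandwidth
pcie2_lane_raw_gbps = 5.0              # PCIe 2.0 nominal line rate per lane
pcie2_lane_gbps = pcie2_lane_raw_gbps * 8 / 10  # 8b/10b leaves 80% for the protocol layer

print(pcie2_lane_gbps)                        # 4.0 Gbps usable per lane
print(tb_channel_gbps / pcie2_lane_gbps)      # 2.5 lanes per TB channel
```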
The currently available (or not so available) Thunderbolt controllers only have connections for 4 PCIe 2.0 lanes, so the total PCIe bandwidth available to all channels connected to a single Thunderbolt host controller is limited to 16 Gbps.
If the Thunderbolt controller pulls its PCIe lanes off of the PCH instead of directly off the CPU (Apple has shipped both configurations), the DMI 2.0 link between the CPU and PCH could potentially introduce additional latency and/or bottlenecking, because it too is limited to 16 Gbps (20 Gbps less 8b/10b overhead).
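Worth noting that the x4 back end and the DMI 2.0 link work out to exactly the same number, which is why the PCH routing doesn't cost you raw bandwidth, just potentially latency and contention:

```python
# The TB controller's PCIe back end vs. the DMI 2.0 link, both after 8b/10b.
pcie2_lane_gbps = 5.0 * 8 / 10              # 4 Gbps usable per PCIe 2.0 lane
tb_pcie_backend_gbps = 4 * pcie2_lane_gbps  # x4 back end on the TB controller
dmi2_gbps = 20.0 * 8 / 10                   # DMI 2.0: 20 Gbps raw, less encoding overhead

print(tb_pcie_backend_gbps)   # 16.0
print(dmi2_gbps)              # 16.0
```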
This also points out why Thunderbolt is not inherently slower over copper than fiber. The controller is limited by its back end, not the cable. The next likely step forward will be to bump the PCIe connections up to PCIe 2.0 x8 or PCIe 3.0 x4, and the DisplayPort connections up to 2 x DP 1.2. This would probably go hand in hand with a doubling of channel bandwidth to 20 Gbps, and that might call for fiber.
Most mainstream CPUs these days provide 16 PCIe 2.0 lanes directly off of the CPU for use by discrete GPUs. This is 64 Gbps total, which is 6.4 times as much bandwidth as a single Thunderbolt channel.
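Same arithmetic for the discrete-GPU comparison:

```python
# A full x16 PCIe 2.0 GPU link vs. a single Thunderbolt channel.
pcie2_lane_gbps = 5.0 * 8 / 10        # 4 Gbps usable per lane
gpu_link_gbps = 16 * pcie2_lane_gbps  # x16 slot off the CPU
tb_channel_gbps = 10.0

print(gpu_link_gbps)                    # 64.0
print(gpu_link_gbps / tb_channel_gbps)  # 6.4
```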
How much bandwidth does a GPU actually need? Most of the tests that I've come across, which are based on either synthetic benchmarks or frame rates in various games, show little to no impact going from PCIe 2.0 x16 to x8. Signs of throttling start to become more evident when you drop down to x4 and are obvious at x1. I also noted that the throttling seems to be most prevalent in games where the GPU can achieve insane frame rates (> 100 fps). When calculating a GPU's potential bandwidth usage, don't forget that not only can they sometimes render 120+ fps, but that they are often using multiple additional channels for compositing and FX, and that several current GPUs can theoretically drive up to six 2560x1600 displays at bit depths up to 32 bpp. Aside from the extreme cases though, a Thunderbolt-connected GPU driving a single 1920x1080 display could probably provide a way better gaming experience than the on-die Intel HD 3000 graphics of the MacBook Air.
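Just to put a rough number on that six-display extreme case (the refresh rate is my assumption, not from the figures above, and this ignores blanking intervals):

```python
# Rough scan-out bandwidth for six 2560x1600 displays at 32 bpp.
# Assumption: 60 Hz refresh; blanking overhead ignored.
width, height = 2560, 1600
bpp = 32
refresh_hz = 60

per_display_gbps = width * height * bpp * refresh_hz / 1e9
total_gbps = 6 * per_display_gbps

print(round(per_display_gbps, 2))  # ~7.86 Gbps per display
print(round(total_gbps, 1))        # ~47.2 Gbps for all six
```

So even the raw pixel data for that scenario dwarfs a single 10 Gbps Thunderbolt channel, which is why it only makes sense for a GPU hanging off its own x16 link.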
Thunderbolt devices are only supposed to use a single channel, in order to assure bandwidth to devices further down the chain, but I don't see anything theoretically stopping one from making a device that uses both channels (aside from Intel's displeasure and the scarcity of Thunderbolt controllers at this time). You would most likely need to use two TB host controllers in the device to achieve this. I have no idea what would happen if you just connected a Mac with two TB ports to a device with two TB ports using two cables instead of one.