I also missed earlier that it had model labeling consistent with the new ones.
I missed that as well.
Similarly, if the current controllers were upgraded so that the controller's switch edges ran PCIe 3.0 but the protocol adapter was still capped at 10 Gbps, then an x4 controller would be grossly oversubscribed on bandwidth. PCIe 2.0 x4 already exceeds 10 Gbps; PCIe 3.0 x4 would be worse.
Or over-provisioned, in the case of a host controller in a PC, but yes, PCIe 3.0 x4 is definitely overkill if there is still only one 10 Gbps PCIe-to-Thunderbolt protocol adapter.
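For reference, here is the rough arithmetic behind those claims (my own numbers, assuming 8b/10b encoding for PCIe 2.0 and 128b/130b for PCIe 3.0):

```python
# Rough effective bandwidth per direction (my assumptions, not Intel's spec sheet)
pcie20_lane = 5.0 * 8 / 10      # PCIe 2.0: 5 GT/s with 8b/10b -> 4.0 Gbps per lane
pcie30_lane = 8.0 * 128 / 130   # PCIe 3.0: 8 GT/s with 128b/130b -> ~7.88 Gbps per lane
adapter = 10.0                  # one PCIe-to-Thunderbolt protocol adapter, 10 Gbps

print(f"PCIe 2.0 x4: {4 * pcie20_lane:.1f} Gbps vs {adapter:.0f} Gbps adapter")  # 16.0 Gbps
print(f"PCIe 3.0 x2: {2 * pcie30_lane:.1f} Gbps vs {adapter:.0f} Gbps adapter")  # ~15.8 Gbps
print(f"PCIe 3.0 x4: {4 * pcie30_lane:.1f} Gbps vs {adapter:.0f} Gbps adapter")  # ~31.5 Gbps
```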
What I was thinking was that if the Thunderbolt controller's internal PCIe switch were upgraded to handle PCIe 3.0, it would allow for greater flexibility. On the host side, you could feed a protocol adapter using either PCIe 2.0 x4 or PCIe 3.0 x2. On the device side, unless you use an additional PCIe switch, you can only add as many PCIe-based controllers to your design as there are available ports on the Thunderbolt controller's built-in PCIe switch, which currently allows for four x1, two x1 + one x2, two x2, or one x4 configurations. Not that any exist right now, but say a SATA 6Gb/s host controller with a PCIe 3.0 x1 back end could be connected, along with several other lower-bandwidth PCIe-based controllers, without necessarily oversubscribing the 10 Gbps protocol adapter.
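To put hypothetical numbers on that device-side example (the controller mix here is made up, just to show the adapter needn't be oversubscribed):

```python
# Hypothetical device-side mix behind one 10 Gbps protocol adapter (illustrative only)
sata_payload = 6.0 * 8 / 10        # SATA 6Gb/s after 8b/10b -> 4.8 Gbps of real payload
pcie30_x1    = 8.0 * 128 / 130     # the SATA controller's x1 back end, ~7.88 Gbps
others       = [0.48, 0.48, 1.0]   # e.g. two USB 2.0-class and one GigE-class controller

print(f"SATA back end: {pcie30_x1:.2f} Gbps available for {sata_payload:.1f} Gbps of payload")
print(f"aggregate demand ≈ {sata_payload + sum(others):.2f} Gbps vs 10 Gbps adapter")
```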
And wasn't this the same demonstration where the audio output on the daisy-chained display started to hiccup? It isn't that the "full 10 Gbps" is given over to transporting PCIe. It is really about practical usage in a context where more than just one device is connected to the port and several protocols with varying latencies are all transcoded onto PCIe and then on top of TB. If you put two or more devices on a port and two or more of them need isochronous transfers while some link is trying to push 10 Gbps down the wire to a single destination, it isn't going to work well.
That was the second test (the ATD one) where the audio issues showed up. And to be fair, the ATD didn't come out until well after the Pegasus, so there was no way for Promise to really test for that. Also, the Pegasus did scale back its throughput as other devices on the ATD requested bandwidth, and the only bit that had issues was a USB audio device. So essentially it boils down to a situation where the USB 2.0 controller in the ATD didn't handle an isochronous data stream very well; not much of a revelation, really.
Also, as you rightly point out, this was sort of an attempt to "fill the channel up to the brim and claim victory," more than a normal workflow situation. Promise didn't ship the Pegasus full of SF-2281-based SSDs; Anand stuck those in there specifically to push the limits.
All the more so given the reduction to 2 from 4. If it is already oversubscribed, then 4 is just even more oversubscribed.
That would be nice if that second channel were unused. The pragmatic problem is that it is used for DisplayPort traffic. Another reason the TB protocol overhead is so low is that they just physically separate the TB-encoded traffic for PCIe from the TB-encoded traffic for DisplayPort.
Intel has specifically stated that both PCIe and DP data can be transported on each channel in each direction; they are not physically separated. The proof of this is that you can daisy-chain two ATDs off of a single Thunderbolt port and have them both function. This requires 11.6 Gbps of DP data to be sent down a single cable in the same direction, and thus it must be present on both channels, along with whatever PCIe data is required for the other devices in the displays to work.
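Back-of-envelope on where that 11.6 Gbps figure comes from, assuming each ATD is driven at 2560x1440 @ 60 Hz with reduced blanking and 24 bpp (my assumed timings):

```python
# Approximate DP payload for two daisy-chained ATDs
pixel_clock = 241.5e6            # 2560x1440 @ 60 Hz, CVT reduced blanking, in Hz
bpp = 24                         # bits per pixel
per_display = pixel_clock * bpp / 1e9
print(f"one ATD ≈ {per_display:.2f} Gbps, two ≈ {2 * per_display:.2f} Gbps")  # ≈ 5.80 / 11.59
```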
I think it is likely that Intel is hiding the Thunderbolt overhead from the end user, and that the raw signaling rate is perhaps in excess of 10 Gbps per pair.
All that aside, when you have a 4-channel controller that supports 2 Thunderbolt ports in a PC, it's silly to limit it to 45.923 Gbps maximum aggregate throughput when the PHY can handle 80 Gbps. A second PCIe to Thunderbolt protocol adapter (or a 2-channel adapter) would only bump the total up to 65.923 Gbps, still leaving a fair amount of headroom. To envision a scenario where this might be beneficial, consider a 27-inch iMac with a Pegasus R6 full of SSDs connected to each Thunderbolt port and a pair of 2560x1600 DP displays daisy-chained off of those.
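If I'm reading those figures right, the 65.923 number is just the 45.923 baseline plus one more full-duplex 10 Gbps adapter:

```python
# Sanity check on the headroom math (assumes the extra adapter is counted in both directions)
baseline = 45.923                # stated maximum aggregate with one PCIe adapter, Gbps
second_adapter = 10.0 * 2        # a second 10 Gbps adapter, both directions
phy_limit = 80.0                 # 4 channels x 10 Gbps x 2 directions

print(f"{baseline + second_adapter} Gbps used of {phy_limit} Gbps PHY capacity")  # 65.923 of 80.0
```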
The PCIe x2 connection on the 2C Cactus Ridge variant not only limits the number of controllers you can attach to it without having to resort to an additional PCIe switch, but it also limits the total controller throughput to 24.641 Gbps, even though the PHY can handle 40 Gbps. Usually the problem with using a wider PCIe connection on a chip is finding room for the additional pins on the package. The 2C chip in this case uses the same package layout as the 4C version, which has an x4 connection, so what gives? How are there any cost savings in dropping down to an x2 connection, aside from being able to use a slightly less complex PCIe switch internally? But then this would blow the theory that the 2C versions are just using harvested dies...
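For comparison, the x2 vs x4 back end in plain numbers against the 2-channel PHY (same PCIe 2.0 assumptions as above):

```python
# Aggregate (both directions) PCIe back-end bandwidth vs the 2C part's PHY
lane = 5.0 * 8 / 10              # PCIe 2.0 effective per lane per direction, Gbps
x2_total = 2 * lane * 2          # 16 Gbps total for the 2C chip's x2 link
x4_total = 4 * lane * 2          # 32 Gbps total for the 4C chip's x4 link
phy_2c   = 2 * 10.0 * 2          # 2 channels x 10 Gbps x 2 directions = 40 Gbps

print(f"x2: {x2_total:.0f} Gbps, x4: {x4_total:.0f} Gbps, 2C PHY: {phy_2c:.0f} Gbps")
```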