The POWER5 goes into systems designed to have way more memory bandwidth than a desktop system.
The POWER5 sells for a lot more, and is allowed to draw a lot more power for its intended use.
As I said previously, the benefit of SMT is not great on PowerPC, and this next point really drives it home:
Thanks for the links. I love Stokes' writing on CPUs. Actually, after reading this article I'm now more convinced that both dual core and SMT will be in the next major revision, and here's why.
The rest of the big changes to the core are related to the addition of SMT, which IBM claims increased the size of each core by 24%. (This increase in die size is another reason why an SMT-capable POWER5 derivative for the Mac is a ways off.)
That automatically rules out SMT on the 154 mm² 970MP. Adding SMT puts the 970MP at nigh 185 mm². No go.
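For anyone who wants to check the arithmetic, here is a rough sketch of how that estimate depends on how much of the die is actually core logic. The 154 mm² and 24% figures come from the post and the article; the `core_fraction` values and the function name are my own assumptions, since the article only says each core grows by 24%, not the cache or the rest of the die.

```python
def die_area_with_smt(base_die_mm2, core_fraction, smt_core_growth=0.24):
    """Estimate die area after adding SMT, assuming only the cores grow.

    core_fraction is a hypothetical share of the die taken up by the cores;
    the real split for the 970MP is not given in the post.
    """
    core_area = base_die_mm2 * core_fraction
    uncore_area = base_die_mm2 - core_area  # cache, bus logic, etc. stay the same
    return uncore_area + core_area * (1.0 + smt_core_growth)


BASE_970MP_MM2 = 154.0  # die size quoted in the post

# Try a few assumed core shares to see the range of outcomes.
for fraction in (0.5, 0.84, 1.0):
    grown = die_area_with_smt(BASE_970MP_MM2, fraction)
    print(f"{fraction:.0%} cores -> {grown:.1f} mm^2")
```

Under this simple model, applying 24% to the whole die gives about 191 mm², and the 185 mm² figure corresponds to assuming that most of the die is core logic. Either way it lands well above 154 mm².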
Since increasing execution unit utilization is one of the main goals of SMT, the increase in issue bandwidth utilization as described above is going to be key, especially for the POWER5. I say this because in my first articles on the G5 I suggested that the POWER4/970's group-based dispatch scheme and issue queue configuration probably constrains issue flexibility and therefore execution unit utilization in some peculiar ways under certain worst-case scenarios (i.e., one execution unit of a pair is overloaded while the other is starved, due to a combination of poor instruction ordering on the part of the programmer/compiler and the group dispatch limitations)
Another gotcha in the 970. I expect IBM to rectify this. It's not that it's really bad, but you don't want one execution unit starved while the other is gorged. I figure fixing this is a priority for IBM.
At this point, I could talk about the need for SMT in an Apple system, but I'll just leave off that sort of commentary for now and observe only that Apple's long-standing and ongoing affinity for SMP designs has resulted in two things: 1) a huge potential for wasted execution resources on the current crop of non-SMT-capable G5s and 2) a body of natively-developed and -ported applications that have been subjected to years of pressure to use multithreading wherever possible in order to wring the best performance out of Apple hardware. I think both of these factors will converge to make a SMT a significant improvement for the Mac platform.
Makes sense. Apple has been actively promoting the use of threading (there was more than one session on threading at WWDC 2004). While the 970x processors do not feed the execution units efficiently, I'm sure this won't be a problem with the mythical G6.
I see the plans going like this.
970MP systems ship Q2 2005. There are two refreshes.
IBM works on a POWER5 derivative that replaces the current execution units in the 970x CPU, is also dual core with SMT and, I'm guessing, adds an on-die memory controller. This is all at 65nm, so we're at about 100-120 mm² for the whole shebang.
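A back-of-the-envelope check on that 100-120 mm² guess, as a sketch only. It starts from the roughly 185 mm² SMT-enabled dual core estimated earlier in the post and assumes the starting point is a 90nm process (my assumption, not stated above) and ideal area scaling with the square of the feature size, which real shrinks rarely achieve.

```python
def ideal_shrink_area(area_mm2, old_node_nm, new_node_nm):
    """Scale die area by the square of the feature-size ratio (ideal case)."""
    return area_mm2 * (new_node_nm / old_node_nm) ** 2


SMT_DUAL_CORE_MM2 = 185.0  # estimate from earlier in the post
OLD_NODE_NM = 90           # assumed starting process, not given in the post
NEW_NODE_NM = 65

shrunk = ideal_shrink_area(SMT_DUAL_CORE_MM2, OLD_NODE_NM, NEW_NODE_NM)
print(f"Ideal 65nm shrink: {shrunk:.1f} mm^2")

# Whatever is left up to the guessed total is the budget for the
# memory controller and any scaling that falls short of ideal.
for total in (100.0, 120.0):
    print(f"Headroom to {total:.0f} mm^2: {total - shrunk:.1f} mm^2")
```

With those assumptions the shrunk core logic comes out a little under 100 mm², so the guessed range leaves some room for the memory controller and for less-than-ideal scaling.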
Apple wants this because:
1. SMT will help in Xserve systems.
2. Dual cores are standard now.
3. On-die memory controllers allow them to reduce the complexity of the system controllers. 980MP systems will have HyperTransport 2.0 links between them.
4. Thread prioritization is right up Apple's alley, since they're heavy in multimedia.
I won't be surprised to see Apple really hammer threading even more at WWDC 2005. It's amazing, but large apps like Maya still don't support SMP.
I'm pretty jazzed about the POWER5 queuing 10 instructions and dispatching 2 per cycle. IBM need only add this same function to the 980 and get the AltiVec unit back to G4+ levels and we'll be sitting pretty by 2006.