Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Phinius
...a top-end frequency of 2 GHz on a .13-micron process would be a pathetically poor result from adding all those extra pipeline stages.

If you're comparing to the Pentium, you're ignoring the difference in instruction set. Clock rate is determined not just by the number of pipe stages but also by how much work is done at each stage.

2GHz is already faster than the Power4 can clock, and it's the first "pressing". We'll probably see some speed bumps as IBM tunes their process and boosts their yields of the faster parts (didn't I hear rumors of a 2.8GHz part in testing?).

I'd guess we move to the 980 before the last pico-seconds are squeezed out of the 970 though. Then the 9x0 can keep abreast of the Power(x-3) developments.

Again referencing Ars Technica (without rereading the article so I may be wrong), their take was that the extra pipe-stages were inserted to allow the AltiVec block to be bolted on to a remote area of the die. If I remember, they placed the extra stages in the decode, dispatch and completion phases of the pipe, not in the execution phase...
 
Re: Re: i want... no, NEED more than 25GHz!!!

Originally posted by soggywulf
Now THAT is a kick ass idea! Here's something related that I've always thought would be incredibly cool. You know how there are some small parts of songs that sound really good to you, like they're hardwired to your soul? Such samples might come from a variety of music, and be unrelated to genre or band/composer. What if you could play those snippets, and your 50 THz machine would be able to search the "iTMS" to find all songs which have large portions of emotionally similar music?

Actually the idea has already been done and researched by MIT I believe. A PC version was supposed to come out sometime this year, but who knows.
 
25ghz....seriously a practical application for even greater power will be the ships that will take us to Mars and beyond. Would you trust your life to a Wintel based ship? hehe
 
Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Phinius
Yes, but if you increase the pipeline stages on a given design by 1/3 then it should top out at a much higher frequency. The POWER4 has 12 integer stages and the 970 has 16 stages. Yet the Power4 is at 1.7 GHz and the 970 is only at 2 GHz. IBM obviously increased the pipeline stages for the 970 to raise the top-end frequency. How high a frequency it can achieve is unknown at this time, but 2 GHz is very low for 16 pipeline stages. The AMD Athlon XP is at about 2 GHz and yet it only has 10-12 pipeline stages, and as I said the 7457 G4 is expected to hit 1.8 GHz and it has a measly 7 pipeline stages.

The 970 is also designed to achieve high clock rates. That's why IBM put 16 integer stages and 21 stages for floating point. You don't honestly believe that the IBM design is that horribly inefficient to not get significantly beyond 2 GHz on a .13-micron process do you?

The POWER architecture is designed to execute many instructions in every clock cycle, but with the added pipeline stages for the 970 it is also designed for much higher frequencies. That's obviously why IBM added the additional pipeline stages, and a top-end frequency of 2 GHz on a .13-micron process would be a pathetically poor result from adding all those extra pipeline stages.

You're still using the most obvious (and most frequently used) answers (manufacturing capabilities, deeper pipeline) to address a question that delves far deeper than a simple explanation (deeper pipeline) can provide.

Let me ask you this question: a Pentium 2 has a 12 stage pipeline and an Alpha 21264 has a 7 stage pipeline. Both CPUs used a .35 micron process, and yet the Alpha was able to scale up to 600 MHz while the P2 was only able to scale to 300 MHz. How is this possible considering the Pentium 2's pipeline is nearly twice as long as the Alpha's?

Apparently, the answer lies not simply in the number of stages but also in the design principles. Look back at my original list of all the aspects of a CPU that can affect clock rate. It turns out that our lovely Alpha engineers covered nearly every one of those aspects: the CPU itself is highly parallel and its stages are kept extremely simple, which, along with the Alpha chip's strong ALU and decoders, keeps the distances between "pulses" extremely short. Cache and buffers will no doubt pose a problem, as is true with all CPUs of this class, but apparently that doesn't keep the Alpha from scaling. Latencies are kept to a minimum, ditto for power dissipation. No wonder the Alpha was able to scale twice as high as the Pentium 2 despite having a much shorter pipeline: nearly every aspect that affects scaling is far superior on the Alpha chip than on the PII.

So how high would our Alpha CPU scale using the current .13 micron process? The general rule of process shrinks is a 50% increase in clock rate per generation, so moving from a .35 micron process to a .13 micron process (3 generations), our Alpha processor with its "short" 7 stage pipeline would be at an amazing 2.025 GHz (600 MHz x 1.5 x 1.5 x 1.5)! That's nearly as high as an Athlon which, with its much longer 12 stage pipeline, has basically reached its limit at 2.2 GHz, and that's not assuming any improvements in the Alpha architecture over all this time.
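
For anyone who wants to check the arithmetic, here is a minimal Python sketch of that rule of thumb. The 50% gain per shrink and the 600 MHz starting point come from the post; the function name `scaled_clock_mhz` is made up for illustration, and the rule itself is only a rough heuristic, not a law of process scaling.

```python
def scaled_clock_mhz(base_mhz: float, generations: int,
                     gain_per_shrink: float = 0.5) -> float:
    """Compound a fixed fractional clock gain over a number of process shrinks."""
    return base_mhz * (1.0 + gain_per_shrink) ** generations


# Alpha 21264 figure from the post: 600 MHz at .35 micron.
# Three shrinks (.35 -> .25 -> .18 -> .13) at an assumed +50% each.
alpha_mhz = scaled_clock_mhz(600, 3)
print(f"Projected Alpha clock at .13 micron: {alpha_mhz / 1000:.3f} GHz")
```

Changing `gain_per_shrink` shows how sensitive the projection is to the assumed rule: at 40% per shrink the same chip lands at about 1.65 GHz instead.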

Apparently, as I have just proven with our Alpha example, a shallower pipeline doesn't necessarily mean that a CPU can't scale to higher clock rates, and a deeper pipeline doesn't necessarily mean that a CPU can scale. It's all about design principles and expertise. Considering the Power4+ basically reached its limit at 1.7 GHz, I wouldn't be surprised if a PPC970, with its slightly longer pipeline, would only be capable of scaling to around 2 GHz under the current micron process.
 
I'm with the gaming comment.

I believe OLEDs will finally yield VR goggles, and I'm looking forward to crapping my pants at hyper-realistic, finely tuned real-time physics involved, superior enemy AI utilizing Doom V in VR :eek:

Oh....yeah.....

Aside from that, the advances in technology and medicine that will come with farms utilizing that kind of power will be amazing.

The weather bit is BS. I don't care how much computational power you put forth, you cannot take all factors into account as is necessary to predict weather that accurately unless you're omniscient. Unless you don't believe that humans affect the weather, in which case you no longer have to be in the business of predicting human behaviour.
 
Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Analog Kid
If you're comparing to the Pentium, you're ignoring the difference in instruction set. Clock rate is determined not just by the number of pipe stages but also by how much work is done at each stage.

2GHz is already faster than the Power4 can clock, and it's the first "pressing". We'll probably see some speed bumps as IBM tunes their process and boosts their yields of the faster parts (didn't I hear rumors of a 2.8GHz part in testing?).


IBM's German website already announced an upcoming blade server that will use 970 processors running from 1.8-2.5 GHz. So, the 970 will hit at least 2.5 GHz. It's also likely that a 2.1-2.5 GHz G5 would use 533 MHz DDR-II memory since Samsung already has it in production and Micron will manufacture it starting in late 2003. At 2.6 GHz the G5 will probably use 666 MHz DDR-II memory since 4 x 666 = 2,664, which translates into a 2.6 GHz processor using a 1.3 GHz bus speed.
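
The clock relationships in that guess can be laid out in a few lines of Python. This is only a sketch of the speculation above: it assumes the bus runs at twice the effective memory rate and the core at twice the bus rate (the 4x multiple used in the post), and `g5_clocks_from_memory` is a made-up helper name, not anything from Apple or IBM.

```python
def g5_clocks_from_memory(memory_mhz: int) -> tuple:
    """Return (bus_mhz, core_mhz), assuming bus = 2 x memory and core = 2 x bus."""
    bus_mhz = 2 * memory_mhz
    core_mhz = 2 * bus_mhz
    return bus_mhz, core_mhz


# DDR-II speeds mentioned in the post.
for memory_mhz in (533, 666):
    bus_mhz, core_mhz = g5_clocks_from_memory(memory_mhz)
    print(f"{memory_mhz} MHz memory -> {bus_mhz} MHz bus -> {core_mhz} MHz core")
```

With these assumptions 533 MHz memory pairs with a roughly 2.1 GHz core and 666 MHz memory with a roughly 2.6 GHz core, which is where the two ranges in the post come from.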

Again referencing Ars Technica (without rereading the article so I may be wrong), their take was that the extra pipe-stages were inserted to allow the AltiVec block to be bolted on to a remote area of the die. If I remember, they placed the extra stages in the decode, dispatch and completion phases of the pipe, not in the execution phase...

If that is true then the POWER4+ processors should achieve greater than 1.7 GHz, because IBM already stated that the 970 will hit at least 2.5 GHz and the 970 is derived from the POWER4.
 
Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Regarding "pulses":
As most of you have probably guessed by now, the distance between two pulses represents one clock cycle; therefore, within a given "length", more pulses mean more clock cycles.
 
Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Cubeboy
Let me ask you this question: a Pentium 2 has a 12 stage pipeline and an Alpha 21264 has a 7 stage pipeline. Both CPUs used a .35 micron process, and yet the Alpha was able to scale up to 600 MHz while the P2 was only able to scale to 300 MHz. How is this possible considering the Pentium 2's pipeline is nearly twice as long as the Alpha's?

The answer is that the Alpha 21264 and the Pentium 2 are completely different architectures. However, the POWER4 and 970 are not completely different chip architectures. The 970 does not have a large external L3 cache, unlike the POWER4, and the 970 has an L2 cache about a third the size of the POWER4's. The 970 also does not have a chip interconnect fabric or an additional CPU, and the POWER4 does. The 970 does have AltiVec, but the core designs of the 970 and POWER4 are basically the same.

IBM had made available a press release on their German website that stated an upcoming IBM blade server will use 1.8-2.5 GHz 970 processors. That officially confirms that the 970 will hit at least 2.5 GHz. An IBM spokesperson also stated that the 2.5 GHz speed will be on the .13-micron process.

Considering the Power4+ basically reached its limit at 1.7 GHz, I wouldn't be surprised if a PPC970, with its slightly longer pipeline, would only be capable of 2-2.2 GHz.

You're assuming that the POWER4+ chip has topped out at 1.7 GHz, whereas IBM has made no such statement, and you're also stating that adding 1/3 more pipeline stages is 'slightly longer.' 1/3 is only slightly greater?

How can you make the claim that the 970 pipeline at 1/3 greater length than the POWER4 is only 'slightly' longer than the POWER4 pipeline? 1/3 greater seems to be a rather large step up in size to me. Perhaps if Intel increases the 20 stage pipeline of the Pentium 4 by 1/3 and releases a new Pentium processor with 26 pipeline stages, then you will also state that it has only a 'slightly' longer pipeline than the previous Pentium processor.
 
Re: Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Phinius
If that is true then the POWER4+ processors should achieve greater than 1.7 GHz, because IBM already stated that the 970 will hit at least 2.5 GHz and the 970 is derived from the POWER4.

Here's the Ars Technica link:
http://arstechnica.com/cpu/03q1/ppc970/ppc970-12.html

Note that they can't do much more than speculate either but it seems like they've put a lot of thought and research into it so I tend to trust their intuition.

Remember that the Power4 and 970 are being built on different processes. The Power4 is built on a more robust process for higher reliability, so even if it were identical to the 970 it would clock slower...
 
Re: Re: Re: i want... no, NEED more than 25GHz!!!

Originally posted by ZildjianKX
Actually the idea has already been done and researched by MIT I believe. A PC version was supposed to come out sometime this year, but who knows.

I'll believe that when I see it. ;) The marketing descriptions of some of these lab projects can grossly distort what they actually do.
 
Re: Re: 25 GHz?

Originally posted by sedarby
The government will be changing their attitude when performance eclipses anything an Intel or compatible based PC can accomplish.

Hmmm... Weren't we saying that about the 601? :)
 
Re: i want... no, NEED more than 25GHz!!!

Originally posted by asim
and some quality virtual-reality porn will require a quad GX (the "G ten") 42GHz with a 3d hologram projector... hopefully it will be ready by the time i find someone, get married, have kids, and get left by her. one can only hope...

LOL. Better skip the wife and kids if you want to afford the VR. :D
 
Originally posted by BaghdadBob
I believe OLEDs will finally yield VR goggles, and I'm looking forward to crapping my pants at hyper-realistic, finely tuned real-time physics involved, superior enemy AI utilizing Doom V in VR :eek:

Imagine hemispherical domes over each eye, to give you binocular depth perception along with peripheral vision. 3000x3000 res on each screen. Along with a lightweight gyro system to let the computer know which way your head is pointing. Now THAT would be a flight sim.
 
Re: Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Phinius
The answer is that the Alpha 21264 and the Pentium 2 are completely different architectures. However, the POWER4 and 970 are not completely different chip architectures. The 970 does not have a large external L3 cache, unlike the POWER4, and the 970 has an L2 cache about a third the size of the POWER4's. The 970 also does not have a chip interconnect fabric or an additional CPU, and the POWER4 does. The 970 does have AltiVec, but the core designs of the 970 and POWER4 are basically the same.

IBM had made available a press release on their German website that stated an upcoming IBM blade server will use 1.8-2.5 GHz 970 processors. That officially confirms that the 970 will hit at least 2.5 GHz. An IBM spokesperson also stated that the 2.5 GHz speed will be on the .13-micron process.

You're assuming that the POWER4+ chip has topped out at 1.7 GHz, whereas IBM has made no such statement, and you're also stating that adding 1/3 more pipeline stages is 'slightly longer.' 1/3 is only slightly greater?

How can you make the claim that the 970 pipeline at 1/3 greater length than the POWER4 is only 'slightly' longer than the POWER4 pipeline? 1/3 greater seems to be a rather large step up in size to me. Perhaps if Intel increases the 20 stage pipeline of the Pentium 4 by 1/3 and releases a new Pentium processor with 26 pipeline stages, then you will also state that it has only a 'slightly' longer pipeline than the previous Pentium processor.

You've just proved my point: the reason the Alpha scales higher is that they are different architectures, in the sense that the way the Alpha was designed allowed it to scale far higher than the Pentium 2 despite having a much shorter pipeline.

Apparently, having a longer pipeline, which was the entire basis of your argument, doesn't mean that a processor will be able to scale higher, nor does having a shorter pipeline mean a processor won't be able to scale higher. That was the entire point of my previous two posts.

IBM has always released its fastest Power4 possible, and the current 1.7 GHz model has been around for quite a while. It's pretty reasonable to assume that it's the fastest Power4 possible with the current architecture and micron process.

The PPC970 has a 16 stage pipeline, the Power 4 has a 12 stage pipeline, putting this into context as well as considering what each stage actually does, I don't consider it very significant to scaling.

Steve Jobs has said that the PPC970 will scale to 3 GHz in 12 months(?). That makes sense, considering that moving to a .09 micron process from the current .13 micron 2 GHz part would result in exactly 3 GHz under the same 50% rule.

So far, you've been unable to substantiate or prove any of your claims about pipelining and now you're moving to articles. Do you care to provide a link?
 
The Reality

-The 970 is a superset of the Power4 core, not a subset. It contains additional instructions to allow for the quick migration of 32-bit OS's to 64-bits (namely, mapping virtual memory to physical memory, now that the processor can handle more RAM directly.) It contains an SIMD unit that Power4 does not. It can achieve higher frequencies. It contains a different Apple proprietary processor "coherent" interconnect technology.

- In an SMP environment the dual 970's cache technology behaves similarly to a dual core Power4. Namely, they combine each other's L1 and L2 caches, in effect equaling the size of the Power4 L1 & L2 caches. (Opteron does the same thing.)

- The RAM speed does not limit how far the processor speed can be advanced. Right now, the RAM effectively runs at 800 MHz and the 2 GHz G5's bus at 1 GHz. The bus is switched and gives every component a dedicated pipe. So the RAM can continue to run at 800 MHz while the G5 reaches an FSB of 1.6 GHz (3.2 GHz G5). Of course, it is always nicer to have ever faster RAM, but the speed of the system will still improve with a faster G5, especially when working with data in the L1 and L2 cache. Apple could also employ other techniques such as increasing the caches and adding superfast large L3 caches that would prefetch data from RAM. The L2 cache also has a prefetch mechanism that reduces the effect of slower RAM.

- The 970 could easily go to 3.2 GHz, probably within 6 months. Steve was being conservative, can you blame him? They are already able to overclock some of the 970s coming off the production line to 3.2 GHz. IBM always over-engineers.

- FP instructions have always been 64-bit, so OS X can already take advantage of the G5 64-bit FP units.

- Using the 970's new instructions for porting OS's to 64-bit, Panther will be able to easily address the entire 64-bit (42-bit address space) available to it. Jaguar 10.2.7 gives each processor 4GB each in a dual system, in the meantime.

- The 980's main design differences will be: 1. Shorter wire lengths within the core to increase speeds. 2. Hyperthreading (unlike Intel, the core can actually benefit from hyperthreading). 3. Better core power management, reducing power dynamically to less frequently used portions of the core. 4. .09 process, etc.
 
Re: Re: Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Analog Kid
Here's the Ars Technica link:
http://arstechnica.com/cpu/03q1/ppc970/ppc970-12.html

Note that they can't do much more than speculate either but it seems like they've put a lot of thought and research into it so I tend to trust their intuition.

Remember that the Power4 and 970 are being built on different processes. The Power4 is built on a more robust process for higher reliability, so even if it were identical to the 970 it would clock slower...

According to an article written by David Wang at RealWorldTech.com, an IBM spokesperson informed him that the 970 and Power4 have the same thickness of traces. IBM did not use a more robust process for the Power4 than the 970. So, the argument for a slower or more reliable POWER4 than a 970 due to a different process is moot.

I would trust that the IBM German press release had accurate information and that the 970 will reach 2.5 GHz on a .13-micron process.
 
NOT Unbelievable

In 1989, I bought a Mac IIcx. I was told it was state of the art. It ran at 16 MHz. I was told that if I was going to do graphics, I would need a bigger hard drive. Therefore, I bought a 40 MB.

In 1997, I bought a PowerMac 9600. It ran at 200 MHz (pre-G3). (16 to 200--or 12x-- in 8 years.) This time, it came with a 4 GB hard drive (100x).

In 2001, I bought a Quicksilver, operating with a blistering 867 MHz G4 processor. (>4x in 4 years or 54x in 12 years.) To boot, it came with a 60 GB hard drive. (15x larger in only 4 years, or a whopping 1,500x larger in 12!)
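
The growth factors can be recomputed directly from the figures quoted above. A quick Python sketch, assuming decimal drive sizes (1 GB = 1,000 MB), which is my assumption rather than anything stated in the post; the `growth` helper is just for illustration.

```python
machines = [
    # (year, model, clock in MHz, drive size in MB), figures as quoted in the post
    (1989, "Mac IIcx", 16, 40),
    (1997, "PowerMac 9600", 200, 4_000),      # 4 GB, assuming 1 GB = 1,000 MB
    (2001, "Quicksilver G4", 867, 60_000),    # 60 GB, same assumption
]


def growth(old: float, new: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


# Compare each machine with the one bought before it.
for prev, curr in zip(machines, machines[1:]):
    print(f"{prev[1]} -> {curr[1]}: "
          f"clock {growth(prev[2], curr[2]):.1f}x, drive {growth(prev[3], curr[3]):.0f}x")

# Compare the first machine with the last.
first, last = machines[0], machines[-1]
print(f"{first[1]} -> {last[1]}: "
      f"clock {growth(first[2], last[2]):.1f}x, drive {growth(first[3], last[3]):.0f}x")
```

Drive capacity grows far faster than clock rate in every interval, which only strengthens the point about never filling "THAT" hard drive.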

Oh, and after each purchase, I thought, "I'll NEVER fill THAT hard drive." Those who think that 10 or 12 or 25 GHz is too much will be amazed at the minimum requirements of the software when those machines are available.
 
Re: The Reality

Originally posted by stingerman


- The 970 could easily go to 3.2 GHz, probably within 6 months. Steve was being conservative, can you blame him? They are already able to overclock some of the 970s coming off the production line to 3.2 GHz. IBM always over-engineers.

That would mean the 970 would achieve the same top-end frequency as the Pentium 4 on the .13-micron process size, which is highly unlikely.



- The 980's main design differences will be: 2. Hyperthreading (unlike Intel, the core can actually benefit from hyperthreading).

You mean simultaneous multithreading because Hyperthreading is Intel's trademarked name. IBM has stated that their version of simultaneous multithreading for the upcoming Power5 will perform like two processors going full throttle. As of yet Intel has only been able to achieve up to a 30% speed increase from using HyperThreading.
 
Re: Re: Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be possible

Originally posted by Cubeboy

IBM has always released its fastest Power4 possible, and the current 1.7 GHz model has been around for quite a while. It's pretty reasonable to assume that it's the fastest Power4 possible with the current architecture and micron process.

The 1.7 GHz POWER4+ has only been out a few months, and before that there was a 1.4 GHz POWER4+ processor. So, IBM releases the fastest processor that they can at any given time, but that does not mean that it's the fastest POWER4+ possible with the current architecture and process size.

Steve Jobs has said that the PPC970 will scale to 3 GHz in 12 months(?). That makes sense, considering that moving to a .09 micron process from the current .13 micron 2 GHz part would result in exactly 3 GHz under the same 50% rule.

The first 9XX processors made on a .09-micron process will almost certainly not be the fastest that IBM will eventually obtain on that process size. It takes time for manufacturing to increase the frequency of a processor on a given process size, it doesn't happen right from the get go.

An example of that would be Intel's upcoming Prescott Pentium processors, which are expected to start at about 3.4 GHz and eventually peak at about 5 GHz over a year later.

Also the 970FX processor from IBM did not start out with a chip available at the top-end speed. IBM moved it up to that speed about a year later.

Motorola is expected to have 7457 processors for sale this month, and there will probably be availability for chips that run at the top speed of 1.3 GHz that is listed on Motorola's website. However, Motorola has been increasing the frequency beyond what's listed on the company's website for Apple's use. An internal Motorola document stated that the 7457 will peak at 1.8 GHz, yet don't expect that to appear immediately in an Apple computer. It will probably take some time for Motorola to get the chip up to that speed.

So far, you've been unable to substantiate or prove any of your claims about pipelining and now you're moving to articles. Do you care to provide a link?

I cannot provide the link to the IBM German website press release about the 1.8-2.5 GHz 970 processors since IBM took the link down and it wouldn't do you any good if you don't understand written German.
 
Re: Re: The Reality

Originally posted by Phinius
That would mean the 970 would achieve the same top-end frequency as the Pentium 4 on the .13-micron process size, which is highly unlikely.





You mean simultaneous multithreading because Hyperthreading is Intel's trademarked name. IBM has stated that their version of simultaneous multithreading for the upcoming Power5 will perform like two processors going full throttle. As of yet Intel has only been able to achieve up to a 30% speed increase from using HyperThreading.


Yes the 970 will achieve 3.2 easily before the 980 is released.

Intel's hyperthreading ideally is 30% faster but it also causes 30% speed reductions as well. It depends on the software. And most Wintel software does not behave well. That is why hyperthreading is usually turned off during benchmarks, as it usually slows down the system. As far as its name, I prefer to use the same terminology to avoid confusion, the very thing the marketers try to create by renaming each other's technologies. Actually an old IBM trick.

Hyperthreading type technology will work better on processors with a greater deal of instruction level parallelism (ILP). The whole purpose of hyperthreading is to convert threading into ILP. The theory is two threads can fill up the parallelism better than one thread. That is why the Intel P4 hyperthreading fails: the P4 is very low on ILP, the very thing necessary for hyperthreading to work. So instead of hyperthreading speeding up a P4, it creates another bottleneck in a lot of cases and slows down the whole system, as one thread in a wait state hogs a functional unit and all the instructions pile up behind it.

The Power4 and Power5 cores are ideally suited to hyperthreading and will in many cases double the number of instructions being processed in the same clock cycle!
 
G5 Cooling

How about this for some speculation: The G5 case has also been designed to work with liquid cooling systems. Take a close look at the processor cooling fins. It appears that Apple already includes an intake and outtake tube as part of the design. Of course, I haven't seen one in person, only via analyzing the pictures. But it should be easy to retrofit the very clean design with liquid cooling.
 
Re: The Reality

Originally posted by stingerman

(snip)
Jaguar 10.2.7 gives each processor 4GB each in a dual system...
(snip)

Pardon my ignorance, can someone unconfuse me here? I thought that memory space was shared between processors in an SMP system. Otherwise, wouldn't two threads running within the same process have trouble sharing data if they were assigned different processors? Or is such an assignment not allowed?
 
Re: Re: Re: Re: Re: Re: Judging by the 970 pipeline stages these speeds could be poss

Originally posted by Cubeboy
Apparently, having a longer pipeline, which was the entire basis of your argument, doesn't mean that a processor will be able to scale higher, nor does having a shorter pipeline mean a processor won't be able to scale higher. That was the entire point of my previous two posts.



I think his point is that given the same architecture, adding more (well designed and placed) pipe stages allows higher clock rates-- which is true.

Less logic between registers takes less time to propagate through. This is one way in which Intel has scaled the Pentium performance over the years.

If the 970 and the Power4 share the same basic architecture, which they do by all accounts, and the pipe stages are placed with the goal of increasing granularity in the critical timing paths (which I'm not sure they are) then you should be able to increase the clock rate of the chip.

This is not the only way to boost the clock rate, and it won't work without limitation, but staging the critical timing path does help.
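
To put rough numbers on that, here is a toy Python model, a sketch only. It assumes the total logic delay is split evenly across the stages and that every stage pays a fixed register overhead; both delay values are invented for illustration and are not measurements of the Power4, the 970, or any real chip.

```python
def max_clock_ghz(logic_delay_ns: float, stages: int,
                  latch_overhead_ns: float) -> float:
    """Clock limit when total logic delay is split evenly across `stages`.

    Each cycle must cover one stage's share of the logic plus a fixed
    per-stage register (latch) overhead.
    """
    cycle_ns = logic_delay_ns / stages + latch_overhead_ns
    return 1.0 / cycle_ns


LOGIC_NS = 12.0    # hypothetical total logic delay through the pipe
OVERHEAD_NS = 0.2  # hypothetical per-stage register overhead

f12 = max_clock_ghz(LOGIC_NS, 12, OVERHEAD_NS)
f16 = max_clock_ghz(LOGIC_NS, 16, OVERHEAD_NS)
print(f"12 stages: {f12:.2f} GHz")
print(f"16 stages: {f16:.2f} GHz")
print(f"Clock gain from 1/3 more stages: {100 * (f16 / f12 - 1):.0f}%")
```

In this model a third more stages always buys less than a third more clock, because the register overhead does not shrink with the logic. And that is the best case: if the new stages land outside the critical timing path, the gain is smaller still.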

Given what I've read about the architecture though, I don't know that boosting the clock rate over the Power4 was the reason for the added stages. It would seem like a lot of trouble to go through if your design goal was to save effort on the architecture development...

Based on where the stages were added, it looks as though placing the AltiVec block in a sub-optimal location on the die actually added a timing constraint in communicating the data to and from the main execution core. The stages may not have been added to boost the clock rate, but rather to keep the stages balanced so the clock rate didn't have to decrease...
 
Hmmmmmm....................?

Originally posted by Veldek
I'm quite sure that I read some days ago that IBM has a 970 working at 3.2 GHz dissipating 82W, which is as much as the Intel chips at the same frequency. So it doesn't seem impossible for them to reach the 3 GHz barrier with a 970.

Anyway, as IBM is working on the Power5 (the 980 will be a derivative of it, the G4 has nothing to do with either of them) and on the Power5+, I think we may expect a lot in the near future.

The P4 @ 3.2 GHz dissipates up to 100W, and can be used as a second oven.
 