Short Pipeline is better. Long pipeline is better.

Discussion in 'General Mac Discussion' started by acj, Aug 2, 2003.

  1. acj
    macrumors 6502

    Joined:
    Feb 3, 2003
    #1
    How do you know what to believe, and when?

    The G4's 4-stage, and later 7-stage, pipeline was always advertised as a strong point compared to the Pentium 4, but now the G5 has a 16-stage integer pipeline, a 21-stage FP pipeline, and up to 25 stages for the Velocity Engine.

    Now they mention over 200 simultaneous instructions compared to the P4's 126, but they never mentioned this with the G4's measly 16. Obviously the P4 wasn't over 7 times faster than the G4.

    It really seems that nothing really matters and if speed is all that's important to you, you need to compare computers side by side with what YOU do and see what's best for yourself.
     
  2. macrumors 604

    iJon

    Joined:
    Feb 7, 2002
    #2
    Ah, come on, we all know this. The whole MHz myth was just supposed to be music to the ears of customers who wanted a computer for speed when Apple was getting their butts kicked by Intel and AMD.

    iJon
     
  3. macrumors 68040

    Powerbook G5

    Joined:
    Jun 23, 2003
    Location:
    St Augustine, FL
    #3
    Exactly. All you need to know is that the G5 is one higher than the P4. Why would you want a mere 4 when you can have a 5? :D
     
  4. Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #4
    When the 604 and the P2 went head-to-head, the 604 was better by quite a margin. When the P3 arrived, the margin narrowed for the 604e.

    The G4 is just a poorly designed economy processor with an above average SIMD unit.

    A short pipeline is better when the processor guesses a branch incorrectly, because it has to flush everything in the pipeline and start over. If it guesses correctly, a long pipeline helps because all the instructions/data are already available and can keep the processor going at full pace.
     
  5. macrumors 6502a

    simX

    Joined:
    May 28, 2002
    Location:
    Bay Area, CA
    #5
    Re: Short Pipeline is better. Long pipeline is better.

    It all really depends, and your last statement is pretty much right on target.

    From my (very limited) understanding, longer pipelines let you keep more instructions in flight at once and run at a higher clock speed, but they are a drawback when "bubbles" appear in the pipeline. A "bubble" is what you get when an instruction requires the result of another instruction, so it has to wait for that other instruction to finish... and a long pipeline means the bubble has a bigger effect on efficiency, because the wait is longer.
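
    To make the bubble idea concrete, here is a minimal sketch of a toy in-order pipeline. Everything in it is an assumption for illustration, not how the G4, G5, or P4 actually work: one instruction issues per cycle, there is no result forwarding, and a result only becomes usable after the full pipeline depth. The name total_cycles is made up for this example.

```python
def total_cycles(deps, result_latency):
    """Cycles to finish a straight-line sequence on a toy in-order pipeline.

    deps[i] is the index of the earlier instruction that instruction i
    waits on, or None. Assumed (hypothetical model): one instruction issues
    per cycle, no forwarding, and a result is usable result_latency cycles
    after the producing instruction issues.
    """
    issue = []
    for i, dep in enumerate(deps):
        # Normally an instruction issues one cycle after the previous one.
        earliest = issue[i - 1] + 1 if i else 0
        if dep is not None:
            # A dependent instruction must also wait for its input.
            earliest = max(earliest, issue[dep] + result_latency)
        issue.append(earliest)
    return issue[-1] + result_latency


independent = [None, None, None, None]
dependent = [None, 0, None, None]  # instruction 1 needs the result of 0

for depth in (7, 16):  # stage counts quoted in the first post
    bubble = total_cycles(dependent, depth) - total_cycles(independent, depth)
    print(depth, "stages: the dependency costs", bubble, "extra cycles")
```

    In this toy model the bubble grows with the depth. Real processors shrink it a lot with forwarding and out-of-order execution, which is why the stage count alone doesn't settle anything.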

    But like you said, these drawbacks can be overcome by other design considerations, so it's best just to compare real-world performance in applications you use.
     
  6. macrumors 604

    MrMacMan

    Joined:
    Jul 4, 2001
    Location:
    1 Block away from NYC.
    #6
    Yes, this is why you can't properly calculate whether a shorter or a longer pipeline is better.
     
  7. macrumors 68040

    Powerbook G5

    Joined:
    Jun 23, 2003
    Location:
    St Augustine, FL
    #7
    I wonder how the branch prediction on the G5 will affect its longer pipeline. I know Steve and the IBM guy both said it predicts correctly a good 90% of the time or so, but for that 1 in 10, the error is going to be felt more on a longer pipeline, isn't it?
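
    A rough back-of-the-envelope sketch of that question follows. The assumptions are loud ones: the 20% branch fraction is a made-up illustrative figure, and the flush penalty is taken to equal the full stage count, which overstates it on real hardware. Only the 90% accuracy and the 7 and 16 stage counts come from this thread. The function name flush_cycles_per_1000 is hypothetical.

```python
def flush_cycles_per_1000(branch_fraction, mispredict_rate, flush_penalty):
    """Cycles lost to pipeline flushes per 1000 instructions.

    Hypothetical model: every mispredicted branch throws away
    flush_penalty cycles of work, and nothing else stalls.
    """
    mispredicted_branches = 1000 * branch_fraction * mispredict_rate
    return mispredicted_branches * flush_penalty


BRANCH_FRACTION = 0.20   # assumption: 1 in 5 instructions is a branch
MISPREDICT_RATE = 0.10   # "predicts correctly a good 90% of the time"

for stages in (7, 16):   # G4 (later models) and G5 integer pipeline
    lost = flush_cycles_per_1000(BRANCH_FRACTION, MISPREDICT_RATE, stages)
    print(stages, "stages: about", round(lost), "cycles lost per 1000 instructions")
```

    So yes, under these assumptions the same 1-in-10 miss costs more than twice as much on the longer pipeline. Whether that matters depends on how much extra clock speed the longer pipeline buys.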
     
  8. macrumors 68030

    Catfish_Man

    Joined:
    Sep 13, 2001
    Location:
    Portland, OR
    #8
    Pipeline depth IS important, but it's not the final word in performance. When balanced out by a good cache, good out-of-order execution (oooe), and good branch prediction (like a P4 or G5), lengthening the pipeline is an effective way of increasing performance (by raising the clock frequency). In a processor like the G4, with little to no oooe (because embedded programs tend to be hand-scheduled, and oooe increases power consumption), and only mediocre branch prediction (because of the short pipeline), lengthening the pipeline would probably have been a bad idea. It would have DEFINITELY been a bad idea for its target market (high-end embedded), which is notoriously latency and power conscious. The G4 and G4+ served their intended purpose quite well, although the massively delayed transition to 0.13 micron and the lack of an on-chip memory controller are beginning to hamper them. The fact that they made pretty decent P3 killers was (mostly) just an added bonus.
    The G5, on the other hand, seems squarely targeted at Xeons and Opterons (and to a lesser extent P4/AthlonXP/Athlon64), which is just about perfect for Apple's purposes. It's designed in a fairly similar way to them, in certain respects. It has a long pipeline, with extensive oooe and large caches. This allows it to scale much higher than the G4, and makes it better suited to running poorly optimized code (which is what most code is). The downside is that it has significantly higher power consumption and manufacturing cost than a G4 made on the same manufacturing process.
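
    The clock-versus-depth tradeoff described above can be sketched with a toy model. All the numbers here are hypothetical: the clock speeds are round illustrative values rather than real G4 or G5 specs, the 20% branch fraction and 10% mispredict rate are assumptions, and the model ignores caches and oooe entirely. The function name ns_per_instruction is made up.

```python
def ns_per_instruction(clock_ghz, depth, mispredict_rate, branch_fraction=0.2):
    """Average time per instruction in a toy pipeline model.

    Hypothetical model: one instruction per cycle when nothing goes wrong,
    plus a flush costing `depth` cycles on every mispredicted branch.
    """
    cycles_per_instruction = 1.0 + branch_fraction * mispredict_rate * depth
    return cycles_per_instruction / clock_ghz


# Illustrative values only, not measurements of any real chip.
shallow = ns_per_instruction(clock_ghz=1.0, depth=7, mispredict_rate=0.10)
deep = ns_per_instruction(clock_ghz=2.0, depth=16, mispredict_rate=0.10)
deep_same_clock = ns_per_instruction(clock_ghz=1.0, depth=16, mispredict_rate=0.10)

print("shallow pipeline:         ", round(shallow, 2), "ns per instruction")
print("deep, doubled clock:      ", round(deep, 2), "ns per instruction")
print("deep, no clock advantage: ", round(deep_same_clock, 2), "ns per instruction")
```

    In this sketch the deep pipeline only wins when it actually buys a higher clock. At the same clock it is strictly worse, which fits the point that lengthening the G4's pipeline without the supporting features would have been a bad idea.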
     
  9. macrumors 6502a

    Fender2112

    Joined:
    Aug 11, 2002
    Location:
    Charlotte, NC
    #9
    One analogy that stuck with me described the 970 like this: The P4 has a long and narrow pipeline. The G4 has a short and wide pipeline. By comparison, the G5 (970) has a long and wide pipeline. I don't know what this means in terms of stages or in-flight instructions. This description gave me a mental image that makes the G5 seem like the best of both designs.
     
  10. acj
    thread starter macrumors 6502

    Joined:
    Feb 3, 2003
    #10
    Fender:

    I think that's fairly accurate. Time will tell. Most of us haven't actually used a G5.
     
  11. macrumors 68030

    Catfish_Man

    Joined:
    Sep 13, 2001
    Location:
    Portland, OR
    #11
    This is true, but a number of compromises elsewhere in the design were made to achieve this. Tracking the execution of 200+ instructions individually would be prohibitively difficult, so they divided them into groups of 5 and tracked 40 groups instead. This allows them to keep the complexity down to a manageable level, but adds a number of restrictions to how instructions can be dispatched and retired. Overall, I think this was a good tradeoff (2 integer, 2 floating point, 2 load/store, and 4 vector, with a long pipeline, is very impressive), but it's going to be a bitch for the compiler writers.
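
    Here is a minimal sketch of the bookkeeping idea, using the 200 instructions, groups of 5, and 40 groups from the paragraph above. The retirement rule in it (a group retires only when every instruction in it is done, and groups retire strictly in order) is a simplified assumption for illustration, and the function names are hypothetical.

```python
def groups_needed(in_flight, group_size):
    """Number of tracking entries when instructions are tracked per group."""
    return -(-in_flight // group_size)  # ceiling division


def retirable_groups(done, group_size):
    """Count leading groups that can retire.

    Assumed rule (simplification): groups retire in order, and only when
    every instruction in the group has finished.
    """
    retired = 0
    for start in range(0, len(done), group_size):
        if all(done[start:start + group_size]):
            retired += 1
        else:
            break  # an unfinished group blocks everything behind it
    return retired


print(groups_needed(200, 5), "entries instead of 200")

# Three groups of 5; one slow instruction in the second group.
done = [True] * 5 + [True, True, False, True, True] + [True] * 5
print(retirable_groups(done, 5), "group(s) can retire")
```

    The example shows the kind of restriction involved: one unfinished instruction holds back its whole group and the finished group behind it, so how the compiler arranges instructions into groups starts to matter.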
     
  12. macrumors 68000

    Mav451

    Joined:
    Jul 1, 2003
    Location:
    Maryland
    #12
    Hey Fender: so what kind of pipeline do the AthlonXP and Opteron have?

    I'm just wondering since I went to that Ars Technica site and didn't understand a single word of what they said :(
     
  13. macrumors 6502a

    Fender2112

    Joined:
    Aug 11, 2002
    Location:
    Charlotte, NC
    #13
    Those are clogged pipelines. You want to borrow some of my Drano? :) Seriously though, I don't know. This was an analogy that helped explain the difference between PPC and x86. That Ars Technica article is a bit above my head, but I did follow the gist of it.
     
