Originally posted by barkmonster
I read something about the SPEC benchmarks before.
I can't remember where I read it or the exact wording, but it said that SPEC benchmarks are extremely suspect.
Yeah, and I'll bet you anything that what you read came from a poster who was a Mac zealot.
A few reasons spring to mind.
Compiler Optimisation - obviously this has already been mentioned.
SPEC is a test suite designed to measure system speed across platforms. It is written in standard, portable languages - C, C++ and Fortran. It is actually not a single test, but many smaller, separate tests whose results are combined into one score. The tests are designed to accurately reflect the real-world tasks a CPU is required to accomplish.
System vendors compile the SPEC test suite with whichever compiler they choose. They are not allowed to modify the test suite. They then run the test suite, and the test suite outputs the result. That's all there is to it. The result is a function of CPU speed and compiler efficiency. It provides the most accurate reflection yet devised of application speed across platforms.
SPEC is the most widely-accepted benchmark there is, adopted by EVERY major computer company EXCEPT Apple - HP, Sun, IBM, Compaq, Dell, Intel, SGI - you name it. (Why Apple refuses to publish SPEC scores and instead chooses to publish ludicrous Photoshop "benchmarks" is anyone's guess, but the real reason - that the G4 performs downright embarrassingly in SPEC_CPU2000 - seems pretty obvious to me.)
Perfect Instructions - This is the big one: it means that the branch predictor handling that 19-stage pipeline on a Pentium 4 NEVER misses an instruction! Hardly real-world.
But it IS real-world, because it is a testament to the EXCELLENT compiler Intel has written for the P4. Intel was able to write a compiler that takes an ordinary task and lays it out for the execution unit just perfectly.
By contrast, the PPC compiler used in the SPEC test suites is complete sh*te. You say this isn't fair, but it IS fair, because real-world performance is a function of not only theoretical CPU performance but the code generation of the compiler. The only way this wouldn't be the case is if all developers wrote their software in assembly only. So Apple can go on all it wants about its gigaflops and its "Pentium 4-crushing" performance and so on, but the fact will remain that all that is little more than marketing drivel describing the peak theoretical performance that is not even close to being accurate in reality.
No, the SPEC test (on the benchmarks I've seen) does not account for AltiVec. This is because the SPEC test suite is written in platform-independent code that is not optimized for ANY individual processor. Intel is lucky that it has a great compiler that can optimize for the P4's SIMD units automatically. Once Apple incorporates a compiler capable of auto-vectorizing into OS X (presumably GCC 3.x), the FP side of the results will improve quite a bit. But that hasn't happened yet. (They will need a 4-fold FP performance improvement to even come close to the 2.5GHz P4.) And keep in mind AltiVec is not double-precision, or at least it won't be until the G5+ gets here.
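For anyone wondering what "platform-independent code" that a compiler can vectorize on its own looks like, here is a rough sketch. It is not taken from the SPEC suite; the function name and signature are my own. Nothing in it mentions SSE2 or AltiVec. Whether it ends up as SIMD instructions is entirely up to the compiler and the options it is given.

```c
#include <stddef.h>

/* Plain portable C: y[i] = a * x[i] + y[i] on single-precision floats.
   The restrict qualifiers promise that x and y do not overlap, which
   removes one common obstacle to a compiler processing several
   elements per instruction. No intrinsics or platform headers used. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

A compiler that cannot auto-vectorize will still build this correctly, just one float at a time. That is the gap being described: same source, very different generated code.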
Small Instructions - Helpfully for those CPUs with tiny amounts of cache RAM, the instructions used in the SPEC benchmark are designed to fit in those small amounts of cache, eliminating the bottleneck of RAM while giving another misleadingly high score to inefficiently designed CPUs.
This is simply not true. The whole point of SPEC is to accurately measure real-world performance across platforms. If you've got a problem with the way the G4 performs and you think the Pentium 4's results are inaccurate, then please explain why the 1GHz G4 gets beaten badly not just by the Pentium 4, but also by such chips as:
- The 400MHz MIPS R12000
- The 500MHz MIPS R14000
- The 750MHz PA-8700
- The 833MHz Alpha 21264B
All of which have huge secondary and tertiary caches.
This means in effect that SPEC benchmarks are about as real-world as throwing two computers off a bridge and declaring the one that sinks first the winner.
*cough*
In reality the Pentium 4 is held back by the lack of L3 cache and the lack of a barrel shifter (reverses numbers in 1 instruction, even the 386 had one!), and the huge pipeline means that even with a highly efficient branch predictor the Pentium 4 spends more time waiting for instructions than executing them. I've generalised in this last paragraph by the way; some of the info is from emulators.com
You almost sound as if you actually know what you're talking about here.
Alex