Pretty much confirmed: Pixar to switch

Cubeboy · Aug 1, 2003

Originally posted by JBracy
Then why did SGI stop selling Intel/Linux based workstations and just release new MIPS/IRIX based ones? The SGI Tezro

If they don't want new customers for their MIPS/IRIX platform then why spend the R&D on new systems?

If they want to move them to a new platform then they need to actually sell a platform to move them to!

It's like saying that Apple doesn't want new customers they only want to support the ones they already have!

SGI stopped selling Intel/Linux based workstations because sales were very poor, simple as that. SGI has had MIPS based workstations before, they just made the mistake of trying to impose an Intel based solution on their customers too quickly and too soon (and this goes back to my original point). Apparently the customers weren't very happy with this, and responded by moving to other vendors as a result.

If you look at the specifications of the Tezro, there's really nothing new about it: it still uses the same MIPS R-16K processor (yawn), DDR memory, SCSI hard drives. It's certainly not going to get SGI any new customers considering the performance and price. I'd say it's just SGI's attempt to woo back some of its old customers.

Cubeboy · Aug 1, 2003

Re: Re: Re: Re: Re: Re: Re: Renderman

Originally posted by Tychay
Not true. Show me that "using the best compiler" is "fair"--most people think that doing so is part of what is wrong with benchmarking in general. Show me the evidence that any version of GCC is better on PPC than x86--most experts think the reverse is true and that, at best, GCC3.3 is a little closer to platform parity than earlier versions. Show me the "ton of tweaks" Apple used--there were only two, both may have been necessary to get the thing to compile at all.

You are confusing IBM's intention to submit tweaks to GCC (most of which involve the fact that the 970 has an interesting way of grouping code that is unique to that processor and a rumor that they may submit autovectorization a la ICC) none of which were actually submitted and none of which would necessarily be accepted due to GCC's design goals (portability), and Apple's misguided intent to "normalize out" the compiler (whatever that means).

The only thing underhanded here is how so many news sites spread such ill-informed B.S. about Apple "juicing" their benchmark that now it is accepted conventional wisdom.

Actually, we already know that the MD file used was modified from the default (which was wrong). So apparently, the "no tweaks over default during testing" statement is false. This leads me to doubt that there weren't any tweaks, especially considering Veritest specifically listed the version of GCC as a specific build as opposed to just GCC.

From what I understand, IBM and Apple spent a considerable amount of time optimizing the scheduler for the PPC970.

"The gcc scheduler is not really designed ideally for a processor like the 970 and the Power4 and others, and that's a lot of what the IBM and Apple teams have worked on".

GCC is not able to schedule for the Pentium 4 at all, this is really quite important mind you considering nearly all floating point code needs to be well scheduled.

Many of the commands in GCC such as -march (which was used in the test) are normalized towards RISC chips, this means that the Pentium 4 (and most other x86 cpus for that matter), with its "measly" eight registers, will probably be better off without them. I know this for a fact with -march=pentium4 which is normalized for a 32 register RISC chip.

Do you know exactly how much autovectorization improves Pentium 4 ICC performance over x87 only code in SPECfp?

Can you actually give *ANY* evidence of exactly *HOW* Intel might cheat with ICC?

Cubeboy · Aug 1, 2003

Re: Re: Re: Re: RenderMan Results?

Originally posted by tychay
Standard practice is to report the best result using any compiler/tweak/OS necessary to achieve it--you cannot modify the hardware though since SPEC is a system test, not a CPU one.

Apple's result should not be listed, or listed with an asterisk since the actual number would be much higher.

terry

SPEC's benchmark suite consists of lots of kernels and all the programs have a memory footprint of between 100 and 200 MB so it's more about cpu/memory subsystem/compiler than anything else. Base score allows for only a single set of flags and limits the number of flags to four. Assertion flags also may not be used. Optimizations must generate correct code and improve performance on a class of programs, SPEC's source code cannot be changed (making it impossible to add hints) and contrary to popular belief, you can't manually change the machine output. Need I go on?

Considering IBM's preliminary SPEC scores for the PPC970, I doubt the scores would be much higher.

jettredmont · Aug 1, 2003

Re: Re: Re: Re: Re: Re: Re: Renderman

Originally posted by tychay
Not true. Show me that "using the best compiler" is "fair"--most people think that doing so is part of what is wrong with benchmarking in general.

The whole idea of SPEC is to say:

"You (who are about to spend $20-30k+ on a server and thus have the time and money to buy the best compiler out there and use the right switches): this is the best performance you can get on this particular hardware."

No, SPEC benchmarks have absolutely nothing to do with how fast MS Word runs or how fast Safari/IE runs, or even how fast Photoshop might run. SPEC is very server-oriented, and is based on the assumption that whatever you are going to be running on this investment will either be your own code compiled on the best compiler out there, or it will be someone else's code compiled on the best compiler out there.

Now, that's the crux of the problem for Apple: there's really no "great" compiler out there for PPC, certainly not for the G5! Yes, CodeWarrior traditionally bests gcc in performance, and perhaps that would hold true for the G5 as well (no guarantees), but even if CW could get a good 10% gain on gcc's best settings, the G5 will still end up beneath the high-end P4 boxes on SPEC.

So, Apple decided to go the "same compiler" route. At least, gcc sucks on Intel too (relative to Intel's compiler).

Now, yes, ICC does contain SPEC-specific optimization by most accounts. And, yes, that's what most folks would call "cheating". However, there's no denying that the P4 can and will (with Intel's compiler) run through the SPEC benchmarks really darned fast (a good bit faster than Apple was able to get it through using gcc).

Now, to give a bit of help to Apple:

1) Most production software is NOT compiled on ICC (it is generally compiled on MS's compiler, which still beats gcc in my experience but not as soundly as ICC can on SPEC). GCC is what most folks on Mac will be using (if not Code Warrior, which gets better results generally).

2) ICC isn't cheap. IBM's G5 compiler (were they to create one for public consumption) wouldn't be cheap either. GCC is really cheap.

3) While gcc has a heck of a lot of room for improvement wrt the G5, ICC ain't getting much better than it already is for the P4. In other words, the next generation of gcc might actually make the hardware run as fast as it should.

Is this what is wrong with benchmarks? Yeah, kind of. Even SPEC really falls apart on cross-platform comparisons. It's better than most, though.

On the other hand, while many deride "bake-offs", they do tend to be closer to what consumers should be looking at. How fast can I get my work done on Machine A vs Machine B? Hands-on testing is obviously the best way to judge this, but bakeoffs administered and defended by the authors of cross-platform software get pretty close (as opposed to bake-offs like we've seen in the past where Steve picks ten Photoshop filters where the G4 shines and shows how well the G4 does on those specific filters ...)

Show me the evidence that any version of GCC is better on PPC than x86--most experts think the reverse is true and that, at best, GCC3.3 is a little closer to platform parity than earlier versions.

Which is exactly the problem: while gcc on PPC isn't more optimized for the hardware than gcc for Intel (I strongly agree with you there!), gcc for PPC is, well, pretty much all we've got. Aside from Code Warrior, of course, which isn't nearly as much better than gcc as, say, ICC is on Intel-based SPECs.

The only thing underhanded here is how so many news sites spread such ill-informed B.S. about Apple "juicing" their benchmark that now it is accepted conventional wisdom.

Ah, but conventional wisdom is never either conventional or wisdom. Well, maybe conventional ...

tychay · Aug 1, 2003

Re: Re: Re: Re: Re: Re: Re: Re: Renderman

Originally posted by Cubeboy
From what I understand, IBM and Apple spent a considerable amount of time optimizing the scheduler for the PPC970.

Do you know exactly how much autovectorization improves Pentium 4 ICC performance over x87 only code in SPECfp?

Can you actually give *ANY* evidence of exactly *HOW* Intel might cheat with ICC?

No, I don't know the exact amount of how much autovectorization improves ICC. However, the latest GCC performance was roughly comparable with ICC until the point where SSE replaces the FPU. Also, the head of Pixar technology is quoted on Intel's website (see my earlier post that was actually on topic) as saying Renderman received a 50% gain vs. GCC. Since Renderman is highly FP-intensive I'd say that almost all the evidence points to the fact that the improvement is "significant" for floating point tasks and benchmarks.

The reason GCC doesn't autovectorize P4 code is either that no such optimization was submitted, or that the GCC team felt the optimizations that were submitted went against the chief goal of portability. The latter seems just as valid as the former because the FPU goes "off" when the SIMD unit turns "on" in x86 because of a physical limitation with no general design reason behind it.

BTW, how is Intel using ICC cheating, and when have I ever said so? It is perfectly legal and the accepted practice to use the best compiler/best optimization/custom OS to produce the highest SPEC when reporting. This is not cheating. It is, however, still misleading to report Apple's benchmark (which isn't true to SPEC's accepted practice) next to true SPEC results, just as it is misleading not to mention how Veritest is not doing a true SPECmark when benchmarking for Apple--that doesn't mean that Apple's benchmarks are "juiced", evidence points to the opposite being true.

This means that the G5 could have looked significantly better had IBM IntelliAge or a hacked version of CodeWarrior been used. As much of a gain as ICC on x86? No way! One is the definition of a mature platform; the other doesn't even have a system out the door! That is Apple's deception. As the person you quoted said after the WWDC keynote, "Anybody who purchased a computer based on benchmarks deserves to be taken for a few grand."

The quote you mention actually deals with the machine description file which describes "what the CPU looks like" to the compiler and the issue involved in the quote was how the numbers in it for the G5/970 weren't exactly in line with the actual numbers in the processor. The reason, I gathered, is that the grouping of 5 model used by the 970 to keep track of so many instructions in flight does not mesh well with the way GCC models what a CPU actually looks like. Hence the numbers have been tweaked to deceive the compiler to schedule more efficiently. I'd be hard pressed to believe that such a tweak won't be accepted by the GCC team since it is the CPU specific file being provided for the existing GCC compiler model. (This was along a different thread in which a number of rabid anti-PPC people were accusing IBM of catering to Apple's irrational obsession with secrecy by misrepresenting numbers in their preliminary MD files provided to GCC, which turned out not to be the case.)

This is a far cry from tweaking the GCC compiler model itself which is what the original poster implied! IBM has announced plans to introduce some optimizations, which it is up to the GCC team to accept or not. Since GCC's design goal is portability, the acceptance of such patches depends on how the patch is framed. Apple, in any case, will probably supply those patches with Developer Tools/XCode even if they are not accepted so there is little practical difference to the Mac developer whichever case turns out.

I don't think there is any evidence that the two tweaks that Apple actually put into GCC during the oft-quoted, much-maligned Veritest benchmark are related to changing the compiler model and would not be accepted as standard "what is necessary to make GCC portable to the G5" patches. However, it seems a lot of press would have you believe otherwise and further cloud the issue by misinterpreting quotes such as the one you give.

In any case, this example as well as the measly register space of the x86 FPU are both examples which prove the rule: the compiler plays a huge role in the benchmark and different compilers have different design goals.

For someone like me, I only care about the GCC benchmark as a programmer/researcher (or a CodeWarrior vs. Visual Studio one if I'm running apps). A Java developer has different criteria. Obviously SPECmark is not for either of us. Pixar is the opposite because they can choose to compile their products on whatever platform they want, can actually afford to purchase ICC, and their codebase is similar to many of the operations in the SPECfp suite. The fact (not rumor!) that Pixar is migrating to Mac OS X workstations and the preliminary RenderMan benchmarks coupled with the reality that the 970 is an immature and unoptimized platform bode well for the future of the G5 PowerMac as a graphics workstation and the IBM PowerPC 970 blades (not Macs!) in the renderfarm.

But, to bring this discussion back on topic. There seem to me to be two Pixars: the Pixar that is a movie studio and has a huge renderfarm and workflow that involves internally-developed and externally-purchased products; the Pixar that sells RenderMan. The former is migrating to Mac OS X on the desktop and probably going to keep Intel P4 with Linux for rendering for quite some time; the latter is testing the waters to see if anyone else is interested in a Mac OS X version of their product.

Take care,

jettredmont · Aug 1, 2003

Re: Re: Re: Re: Re: Re: Re: Re: Renderman

Originally posted by Cubeboy
Actually, we already know that the MD file used was modified from the default (which was wrong). So apparently, the "no tweaks over default during testing" statement is false. This leads me to doubt that there weren't any tweaks, especially considering Veritest specifically listed the version of GCC as a specific build as opposed to just GCC.

Veritest tested Apple's GCC code. No, it wasn't pulled down off the Web because Apple couldn't submit their G5 gcc tweaks to the main branch until the G5 "existed"!

"Special tweaks" would be code that was specifically introduced for the testing, which will never make it into the mainline gcc branch. There is no evidence of such tweaks, and a flat denial from Apple that any such tweaks were made. Thus, your assertion that Apple must have made such tweaks anyways says a bit more about yourself than about Apple.

From what I understand, IBM and Apple spent a considerable amount of time optimizing the scheduler for the PPC970.

"The gcc scheduler is not really designed ideally for a processor like the 970 and the Power4 and others, and that's a lot of what the IBM and Apple teams have worked on".

GCC is not able to schedule for the Pentium 4 at all, this is really quite important mind you considering nearly all floating point code needs to be well scheduled.

WHAT??? If gcc were unable to schedule for the P4 you would not be able to run P4 code. That's just plain stupid.

Now, back to the amount of time Apple and IBM spent fine-tuning the G5 scheduling code: that amount of time absolutely pales in comparison to the amount of time spent (by IBM amongst others) fine-tuning the gcc scheduling for the P3 and P4!

Do you know exactly how much autovectorization improves Pentium 4 ICC performance over x87 only code in SPECfp?

Can you actually give *ANY* evidence of exactly *HOW* Intel might cheat with ICC?

Evidence? Do your homework. Intel's compiler picks up a few specific patterns for auto-vectorization. Such patterns just happen to be in bottleneck portions of SPECfp. And, they aren't abundant in user code.

You can call that a fortunate coincidence, or you can call it a cheat.

Compare the results of a gcc vs icc bakeoff on SPEC (massive differences) with the results of a gcc vs icc bakeoff on application-based server benchmarks (where they come out pretty much in a dead heat, only a slight margin of victory for icc), and you have yet another bit of evidence that Intel's compiler is geared towards the SPEC benchmark, not towards its users.

Cubeboy · Aug 1, 2003

Re: Re: Re: Re: Re: Re: Re: Re: Re: Renderman

Originally posted by jettredmont
Veritest tested Apple's GCC code. No, it wasn't pulled down off the Web because Apple couldn't submit their G5 gcc tweaks to the main branch until the G5 "existed"!

"Special tweaks" would be code that was specifically introduced for the testing, which will never make it into the mainline gcc branch. There is no evidence of such tweaks, and a flat denial from Apple that any such tweaks were made. Thus, your assertion that Apple must have made such tweaks anyways says a bit more about yourself than about Apple.

Did I ever say "special tweaks"? NO, you just assumed I did and made a circular argument out of it right? That the MD file was modified obviously demonstrates that there was at least some modification during and after WWDC that wasn't there before WWDC which was the whole point of my statement. You should at least understand that.

WHAT??? If gcc were unable to schedule for the P4 you would not be able to run P4 code. That's just plain stupid.

Since *WHEN* did not having a scheduler ever make it impossible to run code? My god, this is just so wrong.

Have you EVER looked at GCC's source code? Let's see, we have scheduling for Pentium, Pentium Pro, Pentium 2, Pentium 3, K6, K7, and K8 BUT NOT THE PENTIUM 4. Now it might be that Intel did not release enough information to write a remotely good scheduler description but THAT DOES NOT CHANGE THE FACT THAT THE PENTIUM 4 DOES NOT HAVE A SCHEDULER!
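The disagreement above turns on what a scheduler does: it reorders independent instructions so that fewer cycles are spent waiting on results, and without one the code still runs, only more slowly. Here is a toy sketch of that idea. Everything in it is an assumption made for illustration: a single-issue, in-order pipeline, made-up latencies, and a hypothetical `Instr` type. It is not GCC's scheduler and not a model of the Pentium 4, which is an out-of-order design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str      # label for the instruction and the value it produces
    deps: tuple    # names of instructions whose results it needs
    latency: int   # cycles until its result is available (made-up values)


def total_cycles(order):
    """Cycles needed to run `order` on a single-issue, in-order pipeline."""
    ready = {}      # instruction name -> cycle its result becomes available
    next_issue = 0  # earliest cycle the next instruction may issue
    finish = 0
    for ins in order:
        # An instruction issues once the pipeline is free and its inputs exist.
        start = max([next_issue] + [ready[d] for d in ins.deps])
        ready[ins.name] = start + ins.latency
        next_issue = start + 1
        finish = max(finish, ready[ins.name])
    return finish


# Two independent load/add chains (hypothetical instructions).
load_a = Instr("load_a", (), 3)
add_a = Instr("add_a", ("load_a",), 1)
load_b = Instr("load_b", (), 3)
add_b = Instr("add_b", ("load_b",), 1)

# Source order: each add stalls waiting for the load just before it.
unscheduled = [load_a, add_a, load_b, add_b]
# Reordered: the second load fills the stall slot of the first.
scheduled = [load_a, load_b, add_a, add_b]

print(total_cycles(unscheduled), total_cycles(scheduled))
```

Both orderings compute the same values; only the cycle count differs, which is the sense in which missing scheduling data costs performance rather than correctness.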

Now, back to the amount of time Apple and IBM spent fine-tuning the G5 scheduling code: that amount of time absolutely pales in comparison to the amount of time spent (by IBM amongst others) fine-tuning the gcc scheduling for the P3 and P4!

Okay, this is just flat out wrong. Almost all the time, gcc is tuned by people from the CPU vendor for a particular CPU. Intel does NOT do this; they would rather spend their time tuning ICC, and for good reason.

Honestly have you *EVEN* bothered to look at the ChangeLog? Let's see, we have a bunch of K7 only tuning, a bunch of K8 only tuning, some PIII only tuning, ALMOST NO P4 ONLY TUNING and trust me the P4 depends on optimizations A LOT MORE than any of these other cpus. Honestly where do you come up with this sheer and utter BS?

Evidence? Do your homework. Intel's compiler picks up a few specific patterns for auto-vectorization. Such patterns just happen to be in bottleneck portions of SPECfp. And, they aren't abundant in user code.

You can call that a fortunate coincidence, or you can call it a cheat.

I did do my homework, perhaps you should do yours.

http://www.aceshardware.com/Spades/read.php?article_id=25000196

http://developer.intel.com/technology/itj/q12001/articles/art_2.htm

Looking over the documentation, on ICC/IFC 5.0, SSE2 only improved performance 5% over x87 only code in SPECfp. Consider this, a 2 GHz Pentium 4a with code compiled by ICC 7.0 scored 4% better than the same Pentium 4 with code compiled by ICC/IFC 5.0, which according to the documentation would score 5% better than the same processor running x87 only code produced by the same compiler. Thus a 3 GHz Pentium 4 running ICC/IFC 5.0 that's not producing ANY packed/scalar SSE2 code would STILL score over 1000 in SPECfp. Now can you honestly tell me exactly HOW much auto-vectorization improves SPEC?
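The arithmetic behind that claim can be sketched as follows. The 4% and 5% figures are the ones quoted in the post; the function name and the example input score are hypothetical and are not published SPEC results. The sketch only compounds the two gains and does not model clock-speed scaling from 2 GHz to 3 GHz.

```python
# Gains as quoted in the post (the poster's reading of the cited documentation).
ICC7_OVER_ICC5 = 1.04  # ICC 7.0 vs. ICC/IFC 5.0
SSE2_OVER_X87 = 1.05   # ICC/IFC 5.0 with SSE2 vs. x87-only code


def x87_only_estimate(icc7_score):
    """Strip both quoted gains from an ICC 7.0 score (hypothetical helper)."""
    return icc7_score / (ICC7_OVER_ICC5 * SSE2_OVER_X87)


# The two gains compound to roughly 9.2%.
combined = ICC7_OVER_ICC5 * SSE2_OVER_X87

# ICC 7.0 score needed for the x87-only estimate to stay at 1000.
threshold = 1000 * combined

# Made-up input score, used only to exercise the function.
example_score = 1200.0
print(round(combined, 3), round(threshold), round(x87_only_estimate(example_score)))
```

Under these assumptions, any ICC 7.0 result above roughly 1092 would still leave an x87-only estimate above 1000.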

Compare the results of a gcc vs icc bakeoff on SPEC (massive differences) with the results of a gcc vs icc bakeoff on application-based server benchmarks (where they come out pretty much in a dead heat, only a slight margin of victory for icc), and you have yet another bit of evidence that Intel's compiler is geared towards the SPEC benchmark, not towards its users.

My god, have you EVER seen how a Pentium 4 performs on ICC and GCC? I suppose not, you pulled this little statement out of thin air just like the rest of your "FUD". Here, why don't I just end this little demonstration by showing you some benchies, and then you can **** okay? No really, I'm sick of all this crap that's being posted today.

http://www.willus.com/ccomp_benchmark.shtml?p1

http://www.coyotegulch.com/reviews/almabench.html

http://www.polyhedron.com/
(you're going to have to browse around with this one.)

http://www.coyotegulch.com/reviews/intel_comp/intel_gcc_bench2.html

http://www.intel.com/software/produ...er_gnu_perf.pdf

http://www.aceshardware.com/read_news.jsp?id=75000387 (again, NO GCC DOESN'T SCHEDULE FOR THE P4)

GregA · Aug 1, 2003

Renderfarm cross platform?

How does a renderfarm work?

I would have thought a renderfarm has an application running on each computer in the farm, with a controlling application that distributes the work for each frame to an available system.

If so, couldn't the renderfarm have their App running on Linux/Intel, Sun, Mac, Linux/970 all simultaneously?

Does anyone know? Does the render farm have to be the same systems or just the same application running on different systems?

Greg
ps. If they had multiple systems, you'd get some good info on the comparative performance of all those systems - which ones render more frames per hour etc, versus hardware cost, versus support cost.
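The model described in this post, a controller handing each frame to whichever machine is free, can be sketched in a few lines. This is only an illustration under stated assumptions: the `dispatch` function, the worker names, and the per-frame render times are all made up, and it says nothing about how Pixar's actual farm software works.

```python
import heapq
from collections import Counter


def dispatch(frames, workers):
    """Assign each frame to whichever worker becomes free first.

    `workers` maps a worker name to its (hypothetical) seconds per frame.
    Returns (assignments, frames_per_worker).
    """
    # Heap of (time the worker is next free, worker name).
    free_at = [(0.0, name) for name in sorted(workers)]
    heapq.heapify(free_at)
    assignments = []
    for frame in frames:
        t, name = heapq.heappop(free_at)
        assignments.append((frame, name))
        # The worker is busy until it finishes this frame.
        heapq.heappush(free_at, (t + workers[name], name))
    return assignments, Counter(name for _, name in assignments)


# Made-up per-frame render times; not measurements of any real system.
farm = {"Linux/Intel": 2.0, "Linux/970": 2.0, "Mac": 3.0, "Sun": 6.0}
assignments, per_worker = dispatch(range(12), farm)

for name, count in sorted(per_worker.items()):
    print(name, count)
```

Because the controller only cares which machine is free, a mixed farm works as long as every platform can run the render client, and the per-worker frame counts give the kind of comparative throughput figure mentioned in the postscript.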

nagromme · Aug 2, 2003

Yes--a renderfarm can easily be a mix of several OS's. (As long as the render client software exists for all the computers.)

I've rendered to a mixed batch of Macs with a few PCs mixed in before.

sanjef · Aug 2, 2003

Not likely

If current news about developers' trends is any indicator, we'll most likely find them dumping OS X in favor of something else (Linux?).

Fitzcaraldo · Aug 2, 2003

Renderman uses an app called "Alfred"™ to detect free resources and distribute render tasks.

Alfred talks to "Nimby"™ which is used to limit and/or deny resource use on specific workstations.

The biggest obstacle I see to Pixar and the Mac is the lack of Maya Unlimited...

The Pixar render farm is a Mix Match of various bits on Industrial shelving...

Go to:

http://www.pixar.com/howwedoit/index.html

Pull the lever and you will see the farm

You can also see the Renderman logo in B/W which I have seen 'Somewhere' it was the B&W in red on a yellow organic shape.

websterphreaky · Aug 2, 2003

What BS! Cost too high for a business!

This is the most ridiculous BS Rumor I've seen recently! Like a business as cut throat as the movie industry would allow a company to dump hundreds of thousands, maybe millions of dollars worth of expensive Intel hardware and server software, only to spend 3 times as much (because Apple hardware is 3 times overpriced) on new Apple servers and server software!

I do professional video and multimedia editing, and there is no way our company would make such an irresponsible financial move as this after one year in this economy!

Perhaps it's that the CEO got a new Apple 17" Ironing board PowerBook and he needs to migrate from his chipped up, scratched and flaking paint G4 Titanium. Now that's reality!

tychay · Aug 2, 2003

Getting really off topic...

I know you weren't replying to me, but I am reading your comment at the end to imply that I am responsible for "some of the crap that is being posted today".

Originally posted by Cubeboy
Did I ever say "special tweaks"? NO, you just assumed I did and made a circular argument out of it right?

Actually, a different poster referred to "special tweaks". If you look at the previous postings you will note the exact term he used was that GCC has "a ton of special tweaks written by chip vendor IBM." (a pretty damning quote!) I disputed that, you jumped on me, a benchmark war ensued, the guy you're flaming defended my statements and got confused as to who said what.

This mistake is understandable as you took what I said out of context when flaming me.

That the MD file...

My position on the machine description file has already been set. The original arguments I've seen took IBM to task on an incorrect machine description file because it was believed that changes in the file were deliberate FUD on behalf of Apple; it had nothing to do with IBM's right, as a chip vendor, to modify an MD file to improve performance. This was later misinterpreted during the "oft-cited, much-maligned" Veritest benchmark war. It turned out that the file needed to be tweaked because gcc's CPU model is not a good model for the 970. IBM's explanation for the late change sounds very reasonable--their explanation was roughly equivalent to "we were tweaking it because an MD file isn't as easy as looking at a spec sheet as these people would have you believe."

(You and I both know that there is a world of difference between tweaking a machine description file and tweaking the GCC compiler's CPU model itself. We both know that the P4's MD file has far more commits than the 970's.)

scheduling stuff deleted

I ignored this stuff because it is true and I've never disputed this. We don't know why an optimal scheduler isn't available for the P4 in GCC. There is no need to shout about this and drive an honest discussion into the uncivil.

Even if such a thing is submitted we don't know what the improvement will be. I'm inclined to believe that there are a number of factors in addition to the lack of documentation coming out of Intel. Remember, the P4 of today is a completely different beast inside than the earlier models (hyperthreading, etc): these might not map well to the GCC CPU model either.

The difference is that IBM is committed to convincing GCC to improve the model and Intel isn't. This may mean that in the future GCC may become biased toward PPC, but that doesn't refute the present reality that GCC is a better compiler for x86 than PPC. (I'm on record, many times, as believing that using GCC to "normalize out" the compiler for SPEC is misguided: SPEC is a system benchmark that tests CPU, memory, and compiler (and to a lesser extent: bus, OS and other components)). It always bothered me that the average computer user has even heard of it when making a purchase.

Okay, this is just flat out wrong. Almost all the time, gcc is tuned by people from the CPU vendor for a particular CPU. Intel does NOT do this; they would rather spend their time tuning ICC, and for good reason.

Actually, his statement is correct. The amount of time spent tuning that Apple/IBM have done for the 970 does "pale" in comparison to the amount of tuning done for the x86. As evidence, note how good the performance is of gcc3.3 vs. ICC for the P3. In the PowerPC/POWER world, gcc has not achieved close to parity with CodeWarrior or IntelliAge and there are far fewer developers working on and with gcc for the PowerPC. Again, this will change because of IBM's stated commitment to open-source and Apple's obvious dependence on gcc as the only Objective-C compiler around (weighed against the GCC team's biases against accepting any changes which affect portability).

I'd even bet that more time was spent tuning for the P4 specifically than the 970 in gcc. The P4 has been out for a while and is probably the second most used CPU (after the P3) for Linux. The problem here is a noticeable lack of documentation on how to go about doing such tuning. The trick of having hyperthreading double the apparent number of CPUs available to the kernel alone must have taken a good bit of time. It was a hack that has since been fixed, but that doesn't mean it didn't take a lot of time.

Looking over the documentation, on ICC/IFC 5.0, SSE2 only improved performance 5% over x87 only code in SPECfp.

Whoa, you (and AMDZone) are misreading your own cites. First, the 5% performance gain is specifically due to instructions added when they jumped between MMX/MMX2 (SIMD in Pentium II and III) and SSE/SSE2 (SIMD in Pentium IV). Second, the fact that there is a gain at all points to autovectorization being done. (Now I agree with you that I've been guilty of referring to autovectorization when I generically mean autovectorization, autoparallelization, and other CPU modelling optimizations.) Third, this performance gain will increase in later versions of ICC as the Intel folks figure out more places the new instructions create benefits.

My god, have you EVER seen how a Pentium 4 performs on ICC and GCC? I suppose not, you pulled this little statement out of thin air just like the rest of your "FUD".

Those are more examples that reinforce the Pixar statement (50% speed gain on P4 with ICC vs. GCC). The optimizer in ICC is really good (by those statements) and really mature (by the fact that Intel's compiler shows better results than GCC with even AMD's chips). I should note for the others not willing to sift through all your cites that there are a couple tests where GCC benchmarked in rough parity or better than ICC.

I never claimed that the ICC optimizations only benefit SPEC (others may have). My guess is the biggest gains are not AV or AP at all but are the use of a lookup table for trigonometric functions in ICC. LibMoto (A math library Motorola made for the PowerPC) used to do the same thing and would pump up the FP marks in old Mac benchmarks by 80%, but because of incompleteness of the tables, it would affect the stability of some video games which depended on the accuracy of the numbers. I find it doubtful that the GCC team would accept such changes even if they were offered.
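The speed-for-accuracy trade described above can be illustrated with a very coarse sine table. This is a minimal sketch under loud assumptions: the table size, the nearest-entry lookup, and the `table_sin` name are invented for illustration and do not describe how ICC or LibMoto actually implement their math routines.

```python
import math

TABLE_SIZE = 64  # hypothetical size, deliberately coarse
STEP = 2 * math.pi / TABLE_SIZE

# Precompute sin() once at evenly spaced angles over one full period.
SIN_TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE)]


def table_sin(x):
    """Approximate sin(x) by returning the nearest precomputed table entry."""
    index = round(x / STEP) % TABLE_SIZE
    return SIN_TABLE[index]


# Worst observed error over one period, sampled every 0.001 radians.
worst = max(
    abs(table_sin(k * 0.001) - math.sin(k * 0.001)) for k in range(6284)
)
print(worst, STEP / 2)
```

A lookup replaces a computation with an index operation, but with nearest-entry lookup the error can approach half the table spacing, which is the kind of inaccuracy that numerically sensitive code can trip over.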

Some of your cites actually reinforce a significant speed gain with ICC vs. GCC regarding P4 SPEC2000. That does reinforce my statement that 1) You cannot "normalize out" the compiler in SPEC benchmarking as Apple claims and 2) it is misleading to report Apple's SPEC numbers side-by-side with standard SPEC2000 benchmarks.

tychay · Aug 2, 2003

Re: What BS! Cost too high for a business!

Originally posted by websterphreaky
This is the most ridiculous BS Rumor I've seen recently!

Too bad for you, it's true.

Like a business as cut throat as the movie industry would allow a company to dump hundreds of thousands, maybe millions of dollars worth

The rumor never said that. You, like many others, don't understand the difference between desktop workstations and a renderfarm.

...of expensive Intel hardware and server software...

The new server (1024 Xeon CPUs in their renderfarm) hardware is leased. The server software is currently Linux and free so I hardly think of it as costing "hundreds of thousands" of dollars to migrate, which they're not.

Their desktop hardware is either Linux or NT. If the latter, it is definitely "not free". Also their workflow scripts were originally written for IRIX and porting to NT must have caused a big headache if that were the case. Porting to Mac OS X is relatively painless and the painful parts are the subject of their talk at Mac OS X Developer Conference in two months.

...only to spend 3 times as much (because Apple hardware is 3 times overpriced) on new Apple servers and server software!

Doh! Obviously this is flame by a Wintel idiot. I've been punked!

The rumor never claimed that they will be buying any Apple servers or Mac OS X Server licenses (their server software).

I do professional video and multimedia editing, and there is no way our company would make such an irresponsible financial move as this after one year in this economy!

Read as: I'm losing money now that all those iMovie/iDVD 'doits are undercutting what I overcharge for wedding videos I put together with Premiere on my PeeCee.

Perhaps it's that the CEO got a new Apple 17" Ironing board PowerBook and he needs to migrate from his chipped up, scratched and flaking paint G4 Titanium. Now that's reality!

The same person who used an IBM ThinkPad his first two years as CEO of Apple? The same person who is seen so rarely at Pixar's offices just across the Bay that the Pixar folk joke about it in their DVD extras?

(And yes, I happen to own a scratched and flaking paint G4 Titanium Powerbook like the one you mention. If, during the life of it, the only complaint I have with this thing is cosmetic, I don't know whether to be happy or pissed off that I bothered to buy AppleCare.)

Cubeboy · Aug 2, 2003

Re: Getting really off topic...

Originally posted by tychay
I know you weren't replying to me, but I am reading your comment at the end to imply that I am responsible for "some of the crap that is being posted today".

My original thought was that the final statement would be the perfect clincher to an otherwise mediocre post. Please excuse my ignorance of the otherwise readily obvious.

Actually, a different poster referred to "special tweaks". If you look at the previous postings you will note the exact term he used was that GCC has "a ton of special tweaks written by chip vendor IBM." (a pretty damning quote!) I disputed that, you jumped on me, a benchmark war ensued, the guy you're flaming defended my statements and got confused as to who said what.

This mistake is understandable as you took what I said out of context when flaming me.

Personally, I consider my original reply to be quite mild in light of my more recent posts. Most of it was just stating my views on the matter, and the last two parts were more related to one of our previous disputes than to what you might call "flaming". I agree with you that the mistake is quite understandable; I just don't believe that (falsely) accusing someone in a particularly condescending tone should be held in the same light as an "understandable" mistake. Keeping that in mind, would you consider my response to be too strong?

My position on the machine description file has already been set. The original arguments I've seen took IBM to task on an incorrect machine description file because it was believed that changes in the file were deliberate FUD on behalf of Apple; it had nothing to do with IBM's right, as a chip vendor, to modify an MD file to improve performance. This was later misinterpreted during the "oft-cited, much-maligned" Veritest benchmark war. It turned out that the file needed to be tweaked because gcc's CPU model is not a good model for the 970. IBM's explanation for the late change sounds very reasonable--their explanation was roughly equivalent to "we were tweaking it because an MD file isn't as easy as looking at a spec sheet as these people would have you believe."

(You and I both know that there is a world of difference between tweaking a machine description file and tweaking the GCC compiler's CPU model itself. We both know that the P4's MD file has far more commits than the 970's.)

I think you misunderstood what I have said in my previous two posts. The entire point of me including the MD file in the debate at all was to prove that there were indeed modifications in GCC that were there during and after WWDC that weren't there before. The intent was to prove a particular statement false, which is why I never discussed the MD file in detail.

I ignored this stuff because it is true and I've never disputed this. We don't know why an optimal scheduler isn't available for the P4 in GCC. There is no need to shout about this and drive an honest discussion into the uncivil.

Even if such a thing is submitted we don't know what the improvement will be. I'm inclined to believe that there are a number of factors in addition to the lack of documentation coming out of Intel. Remember, the P4 of today is a completely different beast inside than the earlier models (hyperthreading, etc): these might not map well to the GCC CPU model either.

Well, I never shouted about the scheduler in the first place so I don't really know exactly what you're getting at:

"From what I understand, IBM and Apple spent a considerable amount of time optimizing the scheduler for the PPC970."

I did type in CAPITALIZED LETTERS in my response to Mr Redmont's flames and false accusations, if that's what you mean; I assumed it was acceptable for the occasion.

Again, the reason I believe an "optimal scheduler" isn't available for the Pentium 4 goes back to my original point, which I will be discussing shortly. Having well scheduled code is really quite important performance-wise; this is especially true for floating point code which, ironically (or coincidentally), is where the Pentium 4 took the biggest performance hit compared with ICC.

The difference is that IBM is committed to convincing GCC to improve the model and Intel isn't. This may mean that in the future GCC may become biased toward PPC, but that doesn't refute the present reality that GCC is a better compiler for x86 than PPC. (I'm on record, many times, as believing that using GCC to "normalize out" the compiler for SPEC is misguided: SPEC is a system benchmark that tests CPU, memory, and compiler (and to little extent: bus, OS and other components)). It always bothered me that the average computer user has even heard of it when making a purchase.

Once again, this goes back to my original point: Intel is not committed to "improving the model" for GCC and hasn't been since it released ICC 5.0, which came out in the same time frame as the Pentium 4. I will be discussing this in more detail in the paragraph below. I don't believe most average computer users have heard of SPEC, at least not until WWDC where it was put on display.

Note that I segmented my response to your post into two parts as a single large post would exceed the word count.

Cubeboy · Aug 3, 2003

Actually, his statement is correct. The amount of time Apple/IBM have spent tuning for the 970 does "pale" in comparison to the amount of tuning done for the x86. As evidence, note how good the performance is of gcc3.3 vs. ICC for the P3. In the PowerPC/POWER world, gcc has not achieved close to parity with CodeWarrior or IntelliAge and there are a lot fewer developers working on and with gcc for the PowerPC. Again, this will change because of IBM's stated commitment to open-source and Apple's obvious dependence on gcc as the only Objective-C compiler around (weighed against the GCC team's biases against accepting any changes which affect portability).

I'd even bet that more time was spent tuning for the P4 specifically than the 970 in gcc. The P4 has been out for a while and is probably the second most used CPU (after the P3) for Linux. The problem here is a noticeable lack of documentation on how to go about doing such tuning. The trick of having hyperthreading double the apparent number of CPUs available to the kernel alone must have taken a good bit of time. It was a hack that has since been fixed, but that doesn't mean it didn't take a lot of time.

Yes, more time was spent tuning for x86 (note that GCC can schedule for almost every x86 processor) but that doesn't mean more time was spent tuning for the Pentium 4. As I've stated in my previous post, almost all GCC optimizations are submitted by people from the CPU vendor for a particular CPU. You've agreed that Intel isn't very committed to submitting improvements to GCC. I pointed out they haven't been committed since releasing the Pentium 4 and ICC 5.0 and this neglect resulted in GCC being very poorly optimized for the Pentium 4. This was proven by my reference to the ChangeLog, which shows lots of tuning for other x86 processors like the Opteron and Athlon (which might explain why an Opteron running GCC compiled code scores within 5% of the same chip running ICC compiled code in SPEC) but almost no tuning for the Pentium 4.

Regarding hyperthreading, it's not just modified kernels that "see" two logical processors on a Pentium 4; most software will "see" the same thing. SMT or hyperthreading (parallel set of registers and logic) basically "creates" two "virtual" cores that run independently on a single chip. Now, these "virtual" cores do share a lot of the same CPU components (cache, ALU(s), FPU, SIMD unit, code-decode unit, hence the serious resource contention problems), but again, they are two separate, independent cores.
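
As a rough illustration of the point about software "seeing" logical processors, here is a minimal Python sketch. It relies only on the standard library's `os.cpu_count()`, which reports the logical CPUs the operating system presents; the helper name `logical_cpu_count` is made up for this example, and the sketch cannot tell whether SMT is actually enabled on a given machine.

```python
import os


def logical_cpu_count() -> int:
    """Return the number of logical CPUs the OS reports (at least 1)."""
    # os.cpu_count() may return None when the count cannot be determined,
    # so fall back to 1 in that case.
    count = os.cpu_count()
    return count if count is not None else 1


if __name__ == "__main__":
    print(f"Logical CPUs reported by the OS: {logical_cpu_count()}")
```

On a single hyperthreaded chip the OS would typically report two logical CPUs here, which is the number ordinary software ends up scheduling against.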

Whoa, you (and AMDZone) are misreading your own cites. First, the 5% performance gain is specifically due to instructions added when they jumped between MMX/MMX2 (SIMD in Pentium II and III) and SSE/SSE2 (SIMD in Pentium IV). Second, the fact that there is a gain at all points to autovectorization being done. (Now I agree with you that I've been guilty of referring to autovectorization when I generically mean autovectorization, autoparallelization, and other CPU modelling optimizations.) Third, this performance gain will increase in later versions of ICC as the Intel folks figure out more places the new instructions create benefits.

Here are the quotes:
"For SPECfp2000 the new SSE/SSE2 instructions offer about a 5% performance gain compared to an x87-only version."

"As the compiler improves over time the gain from these new instructions will increase."

I don't really see the point of this dispute. x87 code is regarded as standard when benching these CPUs (Athlons, Pentium 4s and Opterons all have x87 FPUs); the main concern was that the Pentium 4 could run packed SSE2 code produced by ICC much faster than x87-only code (and this is one of the more widely believed reasons as to why the P4 running GCC compiled code fared so poorly in SPECfp), which I've just proven to be false.

I never doubted that ICC had auto-vectorization (I made this clear when I asked you how much autovectorization boosted performance in my first response). I was questioning whether it could account for a significant boost in SPECfp scores.

I've already shown in previous threads and my previous post that the latest version of ICC (7.0) only improved overall SPECfp score 4% over ICC 5.0, and this is from official SPEC submissions. Either way, the gain is not that significant.

Those are more examples that reinforce the Pixar statement (50% speed gain on P4 with ICC vs. GCC). The optimizer in ICC is really good (by those statements) and really mature (by the fact that Intel's compiler shows better results than GCC with even AMD's chips). I should note for the others not willing to sift through all your cites that there are a couple tests where GCC benchmarked in rough parity or better than ICC.

I never claimed that the ICC optimizations only benefit SPEC (others may have). My guess is the biggest gains are not AV or AP at all but are the use of a lookup table for trigonometric functions in ICC. LibMoto (A math library Motorola made for the PowerPC) used to do the same thing and would pump up the FP marks in old Mac benchmarks by 80%, but because of incompleteness of the tables, it would affect the stability of some video games which depended on the accuracy of the numbers. I find it doubtful that the GCC team would accept such changes even if they were offered.

Some of your cites actually reinforce a significant speed gain with ICC vs. GCC regarding P4 SPEC2000. That does reinforce my statement that 1) You cannot "normalize out" the compiler in SPEC benchmarking as Apple claims and 2) it is misleading to report Apple's SPEC numbers side-by-side with standard SPEC2000 benchmarks.

True, there were a few statements that support evidence of a trigonometric functions lookup table and auto-vectorization (which I've already acknowledged from the start) but the general feeling and conclusions of all the cites I've listed was that ICC produced significantly faster code for the Pentium 4 than GCC, which was the main reason I included them.

Regarding ICC's lookup table, would an Opteron/Athlon receive the same benefits by using ICC? This is considering that an Opteron running ICC compiled code offers only a marginal increase in SPEC score over the same CPU running GCC compiled code.

To what degree would it change the machine output? I know that SPEC is very strict about making sure that the output for their benchmark suite remains unchanged.
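
To make the accuracy question concrete, here is a minimal Python sketch of a table-driven sine. It is only an illustration of the general technique: the table size, the nearest-entry lookup, and the names `table_sin` and `max_abs_error` are all assumptions for this example, and nothing here reflects how ICC or LibMoto actually implement their math routines.

```python
import math

TABLE_SIZE = 256  # hypothetical size; a real library would pick its own
STEP = 2 * math.pi / TABLE_SIZE
SIN_TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE)]


def table_sin(x: float) -> float:
    """Approximate sin(x) by nearest-entry lookup (no interpolation)."""
    # Reduce x to one period, then pick the closest precomputed entry.
    index = round((x % (2 * math.pi)) / STEP) % TABLE_SIZE
    return SIN_TABLE[index]


def max_abs_error(samples: int = 10_000) -> float:
    """Largest gap between table_sin and math.sin over one period."""
    worst = 0.0
    for k in range(samples):
        x = 2 * math.pi * k / samples
        worst = max(worst, abs(table_sin(x) - math.sin(x)))
    return worst


if __name__ == "__main__":
    print(f"max |table_sin - sin| = {max_abs_error():.6f}")
    print(f"half a table step     = {STEP / 2:.6f}")
```

With a nearest-entry table like this, the error is bounded by half the table step, so a coarse table trades accuracy for speed. Whether a given benchmark's output check would tolerate that difference depends on the tolerance it allows, which this sketch does not model.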

sparkplug · Aug 5, 2003

workstations? nah

If they moved to OSX it would only be for workstations.

This is highly unlikely. What front end hardware are they going to run? The only 3D gfx cards available are gaming cards, i.e. GeForces and Radeons. That is all that is available for the Mac; there are no pro 3D gfx cards for the Mac, period. In particular no Quadro FX, no Quadro at all for that matter, so no Cg shaders. They would make decent render engines however, but no pro animator is going to want to cripple their workflow by using a "workstation" that can only use a game card.

Maya 5 with real time rendering/Cg shaders on a Quadro FX is a thing of joy. You can't get that on the G5, for any amount of money. Basically there are three things needed for a professional 3D workstation:

1/ a stable multitasking operating system
2/ fast cpu with plenty of memory, the more the merrier
3/ fast high quality/accurate graphics subsystem

The G5 has the first two.......

patrick0brien · Aug 6, 2003

Re: workstations? nah

Originally posted by sparkplug
The G5 has the first two.......

-sparkplug

That's a very good point.

Write to this link: http://www.apple.com/macosx/feedback/

Perhaps we can get Apple to write us the drivers so we could use such cards.

Mac Kiwi · Aug 6, 2003

I am almost convinced that Pixar already has these drivers, to tell you the truth. When the rest of us can get them would be nice to know. It would also be nice to know who writes them, Apple or Nvidia etc. An Nvidia driver writer tells me Apple do, but I have a report that an ATI guy told someone they do, so who knows. I don't really care who writes them, I want a Quadro.

I was also told today by a friend who works for a game company that Pixar have ordered hundreds of G5s. I can't back this up with any facts other than a friend in a game company, but he seemed fairly sure. I am a little skeptical myself but I suppose we will see soon enough. If they have ordered hundreds it might explain some of the delay for the rest of us as well.

Stu.

capitalhood · Aug 6, 2003

wait...

are you all saying that pixar doesn't use macs?! but how can that be... pixar has steve so why don't they use macs... plus they're always doing **** with apple... the jaguar logo... the g5 video... all kinds o' stuff... woow.

patrick0brien · Aug 6, 2003

Re: wait...

Originally posted by capitalhood
are you all saying that pixar doesn't use macs?! but how can that be... pixar has steve so why don't they use macs... plus they're always doing **** with apple... the jaguar logo... the g5 video... all kinds o' stuff... woow.

-capitalhood

At the risk of sounding blunt: Have you read the entire thread? All of the information to answer your questions is here.

agreenster · Aug 6, 2003

Originally posted by Fitzcaraldo
The biggest obstacle I see to Pixar and the Mac is the lack of Maya Unlimited...

No, not really. Pixar only uses Maya for modeling, not animation. They have proprietary software called "Marionette" (spelling?) for animation. Anything that Unlimited would do (fur, liquid) is all done in house via RenderMan and their effects department.

Originally posted by sparkplug
Maya 5 with real time rendering/Cg shaders on a Quadro fx is a thing of joy

Are you speaking from experience? If so, I am totally jealous.

Actually, check out this website--it hints that nVidia is working on developing their vidCard and all its corresponding coding for OSX...

sparkplug · Aug 7, 2003

take me to your quadro

Thanks for the link, yes that's a good overview. Yes I am speaking from experience, I've been using Cg shaders since the beta and am completely in love with their integration in Maya 5 and with the FX. This is "must have" technology for me, and indeed I would think for many many people once they have used it and seen what can be done. It wasn't very long ago that this level of graphics could only be purchased as a small fridge sized unit minimum. I have some of that stuff knocking around collecting dust.

I wrote the author of that article, hopefully he can tell me something concrete because there is no word from any of my other sources on the existence of such support. I'd be very happy to learn that there was.

MacRonin · Aug 15, 2003

Bump?!?

shiftysands · Dec 2, 2004

Could the unnamed client be Pixar?

Does anyone know what a "hybrid grid" means?

Hi Miguel,
We are currently deciding between a high end NVidia card, a established hardware raytracing solution and a hybrid grid rendering solution. The likelihood is that I will ship with all of these since they each have their good points.

Fireball 999 is developing an imaging solution for a private client - while retaining design rights to the system and software for sale to other imaging companies (This resale facility becomes available in mid 2006 to give the main client time to capitalise on their investment in the design and software).

The problem dynamic is one of integrating studio technology under a unified grid. This grid will enable a company of 800 employees to work seamlessly on the production of a 90min animated and live action hybrid film. The client wants to keep a common format and workflow between all stages of the production environment and is leveraging the best in consumer level hardware to achieve professional calibre work (this maintains a symbiotic relationship between the mass production and low cost of the consumer market and the high demand top fidelity requirements of my client).

The initial imaging solution will ship for around $12million as a suite of several computers each with a CRT monitor and TV. These computers link together to form the skeleton of the rendering grid in each studio and link directly to an offsite render farm (Provided by Fireball 999 and paid for by the client on a per TeraFragment rate). I would be looking to include a couple of your highend nvidia cards in each computer along with the hardware raytracing card and the grid rendering software.

Other features of the package are optional and include motion capture, laser scanning booths and camera rigs, but each of these options dramatically increases the cost of the imaging solution.

We are hoping that falling technology prices will bring this imaging solution within reach of high-end production TV studios' budgets by 2006 at a price tag of under $3million per studio, and within the decade we would like to see this package become the norm for game studios at the merry price of $0.2million.

Ideally one card would output straight to DVD resolution approx 800x1200, 16xAA with bump mapping, spherical harmonics and specular highlights (implemented as HDRI map) (24fps) and the other would output to TV resolution with interlaced fields 50fps.

Please recommend which cards may be appropriate (to ship in May 2005).

The current business plan has a target of 50 sales over the next 5 years, but each sale would be resourced and built to the customer's exact requirements - at this stage I am only putting together the prototype systems to test the software on (The software has finished architecture and initial infrastructure but we have not licensed the rendering or modelling software yet (probably Mental Ray and Maya)).

I want a quote from you per system for the dual output system (DVD and TV resolutions). If this is not something NVidia has an interest in please recommend a specialist engineering firm that may be able to help with an original hardware solution.

many thanks,

Tabitha Ben
Biz Dev
Fireball 999
