Macs and benchmarking; or, challenging the "MHz myth"

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
There has been discussion lately in several different threads about the validity of benchmarks (Photoshop, SPEC CPU, etc.) for testing system performance. SETI@home has come up lately, and seems to be a fair standard for each system (please correct me if I'm wrong, and the Mac "cheats" by using AltiVec ;) )

So, what do you use to stress test/benchmark your systems?

PC users, what kind of standard can you accept, so that we can get past all the "I don't use Photoshop, and I don't encode video, so those results don't mean anything to me" BS?

Let's try to find a definitive application and settle this once and for all.
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Standard seti@home style benchmarking app

I've personally been thinking of designing a program that would do something similar to SETI@home: perform thousands of CPU-intensive calculations and send the results to a server with a huge database. The difference is that it would use the same set of calculations every time, making it more accurate than SETI@home, which uses sets of data of varying sizes.

As I said in the "Itanium 2 + P4" thread, my 867 did a unit in 16 minutes. It's obvious it didn't complete a full unit; the unit it got simply had far too much interference, so it aborted. But it still went on the record as a completed unit. So SETI@home has its downsides. If a Windows machine got a unit with the same amount of interference as that one, it could finish in 16 minutes too.

So I suggest a program that receives instructions from a server, completes them, and then reports back a score. It would also send the results to the server for a database of CPU speeds, so we could show the windoze nerds that we really do kick a** :D Too bad I'm still stuck on that d*** "hello world" chapter in Learning Cocoa ;)
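The fixed-workload idea described above can be sketched in a few lines. This is a hypothetical illustration, not bob's actual program, assuming the server simply hands every client the same deterministic task:

```python
import time

def fixed_workload(n=200_000):
    # Deterministic CPU-bound task: every client crunches exactly the
    # same numbers, unlike SETI@home units, whose sizes vary.
    acc = 0
    for i in range(1, n):
        acc = (acc * 31 + i * i) % 1_000_003
    return acc

def benchmark(runs=3):
    # Take the best wall-clock time over several runs as the score;
    # the checksum proves the work was actually completed.
    best = float("inf")
    checksum = None
    for _ in range(runs):
        start = time.perf_counter()
        checksum = fixed_workload()
        best = min(best, time.perf_counter() - start)
    return best, checksum
```

Because the checksum comes out identical on every machine, a server collecting (time, checksum) pairs could reject aborted or partial runs outright, which is exactly what SETI@home's unit averages can't do.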
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
bob-
Good luck with that app!

I think that by using SETI's database we mitigate the short time/unit values caused by aborted units.
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Thanks. Maybe I can recruit some of the smarter and more experienced guys on here to help me out; it would most likely speed up the process. :p The biggest problem with the idea is having to write it for lots and lots of different processors and platforms, especially when I can't even write for my native platform yet. I'm better as a thinker, lol.
 

mcrain

macrumors 68000
Feb 8, 2002
1,772
11
Illinois
As a fairly intelligent computer shopper, I can say one thing for certain: SETI makes no sense. A 386 is faster than a Mac? What do all the numbers mean? What a crock of poo. That site means nothing.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Originally posted by mcrain
As a fairly intelligent computer shopper, I can say one thing for certain: SETI makes no sense. A 386 is faster than a Mac? What do all the numbers mean? What a crock of poo. That site means nothing.
Actually, if you look more closely, that's the platform page, not the CPU page, which is here:
http://setiathome.berkeley.edu/stats/cpus.html

The 386 entries refer to different Linux distributions.
 

Beej

macrumors 68020
Jan 6, 2002
2,139
0
Buffy's bedroom
Whoah! I gotta get me one of those "alpha-compaq-T64Uv4.0d/EV67" things! They come in at 47, with an average unit time of 1 hour and 0.4 sec.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Originally posted by Beej
Hmmm, how does SETI know what kind of proc you have? Cause we could create a Mac Rumors proc and jump up to 22nd pretty quickly!
I'm pretty sure it collects the information from your system when you first log in.

As to why there are some very random AMD CPUs there...I don't know. Crazy overclockers must tweak their info somehow.:rolleyes:
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Changing proc info is easy on windoze

It's very, very simple on windoze to change the name of the processor the system reports. In Windows 95, at least, all you have to do is download a copy of WinHack 95. I'm sure there are versions for the newer versions of windoze. That's why you see names like "Hacintosh" (at least spell it "hack" :rolleyes: )
 

Catfish_Man

macrumors 68030
Sep 13, 2001
2,579
1
Portland, OR
I think...

...that someone should create a "General Performance Indicator". Here's what it would do:

Test the speed of each common type of task (2D graphics, 3D graphics, copying files, etc.) relative to a base machine. So if the base machine was a G4 400, and a G4 800 went twice as fast in Photoshop (or whatever other 2D program/programs we decided to use), it would get a score of 2. Then, when you've run all the tests, you average those for a general score. This would give you a general idea of how fast the computer was relative to others. It wouldn't tell you how fast the CPU was, or the RAM, or the HD, or the OS; it would just tell you how fast the whole system was. This wouldn't have the flaws of SPEC, or FLOPS, or single-app tests.
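One subtlety hides in that averaging step: for ratios like these, the geometric mean (which SPEC itself uses) is fairer than the arithmetic mean, so a single outlier task can't dominate the overall score. A sketch with made-up task times; the machine names and numbers are purely illustrative:

```python
from math import prod

def gpi(base_times, test_times):
    # Per-task score = base_time / test_time, so 2.0 means
    # "twice as fast as the base machine" on that task.
    ratios = [b / t for b, t in zip(base_times, test_times)]
    # Geometric mean of the ratios gives the overall indicator.
    return prod(ratios) ** (1 / len(ratios))

# Hypothetical seconds per task (2D graphics, 3D graphics, file copy):
g4_400 = [120.0, 300.0, 60.0]  # the base machine scores 1.0 by definition
g4_800 = [60.0, 150.0, 30.0]   # twice as fast on every task -> GPI of 2.0
```

A machine twice as fast on every task scores 2.0 whichever mean you pick; the means only disagree when the ratios differ between tasks, which is precisely when a "general" score is hardest to get right.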

btw, why is AltiVec cheating? It's how the G4 is so fast; it should be able to use it.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Windows has the SiSoft Sandra benchmark, which tests every aspect of the machine's performance. We need something similar for the Mac, or at least something that is indisputably "fair" by everyone's standards.
 

Choppaface

macrumors 65816
Jan 22, 2002
1,187
0
SFBA
Originally posted by Rower_CPU
Thanks to GeeYouEye we now have a pretty definitive table of system performance based on OS:
http://setiathome.berkeley.edu/stats/platforms.html

Windows = 21 hr 03 min 21.3 sec
Mac OS X = 18 hr 10 min 17.5 sec
Those numbers probably reflect the fact that the 5% to maybe even 10% of their users who use Macs use newer Macs, while most of the other PC users probably have pretty old machines. The same science type who gets to use an iBook at school might have a 200 MHz PII at home.

Also, somebody here said somewhere that their dual gig G4 can do 2 units every 5 to 6 hours, I think. My dual Athlon 1900+ XP does one unit in 4.5 hours on average using the GUI client (haven't gotten around to setting up 2 clients), and it's gotten as quick as 3.5 hours on one of the shorter ones. So the Athlon beats the G4 here for time per processor.

I also run 2 copies of the SETI command line client on my dual 500 G4, which gets 10-11 hours per work unit per client. So the 1900+ Athlon is 2.3 times faster than my dual 500 G4 with those times. Considering that it is almost 18 months older than my dual 500 G4, it shows that Athlons at the time probably had a bit of an edge on the G4 in terms of SETI, as it seems they do now.

As for comparing the 2 systems, that's very hard to do. Either you can have both machines executing the same kind of code, or you can have them running the same app but each using a copy that's optimized for its hardware. But then you could also try comparing different apps that do the same thing, like iTunes versus Winamp for MP3 ripping. So one way or another, your test is going to have some holes in it. The best test is probably having somebody get all their favorite apps together, putting them on both platforms, and then having them subjectively grade which box responds better to their needs. Maybe even give them a set of tasks to do and put the stopwatch to them while they do those tasks. My point is that it doesn't matter what the magazines write in their speed tests; it really comes down to what is right for the user.

IE is BLAZINGLY fast on my PC compared to my dual 500. It really runs circles around it... I'm getting pretty much instantaneous page loads even with SETI running in the background. So should I start doing all my internet stuff on my PC? No. Why? I'm the kind of person who likes to have tons of windows open, especially when I'm on boards like these. I open a different thread in a different window, and I find it really handy how you can do this quickly by command-clicking on the link. In IE6, I have to right-click, find 'open in new browser window', and then I usually have to either drag that window out of the way or click the button to stop it from being full screen. And then there's the scroll wheel. I kinda like it, but I find I have a *lot* more accuracy using the command-drag feature in Mac IE. So it turns out that it usually takes me just as long to do my net browsing on my PC as on my Mac. And even if I had 'grown up' on a PC, I would still be opening tons of windows, because that's just how I like to do things. So it really depends on the user and what they like. Don't expect to throw a Mac-using PS expert on a PC and have them work faster on the PC than they do on their Mac just because the PC is faster.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Originally posted by Choppaface
Those numbers probably reflect the fact that the 5% to maybe even 10% of their users who use Macs use newer Macs, while most of the other PC users probably have pretty old machines. The same science type who gets to use an iBook at school might have a 200 MHz PII at home.
Well, how about the numbers given for the Mac platform in general (probably meaning OS 9 and earlier): 17 hr 43 min 56.7 sec. Besides, I would argue that it's just the opposite: Mac owners keep their Macs longer, since PCs go obsolete much faster.

Also, somebody here said somewhere that their dual gig G4 can do 2 units every 5 to 6 hours, I think. My dual Athlon 1900+ XP does one unit in 4.5 hours on average using the GUI client (haven't gotten around to setting up 2 clients), and it's gotten as quick as 3.5 hours on one of the shorter ones. So the Athlon beats the G4 here for time per processor.
And my G4 400 beats my 1.33 GHz Athlon...you have to go by sheer numbers here, and that's what the SETI database gives you.

I also run 2 copies of the SETI command line client on my dual 500 G4, which gets 10-11 hours per work unit per client. So the 1900+ Athlon is 2.3 times faster than my dual 500 G4 with those times. Considering that it is almost 18 months older than my dual 500 G4, it shows that Athlons at the time probably had a bit of an edge on the G4 in terms of SETI, as it seems they do now.
The G4 in my Tibook came out last January, and the 1.33 T-bird in my PC came out last summer...what's your point? Actually, your math is backwards. The DP 500 came out in summer 2000 and the 1900+ came out earlier this year. The Athlon is NEWER, not older. Since the Athlons were about half that speed in 2000, the G4 500 would actually be the SAME speed as a comparable Athlon back then.
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Originally posted by Choppaface


IE is BLAZINGLY fast on my PC compared to my dual 500. It really runs circles around it... I'm getting pretty much instantaneous page loads even with SETI running in the background.
Big surprise there :rolleyes: If Apple made its own browser, built it into the system, and had enough lawyers to fend off the DoJ like Microsoft, we would most likely be getting instantaneous page loads after a few years of development as well. But seriously, guys, if you check the CPU page at SETI, the Athlons go a lot faster than the average Mac. *shrugs* Although the Mac number represents all Macs, not individual procs. That's another reason I want to try my idea out.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Originally posted by bobindashadows
But seriously, guys, if you check the CPU page at SETI, the Athlons go a lot faster than the average Mac. *shrugs* Although the Mac number represents all Macs, not individual procs. That's another reason I want to try my idea out.
I agree that it's tough to glean meaningful information from these stats. After all, the PowerPC designation applies to chips all the way back to the 601s used in the Performa and first PowerMac computers. The same way the Intelx86 designation applies to chips all the way back to the early 80s, but probably means 386 and later.

Since the Athlon designation is newer and an indicator of faster chips, its results are skewed.

I don't think this benchmark tells us everything we need to know, but at least it does give us a good overall view of what's going on out there.
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Originally posted by Rower_CPU


I agree that it's tough to glean meaningful information from these stats. After all, the PowerPC designation applies to chips all the way back to the 601s used in the Performa and first PowerMac computers. The same way the Intelx86 designation applies to chips all the way back to the early 80s, but probably means 386 and later.

Since the Athlon designation is newer and an indicator of faster chips, its results are skewed.

I don't think this benchmark tells us everything we need to know, but at least it does give us a good overall view of what's going on out there.
This is why we need a standard program that measures all the segments of processor ability. The program would probably measure speeds with separate 32-, 64-, and 128-bit sections for a more accurate look at the speeds of the processors. The biggest problem with SETI@home is that the results can be very skewed depending on unit sizes and aborted units.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Originally posted by bobindashadows
This is why we need a standard program that measures all the segments of processor ability. The program would probably measure speeds with separate 32-, 64-, and 128-bit sections for a more accurate look at the speeds of the processors. The biggest problem with SETI@home is that the results can be very skewed depending on unit sizes and aborted units.
Don't you think that the aborted/small units average out over time, since we're looking at millions of results here?
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
I don't mean x86

Originally posted by Rower_CPU


Don't you think that the aborted/small units average out over time, since we're looking at millions of results here?
I'm not referring to the x86 numbers, but to the more precise Athlon numbers (which are obviously the biggest threat to Apple's performance), in which there are far fewer units completed. Here's an example:

The AMD Athlon XP 1700+ has 160 units completed over 617 hours and 9 minutes, for an average of 3 hours and 51 minutes. Say 10 more units were aborted after 10 minutes each under that processor but still counted as completed: the new average would be 3 hours and 38.4 minutes. And that's only if 10 new aborted units were added. How many do you think were aborted already under the 1700+?
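The arithmetic in that example checks out, and it's easy to verify; this just replays the numbers from the post:

```python
# 160 completed units over 617 h 9 min
total_min = 617 * 60 + 9            # 37,029 minutes
avg_before = total_min / 160        # ~231.4 min, i.e. 3 h 51 min

# Suppose 10 units aborted after 10 minutes each still count as completed:
aborted, abort_min = 10, 10
avg_after = (total_min + aborted * abort_min) / (160 + aborted)
# ~218.4 min, i.e. 3 h 38.4 min -- the average drops by nearly 13 minutes
```

With only 160 units in the sample, a handful of near-instant aborts visibly drags the average down, which is the skew being described; the platform-wide averages with millions of units are far less sensitive to this.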
 

Gelfin

macrumors 68020
Sep 18, 2001
2,166
4
Denver, CO
Originally posted by bobindashadows
thanks, maybe i can recruit some of the smarter and more experienced guys on here to help me out - it would most likely speed the process :p The biggest problem with that idea is having to write it for lots and lots of different types of processors and platforms, especially when i can't write for my native platform. I'm better as a thinker lol
Speaking as one of the more experienced guys (I actually have benchmarking on my resume), coming up with a standard, platform-independent way to rigorously test system performance is practically impossible. You're talking about something that the true experts in the field spend their entire careers refining and debating.

If you're looking for a simple program that generates a number by which you can unequivocally claim, "my computer is faster than yours," then you get that from marketing people, not performance analysts. Performance analysts cringe when people start touting ad hoc wall clock measurements of applications as meaningful benchmarks. On the other hand, the more specific you make a test, the less it applies to real-world applications, and the easier it is to tailor a system that performs well on that specific test and perhaps nothing else.

Reliable performance analysis sounds like a pretty easy proposition, but it's actually unbelievably difficult to do well. And boring besides, which is why I don't do that stuff anymore.
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Originally posted by Gelfin


Speaking as one of the more experienced guys (I actually have benchmarking on my resume), coming up with a standard, platform-independent way to rigorously test system performance is practically impossible. You're talking about something that the true experts in the field spend their entire careers refining and debating.

If you're looking for a simple program that generates a number by which you can unequivocally claim, "my computer is faster than yours," then you get that from marketing people, not performance analysts. Performance analysts cringe when people start touting ad hoc wall clock measurements of applications as meaningful benchmarks. On the other hand, the more specific you make a test, the less it applies to real-world applications, and the easier it is to tailor a system that performs well on that specific test and perhaps nothing else.

Reliable performance analysis sounds like a pretty easy proposition, but it's actually unbelievably difficult to do well. And boring besides, which is why I don't do that stuff anymore.
I'm not talking about real-world performance; I'm talking about a brute measurement of how fast a computer can do a hideously long task. I don't see the huge difficulty in it, honestly. Whenever I get into an argument with a windoze geek about why Macs are faster, he'll say "well, I don't use that application, or that one, or that one, but mine has a faster bus, uses RDRAM, and runs at over twice the clock speed." We need a way to compare two systems, because right now we literally can't: nearly every existing benchmark either isn't platform-independent or doesn't test the system rigorously.

My example above, however, is the misconception of a mildly intelligent windoze geek. What about the average consumer? Right now, they follow nearly the most unreliable benchmark of all: MHz. Instead of trying to combat the MHz myth in Apple's keynotes, a new benchmark should be established (at least a little more reliable than MHz) that magazines could use to help the average consumer make a buying decision.

The program would also be able to list the kind of work a computer is best for. For a G4, say, it would show high numbers for graphics work but low numbers for word processing and everyday use; the average Pentium would show just the opposite, and this would help people make better buying decisions. I'm not saying it would be a perfect benchmark, which I know is impossible because there are so many components to a computer. This would just be a proc benchmark. A true benchmark would be able to gauge hard drive read/write speeds, 2D and 3D graphics, internet up/down, PCI/AGP throughput, and a hundred other things. Unfortunately that would be an undertaking of massive proportions and would still be impossible to perfect. But the benchmark we really need improved is the measure of the processor's ability.
 

bobindashadows

macrumors 6502
Mar 16, 2002
419
0
Just to mention another thought that came to mind that I forgot to say: the reason we need to divide up the system's performance into certain groups is that we can't measure the computer's performance in every single application. Instead we say "OK, this computer should do really well in graphics programs because its 128-bit performance is this and that," and this gets us closer to the real-world performance factor. Obviously this isn't the best indicator, but I am merely suggesting we get a better system than megahertz.
 

Rower_CPU

Moderator emeritus
Original poster
Oct 5, 2001
11,219
0
San Diego, CA
Originally posted by bobindashadows
Just to mention another thought that came to mind that I forgot to say: the reason we need to divide up the system's performance into certain groups is that we can't measure the computer's performance in every single application. Instead we say "OK, this computer should do really well in graphics programs because its 128-bit performance is this and that," and this gets us closer to the real-world performance factor. Obviously this isn't the best indicator, but I am merely suggesting we get a better system than megahertz.
AMD is starting to do this by giving their CPUs a power rating instead of listing the MHz value...but their power rating still goes off of Intel's MHz value.

Since Intel is the leader in CPU production (at the commercial level), they are the trend setters everyone else must follow. Their Itanium CPUs run at lower clock speeds than the P4s, so maybe some day they will change their marketing strategy. But for now MHz suits them, because they have the fastest chip.
 