speed comparison within current Pro lineup

bernuli · Oct 24, 2012

Hello,

I was wondering if anyone who has access to a new Mac Pro with the dual-processor configuration or the single 3.33GHz could run a command for me.

The Apple Stores have in stock the single CPU 3.2GHz Quad-Core Intel Xeon, so I am able to test that configuration. The following command completes in 29.6 seconds on the low-end Mac Pro.

uptime; time perl -e 'for (0 .. 1000000000) {}'

I would love to see how long this takes on the other available configurations.

B

derbothaus · Oct 24, 2012

6-core 3.33GHz, 12GB Memory

1st run:
real 0m30.146s
user 0m30.115s
sys 0m0.022s

2nd run:
real 0m29.558s
user 0m29.552s
sys 0m0.004s

No amazing insight I can see. Single thread crank.

thepawn · Oct 24, 2012

On my 2009 2.66GHz Quad:

13:02 up 23 days, 10 hrs, 2 users, load averages: 0.42 0.38 0.31
real 0m37.728s
user 0m37.477s
sys 0m0.006s
Tron:~ daniel$

bernuli · Dec 1, 2012

Not super amazing on its own.

But if you run 2 of these at the same time, they should complete in the same amount of time. The store Mac Pro is a quad core, and 4 at a time all complete in 30.38 seconds. When I do 5 at a time, it slows to 37.07 seconds, and down from there of course.
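As a rough sanity check on those numbers, here is a small Python sketch of the simplest scaling model: nothing slows down until there are more jobs than cores, and after that the wall time grows with jobs per core. It assumes (this is an assumption, not something measured in this thread) that each copy keeps one core fully busy and that cores are shared evenly; Hyper-Threading and Turbo are ignored. The function name `expected_wall_time` is made up for illustration.

```python
def expected_wall_time(single_run_s, n_jobs, n_cores):
    """Estimate wall time when n_jobs identical CPU-bound jobs share n_cores.

    Assumption: the jobs are independent and cores are shared evenly,
    so the time only grows once there are more jobs than cores.
    """
    if n_jobs <= 0 or n_cores <= 0:
        raise ValueError("n_jobs and n_cores must be positive")
    return single_run_s * max(1.0, n_jobs / n_cores)


if __name__ == "__main__":
    # 30.38 s is the 4-at-a-time figure from the store's quad-core Mac Pro.
    for jobs in (4, 5, 6, 7):
        estimate = expected_wall_time(30.38, jobs, 4)
        print(f"{jobs} jobs: about {estimate:.2f} s")
```

For 5 jobs on 4 cores this model gives roughly 38 s, a little above the 37.07 s measured, so the even-sharing assumption is only approximate.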

Could you do 5 at a time, then 6 then 7 for me? You can do multiple terminal windows or run the command with an & on the end so:

time perl -e 'for (0 .. 1000000000) {}'&

then press the up arrow and Return 5 times, quickly
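If juggling terminal windows gets fiddly, the same idea can be sketched in Python: start several copies of the loop at once and time how long it takes for all of them to finish. This is only a sketch, assuming `perl` is on the PATH; the names `PERL_LOOP` and `time_batch` are invented here, and it reports one wall time for the whole batch rather than the per-process real/user/sys lines that `time` prints.

```python
import subprocess
import time

# The one-liner from this thread, split into an argument list.
PERL_LOOP = ["perl", "-e", "for (0 .. 1000000000) {}"]


def time_batch(cmd, copies):
    """Start `copies` instances of cmd at once; return seconds until all have exited."""
    start = time.perf_counter()
    # Launch everything first so the copies really overlap.
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    for proc in procs:
        proc.wait()
    return time.perf_counter() - start


# Example (takes as long as the benchmark itself):
#   print(f"5 copies: {time_batch(PERL_LOOP, 5):.2f} s")
```

Calling `time_batch(PERL_LOOP, 5)`, then 6, then 7 would correspond to the runs asked for above; nothing is executed until you call it.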

B

deconstruct60 · Dec 1, 2012

bernuli said:
The following command completes in 29.6 seconds on the low end Mac Pro.

uptime; time perl -e 'for (0 .. 1000000000) {}'

I would love to see how long this takes on the other available configurations.

Not sure why. It is basically going to tell you what the clock speed of the CPU is. You can find that on a data sheet. All you are doing is loading a loop up inside of the L1 cache and executing it. The vast majority of normal apps execute outside of the L1/L2/L3 cache. That's where the Mac Pro is differentiated: I/O outside of memory.

The Mac Mini that turbos up to 3.6GHz is probably the best bang-for-the-buck for this benchmark.

If the perl optimizer had any brains it would take almost 0.00 ms since it doesn't do anything. The whole loop can be optimized away.

deconstruct60 · Dec 1, 2012

bernuli said:
Could you do 5 at a time, then 6 then 7 for me? You can do multiple terminal windows or run the command with an & on the end so:

5 at a time only expands it to an L2/L3 cache problem. Still just measuring clock rate.

bernuli · Dec 1, 2012

deconstruct60 said:
Not sure why. It is basically going to tell you what the clock speed of the CPU is. You can find that on a data sheet. All you are doing is loading a loop up inside of the L1 cache and executing it. The vast majority of normal apps execute outside of the L1/L2/L3 cache. That's where the Mac Pro is differentiated: I/O outside of memory.

The Mac Mini that turbos up to 3.6GHz is probably the best bang-for-the-buck for this benchmark.

If the perl optimizer had any brains it would take almost 0.00 ms since it doesn't do anything. The whole loop can be optimized away.

I am not sure that faster clock speed directly relates to faster time. Maybe I should look at the spec sheets, but there is lots of info in those sheets, and I am trying to get a real world speed comparison. Though you seem to be pointing out that this test is not exactly real world.

B

Sirobin · Dec 1, 2012

This is on my 2012 12-core (2.66 GHz) with 20 GB of RAM, all running at the same time.

Code:

real	0m32.695s
user	0m32.673s
sys	0m0.018s

real	0m32.751s
user	0m32.715s
sys	0m0.033s

real	0m32.679s
user	0m32.652s
sys	0m0.023s

real	0m32.684s
user	0m32.666s
sys	0m0.014s

real	0m32.754s
user	0m32.724s
sys	0m0.026s

real	0m32.666s
user	0m32.653s
sys	0m0.010s

real	0m32.744s
user	0m32.723s
sys	0m0.017s

real	0m32.714s
user	0m32.684s
sys	0m0.026s

real	0m32.730s
user	0m32.689s
sys	0m0.037s

deconstruct60 · Dec 3, 2012

bernuli said:
I am not sure that faster clock speed directly relates to faster time.

The core of your benchmark primarily consists of just doing an addition operation:

i := i + 1

So it is basically going to be driven by how long it takes to do that "instruction" in Perl. There is some overhead in starting up the Perl runtime (and shutting it down at the end), but by and large your benchmark just consists of that single expression: adding a single small literal number to a single variable.

On most modern processors the addition instruction takes about one clock. The loop branching instructions you have wrapped around this expression will be just noise. The branch predictors will negate that impact by the 3rd-4th iteration of the loop. Since there is absolutely nothing inside the body of the loop the predictors will grab the calculation for the next iteration right away. So effectively the processor will sequentially execute the above expression.

bernuli said:
Maybe I should look at the spec sheets, but there is lots of info in those sheets,

It has very little to do with specs. It has much more to do with understanding what the program does. Namely, nothing substantive. You are basically asking the processor to do 1st or 2nd grade simple math. That typically happens at approximately clock speed.
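One way to put numbers on this is to divide each reported time by the iteration count and see how many clock cycles a single pass of the loop costs. The Python sketch below does that for three results posted earlier in this thread. It assumes the nominal clock from each post (Turbo is ignored, which is a simplification), and the names `RUNS` and `cycles_per_iteration` are made up for illustration.

```python
ITERATIONS = 1_000_000_001  # the range 0 .. 1000000000 is inclusive

# (label, nominal clock in GHz, "real" seconds) as reported in this thread.
# Assumption: the CPU ran at its nominal clock for the whole run.
RUNS = [
    ("quad 3.2 GHz", 3.2, 29.6),
    ("6-core 3.33 GHz", 3.33, 29.558),
    ("2009 quad 2.66 GHz", 2.66, 37.728),
]


def cycles_per_iteration(clock_ghz, seconds, iterations=ITERATIONS):
    """Nominal clock cycles spent per loop iteration."""
    return clock_ghz * 1e9 * seconds / iterations


if __name__ == "__main__":
    for label, ghz, secs in RUNS:
        cycles = cycles_per_iteration(ghz, secs)
        print(f"{label}: {cycles:.1f} cycles per iteration")
```

All three land at roughly 95 to 100 nominal cycles per iteration. That near-constant figure fits the point that this loop mostly tracks clock speed, though it also suggests each pass through the Perl loop costs a good deal more than one machine-level add.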

bernuli · Dec 3, 2012

Wow, thanks for the response! So what you are saying is nothing happens in this loop, so the iteration is complete in no time. Then the program actually has to wait for the next clock cycle to do the next iteration? Makes sense.

The only thing I don't understand is that my MBP, a 2.5 GHz Core 2 Duo, does it in 48 seconds. My new mini is a 2.3 GHz i5 and does it in 32 seconds.

So it is faster with a slower clock rate. Is that because of the turbo to 2.9 which the spec sheet (i5-2415M) talks about?

Thanks again for the explanation.

B

Inconsequential · Dec 4, 2012

This is more relevant than your single line of code:

http://www.primatelabs.com/blog/2012/11/imac-215-late-2012-benchmarks/

The 6-core 3.33GHz is fastest for single-threaded work, then the 12-cores are fastest for apps that can take advantage of that many cores.

GermanyChris · Dec 4, 2012

Here's mine

bernuli · Dec 4, 2012

GermanyChris said:
Here's mine

19 seconds? What is that on? Fastest I have seen is 25.55 on the new iMac 2.9 GHz Intel Core i5.

B

All Taken · Dec 4, 2012

bernuli said:
19 seconds? What is that on? Fastest I have seen is 25.55 on the new iMac 2.9 GHz Intel Core i5.

B

He's running a hackintosh.

GermanyChris · Dec 4, 2012

yes..

I'd guess close to 30 on my MP.

deconstruct60 · Dec 5, 2012

bernuli said:
.... My new mini is 2.3 GHz i5 and does it in 32 seconds.

So it is faster with a slower clock rate. Is that because of the turbo to 2.9 which the spec sheet (i5-2415M) talks about?

It is not a slower clock rate. Pragmatically, with the current AMD/Intel offerings, the 'base'/nominal clock rate is just an indication of the normal lower bound when Turbo cannot be leveraged. It is not the speed you will see on average. The CPU will operate over a range of frequencies between this lower bound and the "max" Turbo speed. The frequency can be automatically adjusted more than 30 times during these 30s runs. There is no single clock rate it will run at. If you want a rough estimate, pick halfway between the two.

Intel and AMD don't spec an average speed because what the average is will highly depend upon your workload. Simplistic stuff like this benchmark would be closer to max than min.

Differences between micro-architecture generations will also have an impact, though a smaller one (faster memory, for example), in this narrow "do a billion additions" context.
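The "halfway between the two" rule of thumb is easy to write down. The Python sketch below applies it to the 2.3 GHz base and 2.9 GHz Turbo figures quoted above for the i5-2415M, and adds a weighted variant for workloads that sit closer to max Turbo. Both function names and the `turbo_fraction` parameter are invented for illustration, and treating the CPU as switching between just two speeds is a simplifying assumption.

```python
def midpoint_clock_ghz(base_ghz, turbo_ghz):
    """Rough effective-clock guess: halfway between base and max Turbo."""
    return (base_ghz + turbo_ghz) / 2


def weighted_clock_ghz(base_ghz, turbo_ghz, turbo_fraction):
    """Effective clock if turbo_fraction (0..1) of the run is spent at max Turbo.

    Assumption: the CPU is modelled as being either at base or at max Turbo.
    """
    if not 0.0 <= turbo_fraction <= 1.0:
        raise ValueError("turbo_fraction must be between 0 and 1")
    return base_ghz + (turbo_ghz - base_ghz) * turbo_fraction


if __name__ == "__main__":
    print(f"midpoint: {midpoint_clock_ghz(2.3, 2.9):.2f} GHz")
    print(f"90% at Turbo: {weighted_clock_ghz(2.3, 2.9, 0.9):.2f} GHz")
```

The midpoint for that chip comes out at about 2.6 GHz; a simple loop like this benchmark would be expected to sit nearer the Turbo end of the range.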

GermanyChris · Dec 5, 2012

bernuli said:
19 seconds? What is that on? Fastest I have seen is 25.55 on the new iMac 2.9 GHz Intel Core i5.

B

It was the HMMWV
2700k@4.6
560Ti 448
32GB DDR3 1600
10.8.2

Lance-AR · Dec 5, 2012

Mid 2011 Mini i5 2.3GHz
real 0m30.966s
user 0m30.851s
sys 0m0.070s

GermanyChris · Dec 6, 2012

Here is the 8 core MP 1,1
