speed comparison within current Pro lineup

bernuli

macrumors 6502a
Original poster
Oct 10, 2011
584
326
Hello,

I was wondering if anyone with access to a new Mac Pro in the dual-processor configuration, or the 3.33GHz single-processor one, could run a command for me.

The Apple Stores have the single-CPU 3.2GHz Quad-Core Intel Xeon in stock, so I am able to test that configuration. The following command completes in 29.6 seconds on the low-end Mac Pro.

uptime; time perl -e 'for (0 .. 1000000000) {}'

I would love to see how long this takes on the other available configurations.


B
 

derbothaus

macrumors 601
Jul 17, 2010
4,060
4
6-core 3.33GHz, 12GB Memory

1st run:
real 0m30.146s
user 0m30.115s
sys 0m0.022s

2nd run:
real 0m29.558s
user 0m29.552s
sys 0m0.004s

No amazing insight I can see. Single thread crank.
 

thepawn

macrumors 6502
May 27, 2009
413
7
On my 2009 2.66GHz Quad:

13:02 up 23 days, 10 hrs, 2 users, load averages: 0.42 0.38 0.31
real 0m37.728s
user 0m37.477s
sys 0m0.006s
Tron:~ daniel$
 

bernuli

macrumors 6502a
Original poster
Oct 10, 2011
584
326
Not super amazing on its own.

But if you run 2 of these at the same time, they should both complete in the same amount of time as a single run. The store Mac Pro is a quad-core, and 4 at a time all complete in 30.38 seconds. When I run 5 at a time, it slows to 37.07 seconds, and it goes downhill from there of course.

Could you do 5 at a time, then 6, then 7 for me? You can use multiple terminal windows or run the command with an & on the end, like so:

time perl -e 'for (0 .. 1000000000) {}'&

Then press the up arrow and Return 5 times, quickly.
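
A minimal sketch of the same idea as a script, so the batch does not depend on how quickly the keys are pressed. The function name `run_batch` and the whole-second timing via `date +%s` are assumptions made here for illustration; prefixing each copy with `time`, as in the command above, gives finer per-process figures.

```shell
#!/bin/sh
# Sketch: start N copies of the empty Perl loop at once and report how long
# the whole batch takes. run_batch is a hypothetical helper name.
run_batch() {
    n=$1
    count=${2:-1000000000}   # loop bound; defaults to the value used in this thread
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        perl -e "for (0 .. $count) {}" &   # & puts each copy in the background
        i=$((i + 1))
    done
    wait                     # block until every background copy has exited
    end=$(date +%s)
    echo "$n copies: $((end - start)) s"
}
```

For example, `for n in 5 6 7; do run_batch "$n"; done` would run the three batch sizes asked for above, one after another.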


B

deconstruct60

macrumors G3
Mar 10, 2009
8,635
1,554
bernuli said:
The following command completes in 29.6 seconds on the low end Mac Pro.

uptime; time perl -e 'for (0 .. 1000000000) {}'

I would love to see how long this takes on the other available configurations.

Not sure why. It is basically going to tell you what the clock speed of the CPU is. You can find that on a data sheet. All you are doing is loading a loop into the L1 cache and executing it. The vast majority of normal apps execute outside of the L1/L2/L3 cache. That's where the Mac Pro is differentiated: I/O outside of memory.

The Mac Mini that turbos up to 3.6GHz is probably the best bang-for-the-buck for this benchmark.


If the Perl optimizer had any brains, it would take almost 0.00 ms, since the loop doesn't do anything. The whole loop can be optimized away.
 

deconstruct60

macrumors G3
Mar 10, 2009
8,635
1,554
bernuli said:
Could you do 5 at a time, then 6 then 7 for me? You can do multiple terminal windows or run the command with an & on the end so:

5 at a time only expands it to an L2/L3 cache problem. Still just measuring clock rate.
 

bernuli

macrumors 6502a
Original poster
Oct 10, 2011
584
326
I am not sure that a faster clock speed directly relates to a faster time. Maybe I should look at the spec sheets, but there is lots of info in those sheets, and I am trying to get a real-world speed comparison. Though you seem to be pointing out that this test is not exactly real-world.


B
 

Sirobin

macrumors 6502
May 6, 2008
340
20
California
This is on my 2012 12-core (2.66 GHz) with 20 GB of RAM, all running at the same time.

Code:
real	0m32.695s
user	0m32.673s
sys	0m0.018s

real	0m32.751s
user	0m32.715s
sys	0m0.033s

real	0m32.679s
user	0m32.652s
sys	0m0.023s

real	0m32.684s
user	0m32.666s
sys	0m0.014s

real	0m32.754s
user	0m32.724s
sys	0m0.026s

real	0m32.666s
user	0m32.653s
sys	0m0.010s

real	0m32.744s
user	0m32.723s
sys	0m0.017s

real	0m32.714s
user	0m32.684s
sys	0m0.026s

real	0m32.730s
user	0m32.689s
sys	0m0.037s
 

deconstruct60

macrumors G3
Mar 10, 2009
8,635
1,554
bernuli said:
I am not sure that faster clock speed directly relates to faster time.

Since the core of your benchmark primarily consists of just doing an addition operation:

i := i + 1

it is basically going to be driven by how long it takes to do that "instruction" in Perl. There is some overhead in starting up the Perl runtime (and shutting it down at the end), but by and large your benchmark just consists of that single expression: adding a small literal number to a single variable.

On most modern processors the addition instruction takes about one clock. The loop branching instructions you have wrapped around this expression will be just noise. The branch predictors will negate that impact by the 3rd-4th iteration of the loop. Since there is absolutely nothing inside the body of the loop the predictors will grab the calculation for the next iteration right away. So effectively the processor will sequentially execute the above expression.



bernuli said:
Maybe I should look at the spec sheets, but there is lots of info in those sheets,

It has very little to do with specs. It has much more to do with understanding what the program does. Namely, nothing substantive. You are basically asking the processor to do 1st or 2nd grade math. That typically happens at approximately clock speed.
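
As a rough cross-check of the "approximately clock speed" point, the timings posted in this thread can be converted into clock cycles per loop iteration. This is only a sketch: `cycles_per_iter` is a hypothetical helper, and it assumes each CPU ran at its nominal clock for the whole run (no Turbo Boost), which is probably not exactly true.

```shell
#!/bin/sh
# cycles_per_iter GHZ SECONDS [ITERATIONS]
# Hypothetical helper: clock cycles spent per loop iteration, assuming the CPU
# ran at GHZ for the entire run. ITERATIONS defaults to the 1000000001 passes
# made by `for (0 .. 1000000000) {}` (the range includes both ends).
cycles_per_iter() {
    awk -v ghz="$1" -v secs="$2" -v n="${3:-1000000001}" \
        'BEGIN { printf "%.1f\n", ghz * 1e9 * secs / n }'
}

cycles_per_iter 3.2  29.6     # 3.2GHz quad from the first post -> 94.7
cycles_per_iter 2.66 37.728   # 2009 2.66GHz quad               -> 100.4
cycles_per_iter 3.33 29.558   # 6-core 3.33GHz, 2nd run         -> 98.4
```

Under that assumption, all three machines land at roughly 95 to 100 cycles per iteration. One pass through the Perl loop costs far more than a single clock, but the cost in cycles is nearly the same on each machine, which is consistent with the run time tracking clock speed.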
 

bernuli

macrumors 6502a
Original poster
Oct 10, 2011
584
326
Wow, thanks for the response! So what you are saying is nothing happens in this loop, so the iteration is complete in no time. Then the program actually has to wait for the next clock cycle to do the next iteration? Makes sense.

The only thing I don't understand is that my MBP, a 2.5 GHz Core 2 Duo, does it in 48 seconds. My new mini is a 2.3 GHz i5 and does it in 32 seconds.

So it is faster with a slower clock rate. Is that because of the Turbo Boost to 2.9 GHz that the spec sheet (i5-2415M) talks about?

Thanks again for the explanation.


B



Inconsequential

macrumors 68000
Sep 12, 2007
1,977
1
This is more relevant than your single line of code:

http://www.primatelabs.com/blog/2012/11/imac-215-late-2012-benchmarks/

The 6-core 3.33GHz is fastest for single-threaded work; the 12-cores are fastest for apps that can take advantage of that many cores.
 

deconstruct60

macrumors G3
Mar 10, 2009
8,635
1,554
bernuli said:
.... My new mini is 2.3 GHz i5 and does it in 32 seconds.

So it is faster with a slower clock rate. Is that because of the turbo to 2.9 which the spec sheet (i5-2415M) talks about?

It is not a slower clock rate. Pragmatically, with the current AMD/Intel offerings, the 'base'/nominal clock rate is just an indication of the normal lower bound when Turbo cannot be leveraged. It is not the speed you will see on average. The CPU will operate over a range of frequencies between this lower bound and the "max" turbo speed. The frequency can be automatically adjusted over 30 times during these 30s runs. There is no single clock rate it will run at. If you want a rough estimate, pick halfway between the two.

Intel and AMD don't spec an average speed because what the average is will highly depend upon your workload. Simplistic stuff like this benchmark would be closer to max than min.
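
The "halfway between the two" rule of thumb is easy to apply to the mini's i5-2415M, using the 2.3 GHz base and 2.9 GHz turbo figures quoted earlier in this thread. A minimal sketch; `midpoint_ghz` is a hypothetical helper, and the result is only the rough estimate described above, not a measured average.

```shell
#!/bin/sh
# midpoint_ghz BASE MAX
# Hypothetical helper: the halfway point between base and max turbo clock.
midpoint_ghz() {
    awk -v base="$1" -v max="$2" 'BEGIN { printf "%.2f\n", (base + max) / 2 }'
}

midpoint_ghz 2.3 2.9   # i5-2415M figures quoted in this thread -> 2.60
```

For a simplistic single-threaded loop like this benchmark, the real figure is likely nearer the 2.9 GHz end than the midpoint.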


If multiple micro-architecture generations are involved, that will have a smaller impact too (e.g. faster memory) in this narrow "do a billion additions" context.
 