
compute

macrumors regular
Original poster
Jun 11, 2013
125
13
I was wondering about this, since there was not a single word about 8 and 12 core machines, and they are not listed on the Mac Pro site either.

The quad core and 6 core say "Available in December". No mention of 8 or 12 cores... If they were also going to be available by December, wouldn't Apple have put them online with an "Available in December" tag too?

Do you guys think this means that it could still take months for the 8 or 12 core, since it's not even mentioned?
 

jetjaguar

macrumors 68040
Apr 6, 2009
3,553
2,319
somewhere
I was wondering about this, since there was not a single word about 8 and 12 core machines, and they are not listed on the Mac Pro site either.

The quad core and 6 core say "Available in December". No mention of 8 or 12 cores... If they were also going to be available by December, wouldn't Apple have put them online with an "Available in December" tag too?

Do you guys think this means that it could still take months for the 8 or 12 core, since it's not even mentioned?

They can be configured that way; you just can't see the prices yet. Look under Tech Specs and you will see them.
 

Umbongo

macrumors 601
Sep 14, 2006
4,934
55
England
I was wondering about this, since there was not a single word about 8 and 12 core machines, and they are not listed on the Mac Pro site either.

The quad core and 6 core say "Available in December". No mention of 8 or 12 cores... If they were also going to be available by December, wouldn't Apple have put them online with an "Available in December" tag too?

Do you guys think this means that it could still take months for the 8 or 12 core, since it's not even mentioned?

There aren't 8 and 12 core "machines"; those are just build-to-order options.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,298
3,893
I was wondering about this, since there was not a single word about 8 and 12 core machines, and they are not listed on the Mac Pro site either.

Eh?

"... Configurable to 3.5GHz 6-core processor with 12MB L3 cache, 3.0GHz 8-core processor with 25MB L3 cache, or 2.7GHz 12-core processor with 30MB L3 cache ... "
http://www.apple.com/mac-pro/specs/

Frankly, look at the performance page:

http://www.apple.com/mac-pro/performance/

The 12 core is the only configuration they benchmark against (see the footnotes at the bottom of the page). I think the absence of 4 or 6 core "old vs new" benchmarks actually says more, and those are the configurations that are listed.


I doubt they want to put the prices up because they are going to be dramatically higher. That also means fewer folks are going to buy them. Hence, they won't be stocked in retail stores or at several 3rd party sellers.

Probably > $2.2K over base to move to 8 cores and > $3.3K to move to 12 cores.
 

slughead

macrumors 68040
Apr 28, 2004
3,107
237
Probably > $2.2K over base to move to 8 cores and > $3.3K to move to 12 cores.

Out of curiosity, how much of a performance boost do you get from having 12 cores on 1 die as opposed to dual hex cores?

I wonder if it'd be worth losing the 40 extra lanes of PCIe and paying so much more.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,298
3,893
Out of curiosity, how much of a performance boost do you get from having 12 cores on 1 die as opposed to dual hex cores?

There are more factors than just core count. It depends on what the shortest paths to bottlenecks are. For this particular 12 core versus 6 core comparison, the price is different largely because the base clock speed is much higher: one E5 v2 2.7-3.5 GHz 12 core versus two E5 v2 2.6-3.1 GHz 6 cores. For a mixed workload where the user drifts into the upper half of the dynamic range, the single 12 core will likely do better (it has a wider dynamic range and higher memory throughput on a single channel).

For workloads that are more I/O bound with a larger RAM footprint, the dual setup has twice the memory I/O throughput. A workload aggregating box typically has little problem using that throughput up. A single user temporarily staring at the screen deciding what to do next probably won't.
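As a rough illustration of the "twice the memory I/O throughput" point, here is a back-of-the-envelope sketch of theoretical peak bandwidth. The channel count (4 per socket) and memory speed (DDR3-1866) are assumptions made for the example, not figures from this thread, and `peak_bandwidth_gbs` is just a made-up helper name. Sustained throughput in practice will be lower than the peak.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s for one CPU package.

    Each DDR3 channel is 64 bits wide, so it moves 8 bytes per transfer.
    """
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumed configuration: 4 channels of DDR3-1866 per socket (illustrative only).
single_package = peak_bandwidth_gbs(channels=4, mega_transfers_per_s=1866)

# A dual package box has two independent memory controllers, so the peak doubles.
dual_package = 2 * single_package

print(f"single package peak: {single_package:.1f} GB/s")
print(f"dual package peak:   {dual_package:.1f} GB/s")
```

Whether a workload can actually use the second controller's bandwidth is the real question, which is the point about the single user staring at the screen.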


I wonder if it'd be worth losing the 40 extra lanes of PCIe and paying so much more.

Paying more is not a problem if you get more. Is giving up lanes for base clock speed a good trade-off? For many, yes.

However, if it is primarily about cost and embarrassingly parallel workloads, then it is really about the 2 x 6 versus 2 x 10 cost differential. The single 12 core is a diversion from the real issue.
 

zerocool42

macrumors newbie
Apr 1, 2013
15
0
For workloads that are more I/O bound with a larger RAM footprint, the dual setup has twice the memory I/O throughput. A workload aggregating box typically has little problem using that throughput up. A single user temporarily staring at the screen deciding what to do next probably won't.

Actually, workloads that rely heavily on interprocess communication, or use a large amount of working memory will benefit from the single 12 core configuration from an architecture point of view.

The dual processor Nehalem/Westmere Mac Pros are actually NUMA systems. However, OS X is not NUMA aware, so the system is run with NUMA disabled and the memory striped across the nodes. Think of it like RAID 0, except with your memory addresses mapped across the nodes. This means that half of your memory accesses will take a latency hit, as they have to cross the QPI bus and hit the other processor's memory controller. The performance impact of this isn't huge; I've found about 6-12% doing normal workstation tasks. The advantage, though, is that any process, NUMA aware or not, can use all of the available memory in the machine with predictable performance. The single die 12 core doesn't have this problem.
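The striping described above can be sketched as a toy model. Everything numeric here is an assumption for illustration: the stripe size, the two latency values, and the helper names `home_node` and `average_latency` are made up, not measurements or APIs from any real Mac Pro. It only shows why roughly half of the accesses end up remote when addresses are interleaved across two nodes.

```python
# Hypothetical figures, chosen only to make the arithmetic easy to follow.
LOCAL_NS = 70.0     # assumed latency to the local memory controller
REMOTE_NS = 100.0   # assumed latency when crossing QPI to the other node
NODES = 2
STRIPE_BYTES = 64   # assumed interleave granularity (one cache line)


def home_node(address, stripe_bytes=STRIPE_BYTES, nodes=NODES):
    """Node whose memory controller owns this address under interleaving."""
    return (address // stripe_bytes) % nodes


def average_latency(addresses, cpu_node):
    """Return (remote fraction, mean latency in ns) for accesses from cpu_node."""
    total = len(addresses)
    remote = sum(1 for a in addresses if home_node(a) != cpu_node)
    local = total - remote
    mean_ns = (local * LOCAL_NS + remote * REMOTE_NS) / total
    return remote / total, mean_ns


# Walk 1024 consecutive cache lines from a core sitting on node 0.
addresses = range(0, 64 * 1024, 64)
remote_fraction, mean_ns = average_latency(addresses, cpu_node=0)
print(f"remote accesses: {remote_fraction:.0%}, mean latency: {mean_ns:.1f} ns")
```

The raw latency penalty in a model like this is not the same thing as the application slowdown, since caches absorb most accesses before they ever reach a memory controller.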

However, a low clocked 12 core is still going to lose to a higher clocked 2p 6 core config most of the time.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,298
3,893
Actually, workloads that rely heavily on interprocess communication, or use a large amount of working memory will benefit from the single 12 core configuration from an architecture point of view.

The dual processor Nehalem/Westmere Mac Pros are actually NUMA systems. However, OS X is not NUMA aware, so the system is run with NUMA disabled and the memory striped across the nodes.

OS X may not be NUMA aware, but dual CPU package systems mean at least 4 more DIMM slots. For applications that use a large amount of working memory, having a large amount of memory is more important than nuanced NUMA performance tweaks.

But yes, for memory sizes that can easily be matched by both the single and dual CPU package systems, there are some minor advantages.



This means that half of your memory accesses will take a latency hit, as they have to cross the QPI bus and hit the other processor's memory controller.

The problem is that this is somewhat negated in the E5 systems, because both QPI links are used to hook the CPUs together. There is no QPI link to the IOHub chip anymore, so there is twice as much bandwidth between the CPUs as before. So yes, you have to cross QPI, but there are twice as many QPI links engaged in delivery.

The 12 core chip also has NUMA aspects internally.

[Image: OverviewIVB3dies_575px.png, from the AnandTech article on the 12 core: http://www.anandtech.com/show/7285/intel-xeon-e5-2600-v2-12-core-ivy-bridge-ep ]

There are three internal loops there. Frankly, OS X's slacker approach to dealing with NUMA will have a small impact there too, because point-to-point accesses are not going to be uniformly even there either (e.g., some L3 cache accesses are not going to take the same time). A single CPU package doesn't mean you don't have to deal with NUMA issues. If Apple wants to keep its head buried in the sand about NUMA issues long term, it has to stick solely with mainstream Intel CPUs. They can kick the can down the road, but it is coming eventually.
 

zerocool42

macrumors newbie
Apr 1, 2013
15
0
deconstruct60 said:
The 12 core chip also has NUMA aspects internally.

But the single socket IB-E system that Apple is selling is not a NUMA system; the OS only sees one memory node. Intel has been using the internal ring architecture since Sandy Bridge. While memory and cache latencies differ per core depending on where the core sits on the ring, the difference here is measurable in clock cycles. Long, long gone are the days of being able to predict the execution time of any code on x86 systems in terms of clock cycles anyway, so the end result of this is relatively meaningless.
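To put the "depending on where they sit on the ring" point in concrete terms, here is a deliberately simplified sketch: a single bidirectional ring with an assumed 12 stops, where a request takes the shorter way around. This is not the actual layout of the 12 core die (which has more than one ring), and `ring_hops` is a made-up helper; it only shows that the distance to a given cache slice varies by a handful of hops from core to core.

```python
def ring_hops(src, dst, stops):
    """Shortest hop count between two stops on a bidirectional ring."""
    distance = abs(src - dst) % stops
    return min(distance, stops - distance)


STOPS = 12  # assumed number of ring stops, for illustration only

# Hop counts from the core at stop 0 to every stop on the ring.
hops_from_0 = [ring_hops(0, stop, STOPS) for stop in range(STOPS)]
worst = max(hops_from_0)
average = sum(hops_from_0) / STOPS

print("hops from stop 0:", hops_from_0)
print(f"worst case: {worst} hops, average: {average:.1f} hops")
```

If each hop costs on the order of a clock cycle or so, the spread between the nearest and farthest slice stays in the range of cycles rather than the much larger gap of going off package.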

Despite the increased QPI bandwidth my experience with the E5 platform has been that there is a greater performance hit than Nehalem had when needing to access memory between nodes on 2p systems, and the 4p is absolutely atrocious. I've only been using the 4p systems for virtualization hosts, which is what I think Intel was targeting with this architecture. NUMA awareness is absolutely required on that platform, which really limits their usefulness if you want to get the most out of them.

Apple made the right choice leaving the new Mac Pro 1p only this time.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,298
3,893
But the single socket IB-E system that Apple is selling is not a NUMA system, the OS only sees one memory node.

Non Uniform Memory Access is Non Uniform Memory Access, period. It technically has nothing to do with the number of sockets. Yes, historically there is a correlation, but Non Uniform is simply Non Uniform. If a design pushes non uniform access inside a single package, then it too will take on NUMA aspects that need to be addressed by the OS to achieve more optimal performance.

Apple made the right choice leaving the new Mac Pro 1p only this time.

That has more to do with 12 cores being 'enough for most people' and going to a simpler, more focused central infrastructure design (that is easier for them to do given the resources allocated). Long term, that doesn't necessarily sweep slacker NUMA implementation issues under the rug. It does allow them to kick the can down the road a bit longer, but the steadily increasing core count is going to surface the problem over the long term.
 