
Great background, but it misses the point. What Intel is saying (and what Grand Central seems to go after) is increasing the percentage that is accelerated by multiple cores.

DOS got zero speedup from multiple cores. As MP systems became available, a few very CPU-intensive tasks were written to use multiple processors, and more software is written that way now. Intel is saying that MOST things should be written in such a way as to benefit from multiple processors. The closer Grand Central gets to that ideal, the better for Apple.

The problem is that programmers will very quickly hit a wall where tasks cannot be run in parallel because they depend on data output from one another. A program can only have so many parallel tasks running at once, and I hardly think this will scale to utilizing hundreds or thousands of cores.

Only partially true. If properly written, much of the needed data will be in cache and accessible all the time. That's clearly one of the things that Intel wants people keeping in mind. But it really depends on what percentage of the time the processor NEEDS to be waiting for data.

Nonetheless, Intel's point is that you need to make everything that could possibly benefit work with massive numbers of processors. If there's something that can't benefit, then it can't be helped, but that's no excuse for failing to do what you can.
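To make the dependency question concrete, here is a minimal C sketch (the loops and data are made up purely for illustration). The first function is a chain where every iteration needs the previous result, so extra cores cannot help it; the second function's iterations are completely independent, so they could be split across as many cores as are available.

```c
#include <stdio.h>

#define N 1000000

/* Serial dependency chain: iteration i needs the result of iteration i-1,
 * so this loop cannot be split across cores. */
static double dependent_chain(const double *x, int n) {
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc = acc * 0.5 + x[i];      /* consumes the previous value of acc */
    return acc;
}

/* Independent work: no iteration depends on any other, so the loop can be
 * cut into as many chunks as there are cores. */
static void independent_scale(double *x, int n) {
    for (int i = 0; i < n; i++)
        x[i] = x[i] * 2.0 + 1.0;
}

int main(void) {
    static double data[N];
    for (int i = 0; i < N; i++)
        data[i] = (double)i;

    independent_scale(data, N);                              /* parallelizable part */
    printf("chain result: %f\n", dependent_chain(data, N));  /* serial part */
    return 0;
}
```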

This seems like a grand concept, but until developers can exploit the full potential of these multicore chips there is no point investing in such high-end multi-processor systems.

That is not true. First, almost every real-world computer program has SOME tasks that can be sped up. Second, even if you have a task that is truly serial, with no possible gain from MP, extra cores still allow you to run other tasks at full speed without impacting the speed of the first task.

There is Amdahl's Law, which puts a limit on how many cores you can use effectively.

You're again missing the key point. Amdahl's Law gives the limit on speed gain for a given level of 'parallelizability'. If you change the percentage of the work that benefits from multiple cores, the speed gain improves. Sure, if 5% of the work is truly impossible to make parallel, then we're limited to a 20x speedup. I guess I can live with that. But Intel's and Apple's point is that current software design may leave 50% of the code unable to benefit from MP, when different coding would make that percentage drop - improving the speed gains.

The real lesson of Amdahl's Law is not the limit to the speed gains. It is that you need to design your code so that only a very small percentage does not use multiple processors.

Hardware designers will also experience difficulties creating memory switch architectures that can handle 16+ cores accessing the same memory.

Which is why Intel is giving them advance warning.

I appreciate the Wiki entry on Amdahl's law, but the page has a serious flaw: it assumes that the smallest chunk a task can be divided into is 1/20 of the task (i.e., 95% parallelizable). Naturally, that corresponds to an (almost) maximum speedup of 20x. If a task could be divided into 100 chunks, then the maximum speedup by the same formula would be 100x.

Here's a list of different sized chunks along with the corresponding speedup using 512 processors:
2000 chunks = 407x speedup
1000 chunks = 338x speedup
500 chunks = 253x speedup
100 chunks = 83x speedup
50 chunks = 45x speedup
20 chunks = 19x speedup

Clearly, the big challenge is how to solve the problem of making the chunks smaller. It seems impossible, but I doubt that it is.
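For anyone who wants to check the numbers in the list above, here is a short C sketch (illustrative only) that plugs each chunk count into the standard Amdahl's law formula, speedup = 1 / (s + (1 - s)/N), treating one chunk as the serial fraction s = 1/chunks and using N = 512 processors. The printed values match the list once the fractions are truncated.

```c
#include <stdio.h>

/* Amdahl's law: speedup = 1 / (s + (1 - s)/n), where s is the serial
 * fraction of the work and n is the number of processors. */
static double amdahl(double serial_fraction, int processors) {
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors);
}

int main(void) {
    const int cores = 512;
    const int chunks[] = { 2000, 1000, 500, 100, 50, 20 };
    const int count = (int)(sizeof chunks / sizeof chunks[0]);

    for (int i = 0; i < count; i++) {
        double s = 1.0 / chunks[i];   /* the one un-splittable chunk */
        printf("%4d chunks = %dx speedup on %d cores\n",
               chunks[i], (int)amdahl(s, cores), cores);
    }
    return 0;
}
```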

Exactly- finally someone gets it. The whole point of Intel's announcement is getting developers thinking differently so that a much greater percentage of their code is written in small MP capable chunks.

Indeed, they do. But that doesn't change the fact that they need to sell chips, and the only chips they can make are the kind that need parallel code to be useful. So they are going to sell them until people don't want them, or a new use for them is discovered. Today, they are useless.

This is silly. Ask anyone who uses several apps at the same time if multicore processors are useless. Or a scientist or engineer doing heavy computations.

-unless the program you are using is optimized for dual, quad, octo, or more cores, your processing speeds will not drastically improve with more cores.
-this is half the reason why Windows runs like a dog; Windows is not, and never will be, optimized for each and every computer configuration.

Which is exactly the point of Intel's announcement and Grand Central. It's time for developers to start thinking in terms of large numbers of cores.
 
Intel does make a better chip line - it's called Itanium. The problem with that line is that it doesn't natively run x86 code in hardware, so x86 performance is poor. It also depends on very strong compilers, so all the sloppy programming now done on the x86 platform just wouldn't work when coding for Itanium, which makes application development more expensive in both time and money.

Intel and IBM both hit the wall on raw clock speed with NetBurst and POWER4, and both moved to multi-core CPUs for their successor generations. POWER6 is posting impressive clock-speed numbers, but folks are getting some crazy numbers out of Core processors as well.

And AMD can't find their arse with both hands right now. They're praying for a sugar-daddy to come and bail them out so they can pay the bills. Their ASPs are being gutted by Intel and they offer nothing in the same league as Intel's top-tier CPUs, so Intel can cut their AMD-equivalent CPU prices to bleed them and still generate shedloads of cash on their high-end CPUs through high prices. They've had a nightmare of a time moving to 45nm (same with their move to 65nm), but on the plus side their 45nm chips can work with both DDR2 and DDR3 so folks can drop them in their existing AM2+ motherboards and do not have to move to AM3 which should help adoption.
 
We have hit a 3-4 GHz speed wall with processors. So what do they do? They make more of them. But our current OS and programs don't support multi-core.

So it's July 2008, multicore processors have been out for about 3 years... and we are finally getting around to supporting this. I am so glad that Apple finally saw the writing on the wall.

As far as I'm concerned we are 3 years late on this. It should have been implemented in Tiger, the day the first multicore/Intel system came out. It's not just Apple to blame here... it's Intel. Why did you wait so long? They were selling SNAKE OIL all this time.
 
So in the end it almost looks like IBM/Toshiba/Sony made the right choice with Cell. Intel will find that the Cell design will be much easier to scale.
 
Why is Intel to blame? They released multicore processors as fast as they were able to make them.

As for 'snake oil', that's nonsense. There is clearly a benefit to multicore processors even today. Not as much as they are theoretically capable of, but users already benefit significantly - particularly people who run more than one app at a time.
 
Snake oil - the expression is also applied to any product with exaggerated marketing but questionable or unverifiable quality.

While the Intel CPUs are great, Intel has been promoting something that the programmers and the OS providers have not fully taken advantage of.
 
Smaller chunks

No, you're just underestimating the problem, because the Wikipedia graphic uses a logarithmic X axis but a linear Y axis. The problem is the 5% that can't be parallelized, not how many chunks you slice the 95% that can be parallelized into. If 5% can't be parallelized, then a 20x speedup is the limit, and it takes a tremendous number of cores to get that last 1x of speedup. Practically speaking, you would stop at 64 cores for a 15.5x speedup.

This graphic shows a problem where 90% is parallelizable and 10% is serial. The theoretical maximum speedup is 10x, with seriously diminishing returns already at 8 cores (~4.7x speedup) and only about 9.2x even at 100 cores; the 'Linear' curve is perfect parallelization.
[Image: Amdahl-law.jpg - speedup vs. number of processors]

You're certainly right about what you say, and I was unclear. 95% parallel corresponds to 20 chunks (the remaining 5% sets the size of the largest "chunk", and 5% goes into 100% twenty times).

99% in parallel means that 1% remains, and 1% goes into 100% a hundred times, which corresponds to 100 chunks in parallel.

This shows up in the formula: speedup = 1 / ((1 - P) + P/N), where P is the parallel fraction (95% or 99% in the cases above) and N is the number of processors. As N grows large, the P/N term goes to zero, so the 95% case tops out at a 20x increase and the 99% case tops out at a 100x increase. Both limits correspond directly to the number of "chunks" that run in parallel. The larger the percentage, the smaller the chunks and the greater the speedup.

I'm sorry for the confusing earlier post, and I hope this clarifies my point a little. (My point is that smart software engineers need to figure out how to increase the number of parallel processes ("chunks") that can be run, which in turn implies making the chunks smaller.)
 
Snake oil - the expression is also applied to any product with exaggerated marketing but questionable or unverifiable quality.

Yes, I know what snake oil is. I'm just saying that it doesn't apply to Intel. The Core Duo advertising says it has two cores. It does. Intel says that it will speed up many tasks. It does (some tasks come very close to twice as fast).

If Intel had claimed that it would make everything twice as fast, then you might have a complaint. But as it is, Intel's duo (and quad) processors are exactly what Intel claimed.
 
Most users don't need more cores; they just need better software.

I predict that with 1,000 cores, Office will be slower than Office 98 on a 333 MHz G3.
 
Great background, but it misses the point. What Intel is saying (and what Grand Central seems to go after) is increasing the percentage that is accelerated by multiple cores.

Intel & AMD have both been saying this for at least three years now. Nothing has changed, including that software still has trouble taking advantage of multiple cores.

The real lesson of Amdahl's Law is not the limit to the speed gains. It is that you need to design your code so that only a very small percentage does not use multiple processors.
The design of the code is dictated by the problem being solved; unless you've got a parallel problem, you're stuck.


Which is why Intel is giving them advance warning.
No, Intel is the hardware designer... they are going to have trouble creating systems where multiple cores can access memory quickly enough to keep all the cores running.

This is silly. Ask anyone who uses several apps at the same time if multicore processors are useless. Or a scientist or engineer doing heavy computations.
How about no one cares? How about I ask a Wal-Mart shopper what they think, because that is Intel's customer. In fact, it is kind of funny watching computer makers try to sell multicore by saying you can run multiple apps now.. As if you couldn't do that the last 15 years!

Which is exactly the point of Intel's announcement and Grand Central. It's time for developers to start thinking in terms of large numbers of cores.

Oh, NOW is the time? A year and a half ago, NSOperation was the hot multicore savior in Leopard. I said it would be useless, even without knowing anything other than it was an API. I was right. Now, Grand Central is the multicore savior, even though no one knows what it is. I'll tell you this much: it is nothing that isn't already known, and that means if the fix for your problem isn't already out there, Apple isn't going to change that.
 
What I'm hoping Grand Central is... is a bucket that single-threaded programs can point to. From there, Grand Central will divide up the work across the cores.
 
Do you think you will code a single-threaded app and Grand Central will "magically distribute it"? Sorry, but no.
 
Intel & AMD have both been saying this for at least three years now. Nothing has changed, including that software still has trouble taking advantage of multiple cores.

SOME software has trouble. Other software uses multiple cores just fine.


The design of the code is dictated by the problem being solved; unless you've got a parallel problem, you're stuck.

Partly true. But the point is that with a little effort, a lot of things can be parallelized that in the past were not.

No, Intel is the hardware designer... they are going to have trouble creating systems where multiple cores can access memory quickly enough to keep all the cores running.

And your qualifications for knowing more about this than all the engineers at Intel are.....?

How about no one cares? How about I ask a Wal-Mart shopper what they think, because that is Intel's customer. In fact, it is kind of funny watching computer makers try to sell multicore by saying you can run multiple apps now.. As if you couldn't do that the last 15 years!

So since the average Wal-mart shopper doesn't know what multiple cores are, then no one cares? What an inane argument. Lots of people do need multiple cores.

And no one says that the ability to run multiple apps is new. But multiple cores allow you to run multiple apps AT FULL SPEED which was not the case before.


Oh, NOW is the time? A year and a half ago, NSOperation was the hot multicore savior in Leopard. I said it would be useless, even without knowing anything other than it was an API. I was right. Now, Grand Central is the multicore savior, even though no one knows what it is. I'll tell you this much: it is nothing that isn't already known, and that means if the fix for your problem isn't already out there, Apple isn't going to change that.

You keep ignoring that for many apps, the multiple cores ARE used - and sometimes used very well. Grand Central is designed to extend that capability further. No one is claiming that it's going to be magic or a 'savior' except blow hard know-nothings making straw man arguments.

What I am saying (and what Apple is saying and what Intel is saying and what most intelligent people are saying) is that as multiple cores become more ubiquitous, it needs to become standard to write apps that take advantage of them rather than limiting it to a small number of CPU-intensive apps.

Most people seem to understand that. Why can't you?
 
Do you think you will code a single-threaded app and Grand Central will "magically distribute it"? Sorry, but no.

I don't think anyone outside of Apple knows for sure.

Compilers are pretty smart - they can unroll loops, for example, so the code that actually executes is somewhat different from what the coder wrote. Could a compiler automatically distribute some things? Possibly. Even more likely is a compiler issuing a warning that a piece of code is inefficient.

Also, it's possible that there COULD be some things that benefit from Grand Central without any changes in application code. For example, take the API that draws objects onto the screen. The app calls the API and the API does its thing. If that API is rewritten to use multiple cores, then the app speeds up even though the app itself didn't change.

That said, I expect that while Apple will rewrite the APIs to give ALL apps some benefit, you are right that the bulk of the benefit will come from coders changing the way they do things. If Grand Central makes it easier to do so, it will accomplish its goals.
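As a toy illustration of that "rewrite the API underneath" idea (the function names and the pthread-based splitting here are my own invention, not anything Apple has announced): if a library routine fans its own work out over several threads, every application that calls it gets the benefit without changing a line of its own code.

```c
#include <pthread.h>
#include <stdio.h>

#define OBJECTS 8000
#define THREADS 4

static double brightness[OBJECTS];       /* stand-in for real drawing state */

struct slice { int start, end; };

/* Worker: draws one slice of the objects. */
static void *draw_slice(void *arg) {
    const struct slice *s = arg;
    for (int i = s->start; i < s->end; i++)
        brightness[i] = i * 0.001;       /* stand-in for real drawing work */
    return NULL;
}

/* The library call applications already make. Internally it now fans the
 * work out across several threads; callers do not change at all. */
void draw_all_objects(void) {
    pthread_t tid[THREADS];
    struct slice slices[THREADS];
    int per = OBJECTS / THREADS;

    for (int t = 0; t < THREADS; t++) {
        slices[t].start = t * per;
        slices[t].end   = (t == THREADS - 1) ? OBJECTS : (t + 1) * per;
        pthread_create(&tid[t], NULL, draw_slice, &slices[t]);
    }
    for (int t = 0; t < THREADS; t++)
        pthread_join(tid[t], NULL);
}

int main(void) {
    draw_all_objects();                  /* the same one-line call as before */
    printf("last object: %f\n", brightness[OBJECTS - 1]);
    return 0;
}
```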
 
You guys are dreaming. Unrolling loops is very different from parallelizing your app; I actually put automatic parallelization in the same class as visual programming (i.e. never delivered).
 
And your qualifications for knowing more about this than all the engineers at Intel are.....?
Didn't say I know more. But I do know. You can't keep adding devices (cores) to a unified memory system and not have problems keeping those cores from stepping all over each other as they access memory. Only a fool can't see that.

So since the average Wal-mart shopper doesn't know what multiple cores are, then no one cares? What an inane argument. Lots of people do need multiple cores.
No, the poster said engineers need multiple cores. Wal-Mart shoppers aren't engineers, hence their use of the computer does not benefit from the multiple cores being debated in this article: hundreds or thousands of them.

And no one says that the ability to run multiple apps is new. But multiple cores allow you to run multiple apps AT FULL SPEED which was not the case before.
Dell & HP both sell it as new. And it was possible to run the majority of apps at full speed even with a single core, because most people barely use their computer. Playing an MP3 while running an email app and a web browser doesn't take much CPU.

You keep ignoring that for many apps, the multiple cores ARE used - and sometimes used very well. Grand Central is designed to extend that capability further. No one is claiming that it's going to be magic or a 'savior' except blow hard know-nothings making straw man arguments.
I guess you can include Apple in the category of know-nothing blow hards.

What I am saying (and what Apple is saying and what Intel is saying and what most intelligent people are saying) is that as multiple cores become more ubiquitous, it needs to become standard to write apps that take advantage of them rather than limiting it to a small number of CPU-intensive apps.

Most people seem to understand that. Why can't you?

Because what you don't understand is that the only beneficiaries of multiple cores are, by definition, "a small number of CPU-intensive apps." Duh.
 
You guys are dreaming. Unrolling loops is very different from parallelizing your app; I actually put automatic parallelization in the same class as visual programming (i.e. never delivered).

That isn't entirely true. Sony/IBM have compilers that can unroll loops for Cell. Of course this is only helpful when writing code for in-order CPUs. If Intel were to actually switch over to in-order CPUs then maybe we will see compilers do a better job of that, but with OOO CPUs it isn't needed. Of course loop unrolling won't help with parallelization, but I am sure most here realize that.

And by loops I mean branches, not while/for statements.
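For readers who haven't run into the term, here is a tiny C illustration of what unrolling actually does (a hand-unrolled sum, purely for illustration): it cuts loop overhead and gives the pipeline independent operations to overlap, but the unrolled loop is still one serial instruction stream on one core, which is why it has nothing to do with spreading work across processors.

```c
#include <stdio.h>

/* Straightforward loop: one add and one branch per iteration. */
double sum_simple(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Unrolled by four: fewer branches and four independent partial sums the
 * pipeline can overlap - but still a single thread on a single core. */
double sum_unrolled(const double *x, int n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    int i = 0;
    for (; i + 3 < n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; i++)                   /* leftover elements */
        s0 += x[i];
    return s0 + s1 + s2 + s3;
}

int main(void) {
    double data[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int n = (int)(sizeof data / sizeof data[0]);
    printf("%f %f\n", sum_simple(data, n), sum_unrolled(data, n));
    return 0;
}
```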
 
Do you think you will code a single-threaded app and Grand Central will "magically distribute it"? Sorry, but no.

What Grand Central lets you do: split up your code into small tasks, with possible dependencies between tasks, and when other applications do the same thing, Grand Central will "magically" distribute all those tasks between the various cores and virtual processors, better than an individual application could do on its own, because a single application doesn't have an overview of the whole system.

You have to do the "splitting" for your application yourself, but Grand Central can then optimise the throughput for the whole system.
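Nobody outside Apple knows the actual interface yet, so treat the following as a sketch of the model just described rather than the real thing. It assumes a C interface with a shared, system-wide concurrent queue and a parallel-for style call; the dispatch_get_global_queue and dispatch_apply names are assumptions for illustration, not a confirmed Grand Central API.

```c
/* Hypothetical task-queue sketch: the application only declares independent
 * chunks of work; the system decides how to spread them over the cores. */
#include <dispatch/dispatch.h>   /* assumed task-queue header */
#include <stdio.h>

#define CHUNKS 64
#define ITEMS_PER_CHUNK 10000

static double results[CHUNKS];

int main(void) {
    /* A shared, system-wide concurrent queue: the system, not the app,
     * decides how many cores the chunks actually run on. */
    dispatch_queue_t q =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    /* The app's job is only to split the work into independent chunks;
     * this call returns after every chunk has run. */
    dispatch_apply(CHUNKS, q, ^(size_t chunk) {
        double sum = 0.0;
        for (int i = 0; i < ITEMS_PER_CHUNK; i++)
            sum += (double)(chunk * ITEMS_PER_CHUNK + i);
        results[chunk] = sum;    /* each chunk writes only its own slot */
    });

    double total = 0.0;
    for (int i = 0; i < CHUNKS; i++)     /* small serial tail */
        total += results[i];
    printf("total = %f\n", total);
    return 0;
}
```

The division of labour is the whole point of the sketch: the application declares independent tasks, and the system-wide scheduler decides how to map them onto however many cores exist.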
 
In other words, they are going to junk the 20+ year old Mach scheduler and write a modern one.
 
If Intel were to actually switch over to in-order CPUs then maybe we will see compilers do a better job of that, but with OOO CPUs it isn't needed. Of course loop unrolling won't help with parallelization, but I am sure most here realize that.

Actually, I think there is a good chance that Intel will take the OOO logic off their CPUs as the core count rises. OOO makes sense when you've only got one pipeline (figuratively) and you've got to make the most of it. It doesn't make sense when you've got 100. All that OOO logic just becomes wasted transistors; you could have 128 simpler cores instead.

 
Intel's future chips aren't going to make current apps faster, but will make future apps crumble on current chips.

Classic Business 101 tactic: "If there isn't a market for your product... make one!"
 