Let me say this again:
WHAT do you need multiple processors for? There are many reasons, with many solutions. The problem with Mac people (regardless of how much I love you all) is that they stop at the box in front of them. There is this concept that the boys from Bell and MIT were messing around with a while back; it's called networking. Have a look at the real MP systems like the SGI Origin series servers and you will notice that they don't waste millions of bucks designing quad (or higher) processor system boards, because they solved the problem of bandwidth in the interconnecting fabric between the CPU cards. And yes, the CPUs can snoop each other's caches, so stop rabbiting on about L3 cache and start looking at branch prediction cache instead.
Now I will admit I need to do some homework on the new Opteron and how it functions, but I have to remind all and sundry that the POWER4 and its descendant the G5 are superscalar RISC CPUs and not instruction-heavy like the Intel Plentyms. In other words, there is literally less requirement for cache size, which allows for more steps in the pipe. What I don't know is whether the branch prediction can re-order the cache mid-flight or execute instructions out of order like a MIPS R12K... I'll get back to you on that. The point is, if CISC CPU (x) needs 120 cycles to figure out one operation and RISC CPU (y) needs only 5 or 6 as it reorders its cache, then you do the math on which CPU wins the little fluffy toy.
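For anyone who actually wants to do the math, here is a rough sketch in Python. It assumes both chips run at the same clock speed, and the cycle counts are the made-up ones from the example above, not measurements of any real CPU. The function name `speedup` is just something picked for illustration.

```python
# Hypothetical cycle counts from the example above; real CPUs vary widely.
def speedup(cisc_cycles, risc_cycles):
    """How many times sooner the RISC chip finishes one operation,
    assuming both chips run at the same clock speed."""
    return cisc_cycles / risc_cycles

print(speedup(120, 5))  # 24.0
print(speedup(120, 6))  # 20.0
```

So under those assumptions the gap is somewhere around 20x to 24x per operation. Change the clock speeds or the cycle counts and the answer moves with them.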
Now let's put this all back together. If you have a network of computers and you are running a network operating system with a network file system (let's say, um, let me think, BSD) and you run it all on lots of small, fast RISC CPUs, then you have one large computer that has as many processors as you would like. THE ISSUE IS APPLICATION! Or, as I said before, WHY.
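To make the "why" a bit more concrete, here is a minimal sketch in Python of a job that splits cleanly across machines. Everything in it is an assumption for illustration: the threads are only stand-ins for networked boxes, and `node_work` and `run_on_cluster` are hypothetical names, not part of any real cluster software.

```python
from concurrent.futures import ThreadPoolExecutor

def node_work(chunk):
    # Stand-in for whatever one machine would compute on its share of the data.
    return sum(x * x for x in chunk)

def run_on_cluster(data, nodes):
    # Deal the data out round-robin, one slice per (hypothetical) node.
    chunks = [data[i::nodes] for i in range(nodes)]
    # Threads play the part of the networked machines in this sketch.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(node_work, chunks))
    # Combine the partial answers into one result.
    return sum(partials)

total = run_on_cluster(list(range(1000)), nodes=4)
```

This only pays off because each slice can be worked on without talking to the others. An application whose pieces constantly need each other's results would spend its time on the network instead, which is why the application decides the answer.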