I have been thinking the same thing for a while now. Constraints on the hardware side will force the people writing software to be more inventive and efficient.

Having (virtually) unlimited resources in the desktop realm doesn't exactly give you a lot of incentive to reinvent the wheel every time you pump out a new update. Things tend to bloat instead of being rebuilt efficiently. Is there any reason why I should have to download 500MB updates to MS Office every time I open up a Word document on my iMac? Hardly seems efficient to me, and clearly not a viable option on iOS.

100% agreed. The dollar is the only metric that matters (http://bit.ly/j2N05x), and since the CPU guys were willing to pay for crappy code, programmers had no incentive to fix it. Thanks to Apple, that is changing a bit now.
 
I really don't understand what you mean. Are you saying workloads will become less intensive, or that processors will become faster and more energy-efficient? Or are you saying software will become multi-threaded, allowing it to leverage multiple energy-efficient cores for performance, making it both fast and less power-hungry?
It's LTD - he just spouts off stuff.

The whole idea of a "processor-intensive" application is that it typically loads up most or all cores to nearly full load (or keeps them at full load). As processor technology continues to advance, yes, some processor-intensive applications become less so due to the inherent architectural and/or speed enhancements made to the processor. However, as history has shown, when more-powerful systems become available, many developers strive to take advantage of them, so we'll always have "processor-intensive" applications in some form.

For the average Mac user? LTD can make a case, albeit weak. For quite a few professional Mac users (and many professional non-Mac users), those comments are just drivel.
 
I've read about ARM since its first use in the Newton, and in my understanding, ARM is a pure RISC design: a very small core built with efficiency in mind. They don't have the branch prediction and deep execution pipelines of x86 processors, which limits their effective power in a desktop environment. It's like comparing a regular 3L V6 engine with a 1.6L turbo V4 running at 11,000 RPM: both can achieve about the same HP, but the V6 can be pushed further by burning more fuel, while the V4 has better fuel efficiency at low speed. And while ARM is already pushed to its limit, multiplying cores and extending the base design can obliterate those limits in the near future.

The interesting part comes from Intel, which says ARM mobile CPUs are currently improving at twice the rate of Moore's Law.

Great analogy with cars. Love it.

A major clarification here: ARM is just an ISA. You can implement it with any microarchitecture. Branch prediction and deep pipelines are all part of the microarchitecture. The 486, for example, was x86 ISA but did not have branch prediction or a deep pipeline. The ARM Cortex-A15 (http://www.arm.com/products/processors/cortex-a/cortex-a15.php) does have a deep pipeline and branch prediction. To use your analogy: the difference between x86 and ARM is what's on the dash, and what's under the hood is independent of that.

The other thing I want to point out is that ARM is not RISC. RISC was about a simple instruction set; Alpha or SPARC, for example, were RISC. ARM is very bloated; they even have indexed register addressing modes that even x86 doesn't support, so that's a common misconception (I need to blog about this ASAP...).

ARM is improving twice as fast because there is more room for improvement in terms of performance, but increasing performance will most certainly reduce their power efficiency. I guess we will see where this ends up going. I have a feeling Intel and ARM will meet in the middle somewhere. It's good for us, though; prices will come down due to competition. :)

Most won't admit it, but Apple has shaped the product road-maps of most of the computer companies that are still relevant today. That includes Google, Sony, HTC, Samsung, Motorola, etc...

Could not agree with you more. I'm not a fanboy, but I have to admit that they have revitalized innovation in our industry.

It's LTD - he just spouts off stuff.

The whole idea of a "processor-intensive" application is that it typically loads up most or all cores to nearly full load (or keeps them at full load). As processor technology continues to advance, yes, some processor-intensive applications become less so due to the inherent architectural and/or speed enhancements made to the processor. However, as history has shown, when more-powerful systems become available, many developers strive to take advantage of them, so we'll always have "processor-intensive" applications in some form.

For the average Mac user? LTD can make a case, albeit weak. For quite a few professional Mac users (and many professional non-Mac users), those comments are just drivel.

Agree with you once again. Give them more processing power, and programs figure out how to use it (sometimes to improve developer productivity by writing in Java, other times to provide more eye candy, and occasionally to add real value).
 
On what basis do you say they are 10x more efficient? They are slower and hence burn less power and energy. There is nothing inherently inefficient about x86; the ARM ISA is equally bloated.

Intel's fabrication edge is not easy to duplicate either. ARM can't use the same tricks in the same time frame. Intel has dedicated fabs tailored for its chips, while ARM chips are built at shared foundries like TSMC. TSMC stays two generations behind Intel in fab technology (and for good, fundamental economic reasons).

I'll grant you that; my presumption was not entirely right. Strictly talking about TDP, ARM is about 10x lower than the best Atom out there (less than 1 watt vs. 10-15 watts for Atom), and in terms of performance per watt, x86 was never a great contender, even as it killed off every other desktop alternative like PPC, DEC Alpha, MIPS, and SPARC.

I agree with you, Intel is one or two generations ahead, but even with the tri-gate breakthrough, Intel has been unable to bring x86 processors down to ARM TDP levels. At that game, ARM will still be the champ for a long time.

Great analogy with cars. Love it.

A major clarification here: ARM is just an ISA. You can implement it with any microarchitecture. Branch prediction and deep pipelines are all part of the microarchitecture. The 486, for example, was x86 ISA but did not have branch prediction or a deep pipeline. The ARM Cortex-A15 (http://www.arm.com/products/processors/cortex-a/cortex-a15.php) does have a deep pipeline and branch prediction. To use your analogy: the difference between x86 and ARM is what's on the dash, and what's under the hood is independent of that.

The other thing I want to point out is that ARM is not RISC. RISC was about a simple instruction set; Alpha or SPARC, for example, were RISC. ARM is very bloated; they even have indexed register addressing modes that even x86 doesn't support, so that's a common misconception (I need to blog about this ASAP...).

ARM is improving twice as fast because there is more room for improvement in terms of performance, but increasing performance will most certainly reduce their power efficiency. I guess we will see where this ends up going. I have a feeling Intel and ARM will meet in the middle somewhere. It's good for us, though; prices will come down due to competition. :)

The ARM ISA is a RISC design just like PPC, SPARC or Alpha; the ARM acronym stands for Acorn RISC Machine (later Advanced RISC Machines). Over time, with Thumb, NEON, Jazelle, and VFP, it has become bloated like you said, but the core is a straight, simple RISC design based on the Berkeley RISC project.

ARM on wikipedia
 
As long as we can have fewer watts and still have increased processing power, it's a good thing, I suppose. But honestly, my battery life is sufficient enough that I don't want to lose processing power just to decrease the wattage. I'm not suggesting that they implied that, just sharing my $.02.
 
I've read about ARM since its first use in the Newton, and in my understanding, ARM is a pure RISC design: a very small core built with efficiency in mind. They don't have the branch prediction and deep execution pipelines of x86 processors, which limits their effective power in a desktop environment. It's like comparing a regular 3L V6 engine with a 1.6L turbo V4 running at 11,000 RPM: both can achieve about the same HP, but the V6 can be pushed further by burning more fuel, while the V4 has better fuel efficiency at low speed. And while ARM is already pushed to its limit, multiplying cores and extending the base design can obliterate those limits in the near future.

The interesting part comes from Intel, which says ARM mobile CPUs are currently improving at twice the rate of Moore's Law.

I can see why ARM would be improving at twice the Moore's Law rate for a little while. My guess is that, because it has only recently been seriously developed and pushed, it is more or less playing catch-up, using tricks and technology learned from the other CPU lines over the years. I am willing to bet it will slow down and drop back to the Moore's Law rate after a while.

I really don't understand what you mean. Are you saying workloads will become less intensive, or that processors will become faster and more energy-efficient? Or are you saying software will become multi-threaded, allowing it to leverage multiple energy-efficient cores for performance, making it both fast and less power-hungry?

He is just repeating Apple catchphrases and his Church of Apple worship.

I will tell you, multithreaded/multicore coding is hell and a huge pain in the ass to get working correctly, because so many more things can go wrong, plus you have to make sure the threads are not trying to write or change the same set of data at the same time. Single-threaded code is so much easier to write and design for than multithreaded code.
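
To make the "writing the same data at the same time" problem concrete, here is a minimal pthreads sketch in C (the names and counts are just illustrative, not from anyone's actual code): without the mutex, the two threads race on the shared counter and increments get lost.

[CODE]
#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* without this, increments get lost */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Prints 200000 with the lock; usually less without it. */
    printf("counter = %ld\n", counter);
    return 0;
}
[/CODE]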
 
I can see why ARM would be improving at twice the Moore's Law rate for a little while. My guess is that, because it has only recently been seriously developed and pushed, it is more or less playing catch-up, using tricks and technology learned from the other CPU lines over the years. I am willing to bet it will slow down and drop back to the Moore's Law rate after a while.



He is just repeating Apple catchphrases and his Church of Apple worship.

I will tell you, multithreaded/multicore coding is hell and a huge pain in the ass to get working correctly, because so many more things can go wrong, plus you have to make sure the threads are not trying to write or change the same set of data at the same time. Single-threaded code is so much easier to write and design for than multithreaded code.

LOL @ worshipping Apple. I hear you about multi-threading; actually, that's what I studied in graduate school. I always say this: multi-threading is about taking the hardware's job of finding ILP and assigning it to the programmer in order to save power. No gain without pain, hence the toughness. I wrote a small article about it recently which may interest you (http://bit.ly/lkIair).
 
ARM ISA is a RISC design just like the PPC, Sparc or Alpha, the ARM acronyme stand for Acorn Risc Machine or Advance Risc Machine. Over time with Thumb, Neon, Jazelle, VFP it become bloated like you said, but the core is a straight simple RISC processor based on Berkley RISC projet.

ARM on wikipedia


I understand where you are coming from. Actually, I dislike this whole RISC vs. CISC debate because the boundaries are hazy and there is no substance to it as such (this is coming from someone who architects processors for a living). You are right about ARM starting as RISC, but even the Wikipedia article points out that features have been added to the ISA since then "to compensate for the simpler design, compared with contemporary processors like the Intel 80286 and Motorola 68020." These ISAs start as "RISC" and end up in the same place. Sun SPARC, when it started, did not even have a multiply instruction; you had to write a loop to perform a multiply. Then so many of those things got added over time. Hence my point about the debate being just silly.
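
For what it's worth, that "loop to perform multiply" looks roughly like this shift-and-add sketch in C (an illustration of the idea only, not actual early-SPARC compiler output):

[CODE]
/* Multiply two unsigned numbers without a hardware multiply
   instruction: add one shifted partial product per set bit of b. */
unsigned long soft_mul(unsigned long a, unsigned long b)
{
    unsigned long result = 0;
    while (b != 0) {
        if (b & 1)          /* low bit set: add this partial product */
            result += a;
        a <<= 1;            /* shift to the next power of two */
        b >>= 1;
    }
    return result;
}
[/CODE]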
 
Sun SPARC, when it started, did not even have a multiply instruction; you had to write a loop to perform a multiply. Then so many of those things got added over time. Hence my point about the debate being just silly.

In the end, RISC wasn't about reducing the *number* of instructions - it was about reducing the *complexity* of the ISA. A simple ISA can be decoded very quickly, and makes multiple-issue and other optimizations easier to implement. (It was also very significant that the transistor count exploded during the timeframe of the RISC vs CISC debate. The Intel Pentium processors debuted with a transistor count of 3.1 million - a Core i7-970 has over a billion transistors.)

Yes, I agree that the argument is silly - especially since Intel figured out with the P6 (5.5 million transistors) how to translate x86 instructions into RISC-like micro-ops internally.
 
LOL @ worshipping Apple. I hear you about multi-threading; actually, that's what I studied in graduate school. I always say this: multi-threading is about taking the hardware's job of finding ILP and assigning it to the programmer in order to save power. No gain without pain, hence the toughness. I wrote a small article about it recently which may interest you (http://bit.ly/lkIair).

I am still in school learning about all of it, but I know that when I had to multithread, it was a pain in the ass just to get it working.

I do believe multicore was going to have to happen, but a lot of work needs to be done to get the CPU to do more of the work instead of us programmers having to do it; otherwise programs are limited to the maximum speed of a single thread, which is roughly the speed of a single core.
 
I am still in school learning about all of it, but I know that when I had to multithread, it was a pain in the ass just to get it working.

I do believe multicore was going to have to happen, but a lot of work needs to be done to get the CPU to do more of the work instead of us programmers having to do it; otherwise programs are limited to the maximum speed of a single thread, which is roughly the speed of a single core.

Nice. I am very curious: what library/framework are you using for multi-core? Is it pthreads? If so, I know that the (void *) thread-function signatures are a pain. OpenMP is restrictive, but cleaner.
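
For comparison, here is a minimal OpenMP sketch in C of what that "restrictive, but cleaner" trade-off looks like (the summing loop is just an illustrative example): no void* thread functions and no manual create/join, but you are limited to the patterns the pragmas support.

[CODE]
#include <omp.h>
#include <stdio.h>

int main(void)
{
    long sum = 0;
    /* The compiler splits the loop across cores and combines
       the per-thread partial sums at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 1000000; i++)
        sum += i;

    printf("sum = %ld\n", sum);
    return 0;
}
[/CODE]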
 
An $11 difference at retail; for Apple it might be zero difference.

Yup. However, I disagree that $11 is nothing for Apple. $11 in isolation is nothing for them, but it's a big deal unless they pass it on to their customers. It's in the volume: if Apple's cost goes up by $11 and they sell 10 million units, they lose $110 million in total. Cost is pretty significant. It is interesting to note that they did not put a front camera on all devices even though it only costs them $1 (http://bit.ly/kkuwOG).

Edit: it just occurred to me that you meant Intel will give Apple a discount. If that's the argument, I have no data to disagree with it, so I would concede. :)
 
I understand where you are coming from. Actually, I dislike this whole RISC vs. CISC debate because the boundaries are hazy and there is no substance to it as such (this is coming from someone who architects processors for a living). You are right about ARM starting as RISC, but even the Wikipedia article points out that features have been added to the ISA since then "to compensate for the simpler design, compared with contemporary processors like the Intel 80286 and Motorola 68020." These ISAs start as "RISC" and end up in the same place. Sun SPARC, when it started, did not even have a multiply instruction; you had to write a loop to perform a multiply. Then so many of those things got added over time. Hence my point about the debate being just silly.

I should've moderated myself more about how ARM is "better" than the others. I totally agree with you on the RISC vs. CISC debate. The truth is that I never much liked Intel processors over the years, but I've got to admit x86 may be the best general-purpose processor. I still think it's silly for Intel to try to put x86 into a phone when other designs do the job just fine.

BTW, I've read your discussion about multithreading and multicore with Rodimus Prime. I'm not a programmer, I'm more of an OS guy (got the ACSA cert), but I would like to know what you think about Apple's way of solving those problems with C blocks (^) and Grand Central Dispatch.
 
It's LTD - he just spouts off stuff.

The whole idea of a "processor-intensive" application is that it typically loads up most or all cores to nearly full load (or keeps them at full load). As processor technology continues to advance, yes, some processor-intensive applications become less so due to the inherent architectural and/or speed enhancements made to the processor. However, as history has shown, when more-powerful systems become available, many developers strive to take advantage of them, so we'll always have "processor-intensive" applications in some form.

For the average Mac user? LTD can make a case, albeit weak. For quite a few professional Mac users (and many professional non-Mac users), those comments are just drivel.

Well, true, but aren't these same 'processor-intensive processes'* the very ones that have the value to pour into finding ways to use more cores, as well as different kinds of cores like GPU processing?

It seems to me that most of these processes are used in team situations, not just by one-off users. In which case it becomes a trade-off between having redundant capacity in each workstation plus a server, or finding a high-enough-bandwidth way to connect to a central cluster that lets each process use as much of the capacity as the office has.

Seeing that bandwidth keeps getting wider and wider, Intel could see a switch in the next few years to lighter clients and heavy central cores. In which case the best core product for them would be one of these 15W CPUs that could handle the user's personal demands (email, communication, interface) but have lots of bandwidth so that it can be the data coordinator for the user's actions within a team/cluster environment.

It would seem that at 15W Intel has more room for hanging lots of bandwidth to memory, other processors, displays, storage, and ports than a 1W ARM can match.

*It's not like the whole app is intensive; the interface is going to spend its time waiting for the user to react and waiting for the intensive process to get done dealing with that.
 
I should've moderated myself more about how ARM is "better" than the others. I totally agree with you on the RISC vs. CISC debate. The truth is that I never much liked Intel processors over the years, but I've got to admit x86 may be the best general-purpose processor. I still think it's silly for Intel to try to put x86 into a phone when other designs do the job just fine.

I agree with you that it's hardly a technical debate about x86 or ARM being better. It's really not a religious one either; it's purely economic. x86 phones make less economic sense for everyone except Intel. However, if Intel can manage to get x86 into phones, their cash cow lives. They try x86, not ARM, because they know how to do x86. Trying ARM would cost resources to learn ARM and do a brand-new design -- a very expensive task that has led to the death of many companies. I guess I keep going back to the same "it's all about the dollar" argument (http://bit.ly/laQ6Y8). When you have a hammer, everything looks like a nail. :)

BTW, I've read your discussion about multithreading and multicore with Rodimus Prime. I'm not a programmer, I'm more of an OS guy (got the ACSA cert), but I would like to know what you think about Apple's way of solving those problems with C blocks (^) and Grand Central Dispatch.

I have used GCD, but not very much. I generally like the GCD approach, but I also don't think it's the holy grail (I explain below).

A few things in GCD are similar to previous frameworks like Intel TBB and Cilk, but there are some nice touches as well. The best is the ability to use the same core for tasks from different processes without paying the full context-switch overhead. This allows very fine-grained resource management, which can be a huge win in servers and consumer PCs. Another thing I like about it is the clear distinction between SERIAL and CONCURRENT queues. Serial queues service the requests that would otherwise sit inside mutexes, etc. A serial queue always runs the critical code on a single core, which has some cache-locality benefit. Enforcing serialization this way also eliminates some expensive lock acquire and release operations.
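
As a minimal sketch of that serial/concurrent distinction in plain C with blocks (the queue label and loop count are just illustrative), work is fanned out on a concurrent queue while all updates to the shared counter are funneled through a serial queue instead of a lock:

[CODE]
#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void)
{
    /* Concurrent queue: blocks may run in parallel on multiple cores. */
    dispatch_queue_t work = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    /* Serial queue: blocks run one at a time, in FIFO order. */
    dispatch_queue_t serial = dispatch_queue_create("com.example.counter", NULL);
    dispatch_group_t group = dispatch_group_create();

    __block long counter = 0;
    for (int i = 0; i < 8; i++) {
        dispatch_group_async(group, work, ^{
            /* ...parallel work here... then serialize the shared update. */
            dispatch_async(serial, ^{ counter++; });
        });
    }

    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    /* Drain the serial queue: all eight increments run before this block. */
    dispatch_sync(serial, ^{ printf("counter = %ld\n", counter); });
    dispatch_release(group);
    dispatch_release(serial);
    return 0;
}
[/CODE]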

I don't like to call it a complete solution because it doesn't solve THE hard problem in multi-threading: the act of finding parallel work in a program. I often phrase it as follows: It provides a nice mechanism for specifying parallelism, but does not help with identifying it.

Sorry for the long, incoherent rambling. I hope it makes some sense. I will try to write a blog post about this ASAP and post a link. I can PM you, or you can follow me on Twitter/RSS, etc. Thanks.
 
Well, true, but aren't these same 'processor-intensive processes'* the very ones that have the value to pour into finding ways to use more cores, as well as different kinds of cores like GPU processing?

I don't agree with that. Tasks that don't go away with multi-core do not get solved by a GPU either, because GPUs exploit data-level parallelism (http://bit.ly/m5iv0Q) while multi-core can extract more types of parallelism; so if a task stays intensive in the presence of multi-core, it won't go away with a GPU. An example of such a task is Firefox: each individual tab still runs as a single thread, so intensive web apps remain CPU-heavy (my Firefox analysis is from 2009, please correct me if this has changed).
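
A tiny illustration of the point, in C (hypothetical functions, just to show the shape of the two cases): the first loop is data-parallel and maps to multi-core or a GPU, while the second has a loop-carried dependence and benefits from neither.

[CODE]
/* Independent iterations: this maps well to multi-core or a GPU. */
void scale(float *out, const float *in, float k, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = k * in[i];          /* each iteration is independent */
}

/* Loop-carried dependence: each step needs the previous result,
   so neither extra cores nor a GPU make it faster. */
float iterate(float x, int n)
{
    for (int i = 0; i < n; i++)
        x = x * x + 0.25f;           /* depends on the previous x */
    return x;
}
[/CODE]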

It seems to me that most of these processes are used in team situations, not just by one-off users. In which case it becomes a trade-off between having redundant capacity in each workstation plus a server, or finding a high-enough-bandwidth way to connect to a central cluster that lets each process use as much of the capacity as the office has.

Seeing that bandwidth keeps getting wider and wider, Intel could see a switch in the next few years to lighter clients and heavy central cores. In which case the best core product for them would be one of these 15W CPUs that could handle the user's personal demands (email, communication, interface) but have lots of bandwidth so that it can be the data coordinator for the user's actions within a team/cluster environment.

Actually, you do bring up a great point here. Yes, IF everything runs on the server, we will have abundant parallelism and won't miss big cores as much. Unfortunately, it doesn't fit many usage models due to network latency (network bandwidth is improving, as you mention, but latency is not). It prohibits us from doing interactive applications in a server-client fashion (e.g., you can't play an interactive game if every keystroke takes 300ms to get a response). I'm not saying your point is invalid, just that what you propose will not be practical in many situations.

It would seem that at 15W Intel has more room for hanging lots of bandwidth to memory, other processors, displays, storage, and ports than a 1W ARM can match.

Disagree with that. Whatever Intel does, ARM could do as well if their customers are willing to pay for it. Bandwidth is a strict function of how much money you charge for a chip. The Sun Niagara-1 had insanely high bandwidth for its time because Sun charged 2x the price of the highest-end Intel part. So if ARM competes in Intel's space, they will raise the price and up the bandwidth.

*It's not like the whole app is intensive; the interface is going to spend its time waiting for the user to react and waiting for the intensive process to get done dealing with that.

This is incorrect. Whether the whole app is intensive is irrelevant. What matters to a user is how much time the computer takes to respond when he/she clicks something. Thus, the fact that you have to wait for the user doesn't reduce the need for CPU power.
 
I don't want to derail the discussion, but I was curious:

With this announcement from Intel about the direction they want to go, the tight relationship Apple and Intel clearly have, and the not-uncommon occurrence of Apple getting chips that Intel hasn't officially released yet, would it be entirely unrealistic to think that the MBAs rumored for June/July may see a processor better than the i7-2657 (in the 11")? Perhaps a 17W 1.66 or 1.83 GHz part (stretching it), or something that just runs its HD 3000 a little faster than 350/1000.

Obviously we're not going to see the 10-15W range they're talking about show up in Sandy Bridge but perhaps they've got something up their sleeve to show they're serious. It's probably a lot to ask for but it seems like a new MBA would be the ideal place for Intel to say, "See, we're already upping the speed in the ULV range."

I haven't closely compared the chips Apple has gotten early with what was available to other OEMs at the time so I'm not entirely sure how big the gap has been in the past. I expect it's minor.

Edit: I suppose if it happened it would be something like a 1.7 given the pattern of the i3/i5/i7 clock speeds.
 
"than", not "then"

"will now target a much lower power draw then present chips"

This happened, THEN that happened.

He is bigger THAN me.

... lower power draw THAN present chips
 
So what's the bet that Apple exploits this by making thinner devices with thinner batteries and the same battery life, rather than keeping the same devices/batteries to get the "all day" power that these chips would now allow?

I've been desperately hoping for Apple to stop making stuff thinner, for just one generation, and reset the bar on battery life a lot higher than it has been. Instead of slowly getting thinner and slowly gaining battery life (or not moving at all on battery life), just give us double the battery life when the energy densities and chip efficiencies improve; then, after that, get thinner and maintain that new, much higher battery life.

It'll never happen. Apple loves their "it's x% thinner" charts. I say it's thin enough, and I want an all-day device, but that doesn't seem to sell as well, or everyone would be packing more battery life into their devices and we'd be living in a fully mobile tech utopia.
 
I am still in school learning about all of it, but I know that when I had to multithread, it was a pain in the ass just to get it working.

I do believe multicore was going to have to happen, but a lot of work needs to be done to get the CPU to do more of the work instead of us programmers having to do it; otherwise programs are limited to the maximum speed of a single thread, which is roughly the speed of a single core.

That's why Apple invented Grand Central Dispatch. Synchronizing threads is probably best left in the hands of the system software, given the complexity involved, like you said, so I think GCD is probably one of the best parts of Snow Leopard.
 