Huh? So now you are claiming fewer pipeline stages is better performance?

Yes, of course it gives you better performance. It can just be harder to hit as high of clock speeds at the same process.

Out-of-order machines need much beefier branch predictors because the branch miss penalty is so much higher, however. In-order machines typically get away with branch predictors not much more complicated than:

How is the penalty higher?
 
The more information that comes about the iPad, the more it makes me not want it. Apple should have offered pre-orders the first day.
 
Jon's articles are usually pretty good, but this one is revolting and makes me wonder if someone else has possessed his brain for a bit. He usually doesn't mess up technical issues this much.

What's wrong with it?

The number one issue is his implication that stripping off I/O will somehow make the processor faster.

How does he imply that?

I thought it was a good article, and it's interesting in his view that Apple may still have a custom chip coming, this just isn't it.
 
Yes, of course it gives you better performance. It can just be harder to hit as high of clock speeds at the same process.



How is the penalty higher?

The penalty is higher because you have to unwind more speculative state.

And no, having fewer pipe stages most certainly does not, all else equal, give you better performance. Now you're just trolling.

Jon's articles are usually pretty good, but this one is revolting and makes me wonder if someone else has possessed his brain for a bit. He usually doesn't mess up technical issues this much.

The number one issue is his implication that stripping off I/O will somehow make the processor faster. Removing unused I/O will have zero impact on operational performance, as the CPU would have nothing to do with the unused peripherals.

Removing unused I/O allows more power budget for the rest of the chip, and, more importantly, eliminates long paths that set the clock rate (it is quite common for I/O paths to be the timing critical paths in CPU designs). It also allows other functional blocks to be re-located in closer proximity, improving timing on those other paths.
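As a rough illustration of the timing point above: in a synchronous design the clock period has to cover the slowest register-to-register path, so removing the one slowest path raises the achievable clock. The sketch below assumes made-up path names and delays (`alu`, `cache_tag`, `io_pad` are hypothetical, not taken from any real chip) and a hypothetical helper `max_clock_hz`.

```python
def max_clock_hz(path_delays_ns):
    """Highest clock frequency at which the slowest path still fits in one cycle."""
    slowest_ns = max(path_delays_ns.values())
    return 1e9 / slowest_ns


# Hypothetical path delays in nanoseconds; the I/O path is assumed to be
# the timing-critical one, as the post says is common.
paths = {"alu": 0.80, "cache_tag": 0.90, "io_pad": 1.25}

with_io = max_clock_hz(paths)

# Same design with the unused I/O path stripped out.
without_io = max_clock_hz({name: d for name, d in paths.items() if name != "io_pad"})

print(f"with I/O path:    {with_io / 1e6:.0f} MHz")
print(f"without I/O path: {without_io / 1e6:.0f} MHz")
```

With these assumed numbers the limit moves from the 1.25 ns I/O path to the 0.90 ns cache path. Real gains depend entirely on which paths are actually critical in a given design.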
 
The penalty is higher because you have to unwind more speculative state.

Huh? It's just dropped - that's no more penalty than on an in-order CPU.

And no, having fewer pipe stages most certainly does not, all else equal, give you better performance. Now you're just trolling.

I'm "trolling" because I'm stating a well known fact?
 
So, cmaier, what is your take on this news in general?

It's not much of a surprise - I'd been saying on here for awhile that there was no way PA Semi finished the thing it is working on so quickly. Personally, despite having designed microprocessors for years (mostly at AMD), I'm not big on a bunch of checklist features, mostly because I understand how every design decision is a tradeoff. So, while in practice an A9-based design might be slightly faster, I think the A8-based design will be plenty fast enough for most anything I'm likely to use an iPad for.

Pretty much I think it's like vegetables hidden in your meatloaf. Most of the whiners, if handed an iPad and not told what processor was in it, would assume that since it's so fast it must be an A9. It's only when they are told what the chip is that suddenly it's not good enough.

Huh? It's just dropped - that's no more penalty than on an in-order CPU.



I'm "trolling" because I'm stating a well known fact?

You're trolling because a few pages back, when you thought the A9 had more pipeline stages, you were of the opinion that MORE was better. Now that you know the A8 has more stages, you believe LESS is better.

As to your "it's just dropped," think about how much is "just dropped"! You speculatively perform a branch (incorrectly) and out-of-order perform additional instructions on top of the branch. You go much further down the wrong branches of the tree of possibilities, sometimes even multiple branches, before you find out you guessed wrong. Thus the penalty for guessing is higher - you've done much more speculative work for nought.

In an in-order machine, the branch penalty is only that you've done the wrong icache access, so you have to pay for another icache access. In an out-of-order machine, you've gone and executed a bunch of instructions that you shouldn't have, perhaps storing speculative state in the register file, and you need to unwind all that. Additionally, as a percentage of IPC (and from a power consumption perspective) you've done a lot more work that you didn't need to.
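To put rough numbers on the argument above, here is a minimal sketch using the textbook approximation: effective CPI = base CPI + (fraction of instructions that are branches) x (mispredict rate) x (penalty in cycles). Every value below is an invented illustration, not a measurement of an A8, an A9, or any real core, and `effective_cpi` is a hypothetical helper name. The sketch simply takes the claim in the post (a larger per-miss penalty on the out-of-order core) as its assumption.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction, including cycles lost to mispredicted branches."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical in-order core: simple predictor, cheap misprediction.
in_order = effective_cpi(base_cpi=1.0, branch_fraction=0.2,
                         mispredict_rate=0.10, penalty_cycles=3)

# Hypothetical out-of-order core with the same simple predictor:
# lower base CPI, but each miss is assumed to cost far more cycles.
ooo_simple = effective_cpi(base_cpi=0.5, branch_fraction=0.2,
                           mispredict_rate=0.10, penalty_cycles=15)

# Same out-of-order core with a more accurate predictor.
ooo_better = effective_cpi(base_cpi=0.5, branch_fraction=0.2,
                           mispredict_rate=0.02, penalty_cycles=15)

for name, cpi in [("in-order, simple predictor", in_order),
                  ("out-of-order, simple predictor", ooo_simple),
                  ("out-of-order, better predictor", ooo_better)]:
    print(f"{name}: CPI {cpi:.2f}")
```

Under these assumptions the out-of-order core is still faster overall, but with the simple predictor it loses a much larger share of its cycles to mispredictions, which is the case for spending area on a better predictor.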
 
Of course Apple's videos of the device were all FAKED!
They showed FLASH support! HAHA!

There was no Flash support shown in the Keynote video. Did you even watch it?
This device is supposed to replace or be better than a PC NETBOOK, yet costs more, and yet isn't even compatible with FLASH websites & basically every website you go to uses FLASH unless you live in Steve Jobs "Reality Distortion Field" world LOL

Maybe every website YOU visit requires Flash, that is not true for me. Around 1 in 10 I visit need Flash. Apple is being brave in taking on Adobe regarding Flash, face it, it's a bad technology and it needs someone with the guts of Steve Jobs to eradicate it.


On the plus side is the iTUNES/iPhone App store, which seems to lately be Apple's standard crutch for success. Only iTUNES & the App Store could fuel such a crappy device to be hugely popular.

Maybe I'll be proved wrong, but I honestly think this device will not fly.
Everyone, even me, seems to want one, but keeps finding out more and more things the device cannot do!

The Reality Distortion Field only lasts so long except for the fanboys here.

I'm thinking more like waiting for Revision B or C for the iPAD.
And I bought an iPhone on Day 1 for the record.

Like it or not, you're caught in the 'Reality distortion field' that you are criticizing. You are either a hypocrite or a flamer. Which is it?
 
Looked plenty fast enough in the videos, not like I'm doing Handbrake rips on it.


no you wont be doing Handbrake unless Apple lets you.

that should be the slogan for the iPad " iPad - Not Unless Apple Wants You To" :D

or maybe

"iPad - Think The Same"
 
That would be a network issue.

I noticed lag I see even on the iPhone 3Gs when viewing some larger PDF documents.
Watch when Steve opens the email attachment which holds a map of the valley, when it first loads you can see the video fill certain parts slower than others, as well as zoom in with the out of focus text, until the iPad completes the zoom to clear and sharpen the display. :eek:
The map wasn't likely a PDF. Further, the map was being loaded over a network connection. Anything network related will have variable performance due to the nature of networks.

What is clear about the iPad is that you don't want to jump to purchase the device. There is a good possibility that performance will be worse than many can accept. Above, some have indicated that the processor doesn't matter; for many it doesn't.

The problem though is acceptable performance, which by the way can be had with a number of processors. One measure of acceptable performance is user interactivity, that is, how well the system responds to user input. Another might be how well the unit plays back video or burns battery time. The processor and its capabilities have a big impact there, so it is important. The problem is some don't care as long as the device meets their needs. This is much like the woman that buys the car and never even considers what motor is in the thing. Nothing wrong with that, except that you really don't know what capabilities you are buying.

Here is the bigger issue: in these ARM SoCs the processor core is often only a partial indicator of what you are getting. Sure, the A8 is slower than an A9, but that is not the entire equation. For one, you have the GPU, which can have a very significant impact on performance when properly used. Even bigger unknowns are the other blocks likely to be on the SoC. Here we are talking about the possibility of custom vector units, video decode units or other Apple IP. One of those others might be a custom processor for touch screen control. The problem is we know nothing about what is or might be supporting the A4's ARM core, whatever it is. So getting hung up on the core is just silly if you can't get the rest of the formula.
 
You're trolling because a few pages back, when you thought the A9 had more pipeline stages, you were of the opinion that MORE was better. Now that you know the A8 has more stages, you believe LESS is better.

Maybe you need to reread what I actually said.

As to your "it's just dropped," think about how much is "just dropped"! You speculatively perform a branch (incorrectly) and out-of-order perform additional instructions on top of the branch. You go much further down the wrong branches of the tree of possibilities, sometimes even multiple branches, before you find out you guessed wrong. Thus the penalty for guessing is higher

That's dead wrong. It's only doing that if it has time to spare. Yes, it's throwing away more work, but it's work an in order core wouldn't be doing to begin with. It's not a penalty in comparison, it's just an advantage when it works, which is the majority of the time.
 
Maybe you need to reread what I actually said.



That's dead wrong. It's only doing that if it has time to spare. Yes, it's throwing away more work, but it's work an in order core wouldn't be doing to begin with. It's not a penalty in comparison, it's just an advantage when it works, which is the majority of the time.

I read what you said. Why don't you read what you said. You said "we agree" that more stages is faster when you thought the A9 had more stages.

And you're wrong. When I've speculatively gone down multiple branches of an execution tree, and one of them is wrong, I may have already paid cache accesses for subsequent incorrect decisions (or decisions that would have been correct if only my first decision was correct). I also have to unwind the physical register files, which is not for free. They cannot just be "dumped" because some of their contents will inevitably be kept since they are correct - I have to retire the instructions that were correct and in-flight at the point where the incorrect branch was taken.
 
And you're wrong. When I've speculatively gone down multiple branches of an execution tree, and one of them is wrong, I may have already paid cache accesses for subsequent incorrect decisions (or decisions that would have been correct if only my first decision was correct). I also have to unwind the physical register files, which is not for free. They cannot just be "dumped" because some of their contents will inevitably be kept since they are correct - I have to retire the instructions that were correct and in-flight at the point where the incorrect branch was taken.
hey, that was actually pretty insightful! i'm a performance-sensitive software engineer but i had never really stopped to think until now what the costs may be from mispredicted branches in an out-of-order design. heck, i guess that with sufficiently bad luck in OoOE you could kill your carefully gathered dcaches, and watch everything crumble down.. ok, now i'm scared.

*clenches to his in-order/shallow-OoOE ppc oldies*
 
Let me ask you, why not buy a Tegra 2 tablet? It's many times more powerful and can do a whole lot more: USB, SD card, 1080p on HDTV (575p for iPad), and the one I saw costs $325.


Can you direct me to where you saw it?

Thanks
 
What you say makes no sense. The A9 is better than the A8, and the iPad would have been more powerful if it had an A9, because of that fact. There's no way Apple could have made it better with an A8 than an A9, and that's just common sense.

It only makes common sense IF raw power is your only measure. I think Apple was also looking for cooler operation and power efficiency ... the latter more likely than anything.

Common sense says it only needs to be fast enough to do the job it is meant to do while operating as long as possible between recharges.
 
Maybe you need to reread what I actually said.



That's dead wrong. It's only doing that if it has time to spare. Yes, it's throwing away more work, but it's work an in order core wouldn't be doing to begin with. It's not a penalty in comparison, it's just an advantage when it works, which is the majority of the time.

Now, this is not meant to be rude, but cmaier is a guy with a lot of real world experience. There is a huge difference between designing real processors for years at AMD, and taking a few upper division CSE design courses on cpu design, which is honestly the only place I can come from on the matter myself. I don't work in CPU design and likely never will. I think on this topic, it's pretty nice to have a guy like cmaier around, especially since he puts up with a lot of silliness and still returns.
 
So can I infer from that very good explanation that multitasking is the chief advantage of the A9, and that this is why there will be no multitasking for this generation of the iPad if it uses the older A8?

It actually doesn't have anything to do with multi-tasking. Both A8 and A9 multitask just fine. The problem is that Apple is worried what will happen to battery life if they put in multitasking, and also that iPhone OS does not yet support disk-backed virtual memory, which means if you run out of memory you are out of memory. Allowing multiple apps at the same time puts a squeeze on the available memory and so you've got to have disk-backed virtual memory to support it.
 
It actually doesn't have anything to do with multi-tasking. Both A8 and A9 multitask just fine. The problem is that Apple is worried what will happen to battery life if they put in multitasking, and also that iPhone OS does not yet support disk-backed virtual memory, which means if you run out of memory you are out of memory. Allowing multiple apps at the same time puts a squeeze on the available memory and so you've got to have disk-backed virtual memory to support it.

Yet to be seen how much physical memory is available on this beast.
 
The 4 in A4 is for quad core A9. I swear. Check out the white paper on the ARM site.

You know. Small iPhone, small battery. Big iPad, ALL battery. Does it really need that much lithium to happily run that backlight? Must be the cores.

/k
-------------------
I just noticed I have a Flash player on my SONY CLIé. I'm guessing it still won't youtube.
 
It only makes common sense IF raw power is your only measure. I think Apple was also looking for cooler operation and power efficiency ... the latter more likely than anything.

Common sense says it only needs to be fast enough to do the job it is meant to do while operating as long as possible between recharges.

A9 runs cooler.
 
A9 runs cooler.

Yes and no. Since it can achieve the same performance at a lower frequency (higher IPC) it has less switching power (P=CV^2f). On the same process node it will run at the same voltage but have more leakage than an A8. If run at the same frequency as an A8 it would burn more power as there is more circuitry switching in each cycle. (Of course, that all assumes the implementation is the same - same clock gating, etc. Assuming we are talking soft cores the implementation varies from design to design).
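A quick worked example of the P=CV^2f point, as a sketch only. The capacitance, voltage and frequency values are invented for illustration, the "A8-like" and "A9-like" labels are just tags for the two hypothetical cores, and leakage and clock gating are deliberately left out. The assumption is that the wider core switches 20% more capacitance per cycle but needs only 75% of the clock for the same performance.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Switching power in watts: P = C * V^2 * f (leakage is ignored)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


V = 1.0  # volts; same process node, so the same voltage is assumed for both

# Hypothetical "A8-like" core at 1 GHz.
a8_like = dynamic_power(1.0e-9, V, 1.0e9)

# Hypothetical "A9-like" core: 20% more switched capacitance (more circuitry),
# run at 750 MHz on the assumption that higher IPC gives equal performance.
a9_same_perf = dynamic_power(1.2e-9, V, 0.75e9)

# The same "A9-like" core clocked at the full 1 GHz.
a9_same_freq = dynamic_power(1.2e-9, V, 1.0e9)

print(f"A8-like @ 1 GHz:      {a8_like:.2f} W")
print(f"A9-like, same perf:   {a9_same_perf:.2f} W")
print(f"A9-like, same clock:  {a9_same_freq:.2f} W")
```

With these assumed numbers the wider core burns less switching power at equal performance and more at equal clock, which is the "yes and no" above. Whether a real part comes out ahead depends on how much extra capacitance and leakage the implementation actually adds.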
 
Ssd?

It's not much of a surprise - I'd been saying on here for awhile that there was no way PA Semi finished the thing it is working on so quickly. Personally, despite having designed microprocessors for years (mostly at AMD), I'm not big on a bunch of checklist features, mostly because I understand how every design decision is a tradeoff. So, while in practice an A9-based design might be slightly faster, I think the A8-based design will be plenty fast enough for most anything I'm likely to use an iPad for.

Pretty much I think it's like vegetables hidden in your meatloaf. Most of the whiners, if handed an iPad and not told what processor was in it, would assume that since it's so fast it must be an A9. It's only when they are told what the chip is that suddenly it's not good enough.

How much of the performance and user experience of an iPad is due to the use of an SSD?

You can take a decent SSD and put it on very old computer hardware and still get snappy performance, as Runcore did with old Windows laptops.

http://arstechnica.com/gadgets/news/2010/01/old-computers-get-young-with-ssd-upgrades.ars
 
How much of the performance and user experience of an iPad is due to the use of an SSD?

You can take a decent SSD and put it on very old computer hardware and still get snappy performance, as Runcore did with old Windows laptops.

http://arstechnica.com/gadgets/news/2010/01/old-computers-get-young-with-ssd-upgrades.ars

It's not an SSD. It's just NAND flash memory. But your point is valid. File access is a limiting factor, though many iPhone apps don't do much file access (and, as someone pointed out, there's no memory swapping going on). Still, if you replaced the iPhone's flash with a 7200rpm hard disk, it would certainly feel slower.
 
It actually doesn't have anything to do with multi-tasking. Both A8 and A9 multitask just fine. The problem is that Apple is worried what will happen to battery life if they put in multitasking, and also that iPhone OS does not yet support disk-backed virtual memory, which means if you run out of memory you are out of memory. Allowing multiple apps at the same time puts a squeeze on the available memory and so you've got to have disk-backed virtual memory to support it.

This is quite a glob of jibberfish. I ran yes > /dev/null & to give it some utilization. So the CPU is near 100%, and the battery isn't exactly flatlining even after 20 minutes of this. And you'll notice it thinks it has 814M of VM.

I can have streaming radio playing in the media server (7%) while yes is running (81%) while playing Wurdle (4%) and sending an email. There is no problem multitasking.

It's an iPod touch 2g. (not an A8 I know.) I didn't want to risk my iPhone.

/k


Processes: 25 total, 2 running, 23 sleeping... 95 threads 17:51:33
Load Avg: 1.12, 1.09, 0.78 CPU usage: 94.74% user, 5.26% sys, 0.00% idle
SharedLibs: num = 0, resident = 0 code, 0 data, 0 linkedit.
MemRegions: num = 1796, resident = 29M + 0 private, 15M shared.
PhysMem: 28M wired, 11M active, 8068K inactive, 81M used, 35M free.
VM: 814M + 0 36215(0) pageins, 277(0) pageouts

PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
294 yes 93.5% 1:53.00 1 14 25 184K 6536K 352K 8888K
295 top 6.1% 0:07.07 1 21 34 364K 6708K 844K 17M
272 bash 0.0% 0:00.14 1 15 49 252K 6720K 856K 19M
242 sshd 0.0% 0:01.22 1 15 34 300K 6492K 576K 18M
237 MobileSafa 0.0% 0:03.98 10 124 150 4352K 7644K 6084K 75M
170 MobileMusi 0.0% 0:01.69 3 105 84 2860K 7320K 3468K 69M
63 MobileMail 0.0% 1:06.37 5 146 107 2984K 7316K 3884K 69M
60 apsd 0.0% 0:08.23 2 46 36 464K 6492K 708K 17M
56 BTServer 0.0% 0:19.59 5 93 69 948K 6492K 1248K 31M
36 aosnotifyd 0.0% 0:14.28 3 96 66 1068K 6696K 1424K 41M
35 CommCenter 0.0% 0:00.95 4 78 58 676K 6696K 900K 42M
31 SpringBoar 0.0% 2:53.21 10 321 484 7716K 7648K 18M 86M
30 accessoryd 0.0% 0:02.01 1 53 29 336K 6484K 552K 17M
29 dataaccess 0.0% 0:05.85 6 121 60 1500K 6688K 1652K 42M
28 fairplayd 0.0% 0:00.78 1 34 38 384K 6480K 536K 18M
 