I'm starting to take issue with current CISC processors being described as a complex instruction decoder wrapped around a RISC core. If you do that, then you need to describe the 8080 and 6502 as CISC wrappers around a RISC core.

For decades now, processors have had an orthogonal general-purpose core that's controlled by the instruction decoder. In a 6502, the 8-bit ALU that performs your "add to accumulator" is the same ALU that (when double-pumped) calculates the effective address for indexed addressing modes.

You have to look way back to find processor architectures where each different instruction is executed by its own circuitry. I think some of the 4-bit embedded microcontrollers are like that. Also perhaps processors built by students out of TTL jellybeans.

RISC is about reducing instructions that have multiple resource- and scheduler-hogging steps.

The difference is that, starting around 1992 or so, you started having CISC processors where the instruction decoder is decoupled in such a way that it has its own state machine and a microcode ROM, and there is a separate instruction pointer distinct from the ISA instruction pointer. What happens is that CISC instructions are broken into a sequence of RISC-like micro-ops, which are issued to the ALUs/fetch units. These micro-ops may be issued out of order based on dependency analysis, etc. The micro-ops are RISC-like in that they (1) have fixed length; (2) have standardized formats; (3) cannot access both memory and registers in the same instruction; and (4) are “simple” (in various ways).

So, as far as the scheduler/register renamer/ALUs/load-store unit are concerned, they see only “RISC” instructions.
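To make that concrete, here's a rough sketch in Python of the cracking step - the instruction format and micro-op names are made up for illustration, not taken from any real decoder:

```python
# Toy sketch of CISC -> micro-op "cracking". The instruction encoding and
# micro-op names here are invented for illustration only.

def crack(instr):
    """Split one (possibly memory-operand) instruction into RISC-like micro-ops.

    Each micro-op is fixed-format and touches either memory or registers,
    never both in the same operation.
    """
    op, dest, src = instr
    micro_ops = []
    if isinstance(src, tuple):                 # memory operand like ("rbx", 8), i.e. [rbx + 8]
        base, offset = src
        micro_ops.append(("load", "tmp0", base, offset))   # tmp0 <- mem[base + offset]
        src = "tmp0"
    micro_ops.append((op, dest, dest, src))    # register-to-register ALU micro-op
    return micro_ops

# A memory-form add, roughly "add rax, [rbx + 8]", becomes a load plus a register add:
print(crack(("add", "rax", ("rbx", 8))))
# -> [('load', 'tmp0', 'rbx', 8), ('add', 'rax', 'rax', 'tmp0')]
```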
 
  • Love
Reactions: CarlJ
This makes no sense. CISC is slower than RISC, all else being equal. CISC has much more complicated paths, requiring many more gates between flip-flops, meaning that it is slower. And the extra workload that can be done by 1 CISC instruction can almost always be done in N RISC instructions in less time.

RISC is faster in that execution time is improved, but the entire purpose of CISC is to provide a “single instruction” that lets you do what would take a RISC-based design multiple more clock cycles. CISC is quite clearly designed to help improve the performance of more complex calculations/routines.

There’s always a trade-off - not sure why people are getting pissy with my comment. That’s the entire basis of the two different designs ... and what about in-order vs. out-of-order operations? How does a modern-day RISC CPU compare?

If you bench ARM cores on certain types of OpenSSL/encryption benchmarks you can sometimes see a good advantage for the ARM designs, but in general, across both single- and multi-threaded tiers, Intel (x86) CPUs hold a lead. Sometimes by a staggering amount.

Since you guys are experts I’m sure you can explain.
 
  • Like
Reactions: DanBig

CISC made sense before compiler technology got to the point where it is now. There is no advantage to doing more per cycle when it takes more than N times the power/time to do it.
 
That might be fine with desktop Macs, but the thermals on notebooks are just awful, and the fan noise is so loud...
That's because they go for as thin as possible without caring about anything else. Take a look at the latest Air model, where the fans are not even connected to the heat pipes.
They are like a former (insert fat/alcoholic/dirty/whatever) person who has physically changed and is now the worst company ever because they are obsessed with their new self.
 
Dear Tim, I am very, very angry at Apple, because Apple wants to transition Macs to ARM chips. And then I can't play Mac OS X and Windows games on Macs, or run some professional software on Windows!!!
 
Back in 2006, Apple brought out the first x86-based Macs, and it only took them 12 months to switch them all. A big key back then was Rosetta (to run the old software unchanged). I have to wonder if one reason Apple is ditching 32-bit desktop apps and promoting iPad apps on the Mac is to prepare for such a transition.
I would imagine Apple has a plan for the next 10 years and everything is working towards this.

Imagine a MacBook Air that can run macOS apps and all the latest iOS apps. That's what I envisage.
 
Apple has made the last several iPhones THICKER and with more battery life. What are you talking about?
There have been devices where improved chip efficiency is accompanied by shaving DOWN the size of the battery pack.
 
Imagine a MacBook Air that can run macOS apps and all the latest iOS apps. That's what I envisage.
Current x86 Macs can already do that. I suspect there has to be a much grander vision than just being able to run macOS and iOS apps.
 
I also think they have likely been testing macOS on ARM since the first iPad came out - why not? So maybe they've eliminated a lot of hurdles. It would be nice if it's possible to not have fans in the computers. It may also end up meaning a closed system.
 
This makes no sense. CISC is slower than RISC, all else being equal. CISC has much more complicated paths, requiring many more gates between flip-flops, meaning that it is slower. And the extra workload that can be done by 1 CISC instruction can almost always be done in N RISC instructions in less time.

Well, that's not really true! This gets into the instruction sets each has and what the given code is doing. Most of the CISC vs. RISC issues had to do with how the chip ran the assembler code. As an example: the way the bits are written and read, so the decimal value 15 is either 00001111 or 11110000, while we humans think of the most significant (higher value) digit as being to the left (1 ten and 5 ones). To be clear, this is really done at the nibble level of four bits; it's just easier to see at the byte level. CISC chips for the longest time had to buffer the nibble or byte (depending on the generation) and then flip the bit order so the processor could run it. This created an extra step.

Today many of these handicaps have been removed, so a CISC chip is more like a RISC chip! And some of the benefits of the CISC architecture are now inside RISC chips too! As an example: the bit order I spoke about above was resolved in the ARM design, which can run either order! Today's chips are more an amalgam of both architectures.

But the more complex instructions the CISC chip offers can be more efficient than the simpler RISC instructions! It all depends on what's needed. If a simple action is required then the RISC instruction set can be faster! It all depends on what the app (user) needs.

The real issue is the investment in your apps and the learning curve of moving to a different platform. Apple's departure from 32-bit apps was, for many pros, going too far! The tools we use are mostly 32-bit and technically don't need to be 64-bit at all.

I can handle some pain, but that was just going too far! And it was really needless for me to suffer just to make the coming emulation engine - needed when converting to the ARM architecture - easier to build and test. Why do you think so many people are struggling with Catalina? I'm sticking with Mojave!
 
CISC made sense before compiler technology got to the point where it is now. There is no advantage to doing more per cycle when it takes more than N times the power/time to do it.

I don't think you really know what you're talking about. There's more to it than just "compiler" technology, and as I've mentioned previously, a lot of the x86 chips and RISC designs available do kind of combine a bit of both traditional "CISC" and "RISC" design elements, though clearly ARM and x86 aren't the same instruction sets.

The two of you replying to me previously are going on "half" the information, and speaking half the truth. There is an advantage to doing more per cycle - how could there not be? The difference really is whether X amount of cycles can complete before a single cycle processing the same chunk of data.

There are some really cool power-saving functions and capabilities found in these latest ARM designs, and in specific scenarios they seem like amazing little performers. But you can clearly see, across the board (not paying attention to power consumption), that x86 platforms are still capable of providing more raw performance - in large part due to the CISC architecture, which provides complex instructions for saving cycles. Saving/reducing cycles, clock-for-clock, saves processing time, in which case the "next" clock cycle(s) can move on to the next task.

So again ... SOMETIMES, for simple tasks, the sheer relative simplicity of a RISC design can provide faster execution of simple calculations. I really think it (quite clearly) depends on the use case. It would seem that for the software you often find running on your average MacBook Air or iPad, those kinds of tasks are well suited to the ARM CPUs.

You can't find benchmarks anywhere online where one of these ARM chips comparatively kicks the crap out of x86, except in very niche tasks. Go look for yourself ...

I think what Apple is doing, such as with the Afterburner card, the T2 chip, the "biometrics" co-processors, etc., is a very smart use of their ARM-based processor designs. Creating specific designs - for example, ARM-based chips to accelerate video encoding and decoding - is a very promising-looking way to not only harness the existing software, but also to provide performance enhancements where these RISC-type CPUs excel.
 
I don’t know what I’m talking about? I’ve designed both RISC and CISC CPUs, including PowerPCs and the first x86-64 chips. You are simply wrong.
 
Sure, I can concede if you've designed the CPUs directly. That seems to go against conventional wisdom as far as I can tell. Who are you exactly then, and what CPUs have you designed?

If that's not the case, and Apple was first out of the gate using RISC-based CPUs, why would they switch to an inferior x86 CISC-based platform? Since when does Apple not use what they see as best ...
 
  • Disagree
Reactions: jdb8167

I don’t like to give out my name because it leads to harassment, but people on here know who I am.

I designed the floating point unit and interface circuitry/logic on the PowerPC x704 for Exponential Technology, then I, for a time, owned the out-of-order issuing hardware on the UltraSPARC V for Sun. At AMD I designed a lot of stuff - I worked on K6-II, K6-II+, K6-III, K6-III+, a little bit on Athlon, a huge chunk of Athlon 64 and Opteron, etc. At various times I did logic design, circuit design, global design, I was in charge of design methodology, I did the first-pass ISA for 64-bit integer instructions for x86-64 (we called it AMD64), etc.

As for your second paragraph - performance is not just a matter of RISC vs. CISC. You could have the best architecture in the world, but if your logic design sucks, or your circuit design sucks, or your fab sucks, what good is it?

PowerPC suffered from poor fabs and mediocre logic design (though we did pretty well at Exponential). x86 had a performance/watt advantage largely because Intel had the world’s best fabs, by a long shot. But if Intel fabbed IBM’s PowerPC designs, they’d have beaten x86 by a lot.

Hell, at the time the DEC Alpha was the best chip in the world - and it was RISC.
 
Very cool, that sounds like a hell of a fun list of projects you were involved in! Interesting - so in general what you're saying is that a properly/well-designed RISC chip, mated with the proper platform (not just software & compilers, but the entire hardware platform surrounding the processor) and great fabrication, will whup x86.

Why, in your opinion, has that not happened, given that RISC designs have been around for a hell of a long time now? Why wouldn't a massive company like Intel, who tried their hand at their own successor to x86, stick with x86, continue to build on it, and drop the other projects? Purely market & software library compatibility/demand?
 

Your premise is incorrect - it’s happened many times.

In 1996, at RPI, we designed and fabbed a RISC processor that ran at 1GHz. Years before that clock rate was hit anywhere else.

In the early/mid 1990s, the fastest CPU in the world was the DEC Alpha. It whupped Intel. UltraSPARC, the RS/6000, PA-RISC, and even SGI’s MIPS chips were, at one point or another, faster than anything from Intel.

But to get 90% of the world to switch architectures when Microsoft wasn’t really fully on board was not going to happen, especially when these chips were very expensive (low volumes, sometimes exotic fab technology, and made by companies for their own workstations, which they wanted to sell for a lot of money).

Apple’s about to show everyone that it can be done, once more.
 
Interesting - well I shall sit back, shut up and watch! I'm all for progress and better designed systems. I don't think anyone can argue with that.

Looking forward to seeing what's next
 
And keep in mind that those are chips specifically engineered to run in an environment where the machine is 1/4" thick, has no active cooling, and runs on batteries basically 100% of the time.

For an Arm Mac, those limits can come off. Apple has the in-house smarts to optimize their CPU designs for whatever need is at hand - there's nothing stopping them from making a laptop/desktop chip with 4x (or more) the number of cores running at 2x (or more) the clock speed. They absolutely do not have to "settle" for putting iPhone/iPad CPUs in their Arm MacBooks.

There is this misconception that if only those chips had the cooling, they could replace even a MacBook Pro. But scaling is not that simple, just ask Intel and Pentium 4.

Even when it comes to benchmarks there are some caveats...

First of all, a lot of the benchmarks just don’t run on iPad (or ARM in general), and to be honest they could be simulated, but it would be very slow... ARM is good at the stuff it does, much better than x86, but for everything x86 does that ARM doesn’t, it is pretty much years behind. Of course much could be integrated with dedicated hardware, but then die size increases and so do costs, temperature, etc.

Second there is the thermal issue where many benchmarks will finish before sustained load is achieved.

I find it unlikely Apple will release an ARM-based Mac to market anytime soon - maybe end of 2021 - but I could of course be very wrong. If they do release something, I’d guess it will be old-MacBook-like and might even run some iOS/Mac blend, both as an OS and “philosophy-wise” as well. No real, accessible file system, no TB3, App Store only.
 
But scaling is not that simple, just ask Intel and Pentium 4.
I’d more say that scaling a particular design depends on how well it was thought out. The Pentium 4 came AFTER the Pentium 3, but what we found is that the Pentium 3 design scaled much better. Intel went BACK to the older, but better-executed, design to drive their post-Pentium 4 processors. And those scaled well for some time.

SO, scaling is actually easy as long as your processor has been designed well. As this is not a “last-minute” decision by Apple and has actually been in the works for a while, I’d say that scaling is likely the least of their problems.
 
  • Like
Reactions: Coconut Bean
Yep. When we designed the Athlon 64, we designed the cell architecture to scale to multiple fabs and process nodes, and designed it from the beginning to have a scalable number of cores (by thinking in terms of a universal bus architecture - HyperTransport - and by designing it from the beginning so that the "core" was a single object (named "core0"), to prevent us from having to rejigger the hierarchy in the future when we added the second (and third and fourth) cores).

You need to think ahead, but scaling is just another technical problem to solve.
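As a loose software analogy of that point (a Python sketch; names like make_core and "ht_link" are invented here, and real chip design obviously isn't done this way), keeping the core a single reusable object means adding cores is a parameter change rather than a rework of the hierarchy:

```python
# Toy analogy: if the design hierarchy has exactly one reusable core block,
# scaling the core count is just instantiating it again, not restructuring.
# All names here (make_core, "core0", "ht_link") are made up for illustration.

def make_core(index):
    """One instance of the single reusable core block."""
    return {"name": f"core{index}", "ports": ["ht_link", "clk", "reset"]}

def make_chip(num_cores):
    """A chip: N copies of the same core hung off a shared point-to-point link."""
    return {
        "cores": [make_core(i) for i in range(num_cores)],
        "interconnect": "ht_link",   # stand-in for a HyperTransport-style link
    }

# Going from one core to four changes a parameter, not the design hierarchy:
for n in (1, 4):
    chip = make_chip(n)
    print(n, "cores:", [c["name"] for c in chip["cores"]])
```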
 
  • Like
Reactions: CarlJ

Indeed. I'd think that the A-series has been designed to the specific requirements of little available cooling, quick powerful bursts for some tasks, and ultra-high power efficiency.

For whatever Apple would put in an ARM MacBook, I think they'd design a new processor for those requirements. Maybe trading off some power and thermal efficiency for better sustained loads.

How close is ARM when it comes to memory access, PCIe or equivalent? And also, how important are all the dedicated bells and whistles of x86?
 
Well, that's not really true! This gets into the instruction sets each has and what the given code is doing. Most of the CISC vs. RISC issues had to do with how the chip ran the assembler code. As an example: the way the bits are written and read, so the decimal value 15 is either 00001111 or 11110000, while we humans think of the most significant (higher value) digit as being to the left (1 ten and 5 ones). To be clear, this is really done at the nibble level of four bits; it's just easier to see at the byte level. CISC chips for the longest time had to buffer the nibble or byte (depending on the generation) and then flip the bit order so the processor could run it. This created an extra step.
You’re ELI5’ing chip design to a chip designer. It’s almost cute.
 
Because most people don't need to use "pummel" often, the RISC chip is, overall, more efficient for day-to-day use. For pro users, though, the CISC chip may be the more efficient choice.

lol. This explanation is just plainly wrong.
 
  • Like
Reactions: jdb8167
Well, that's not really true! This gets into the instruction sets each has and what the given code is doing. Most of the CISC vs. RISC issues had to do with how the chip ran the assembler code. As an example: the way the bits are written and read, so the decimal value 15 is either 00001111 or 11110000, while we humans think of the most significant (higher value) digit as being to the left (1 ten and 5 ones). To be clear, this is really done at the nibble level of four bits; it's just easier to see at the byte level. CISC chips for the longest time had to buffer the nibble or byte (depending on the generation) and then flip the bit order so the processor could run it. This created an extra step.

Today many of these handicaps have been removed, so a CISC chip is more like a RISC chip! And some of the benefits of the CISC architecture are now inside RISC chips too! As an example: the bit order I spoke about above was resolved in the ARM design, which can run either order! Today's chips are more an amalgam of both architectures.

What you are referring to is called endianness, but you even explained that wrong (it has nothing to do with how numerical values are represented). It has to do with the order in which bytes are stored in a word - whether the most significant byte is at the left or right end. It has nothing to do with nibbles. It has nothing to do with how the decimal value 15 is stored - in every machine, including every x86 CISC machine, 15 is stored in the registers and memory with the 1’s on the small end.


It mainly comes up when you are storing ASCII in a 32-bit or 64-bit register, for example. And this is NOT a big deal, and it has nothing to do with CISC vs. RISC - there are CISC machines that are big-endian and others that are little-endian, and there are RISC machines that are big-endian and others that are little-endian. And some CPUs can flip back and forth.

But thanks for incorrectly mansplaining an irrelevant issue to a CPU designer. :)
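For anyone following along, a quick Python illustration of that point, using only the standard library:

```python
# Endianness is about the order of *bytes* in memory for multi-byte values,
# not about how a number's value is represented, and not about nibbles.

value = 0x12345678

print(value.to_bytes(4, "little").hex())  # 78563412  (x86, and ARM in its usual mode)
print(value.to_bytes(4, "big").hex())     # 12345678  (e.g. classic PowerPC, network byte order)

# A small value like 15 occupies a single byte, so byte order doesn't change it:
print((15).to_bytes(1, "little") == (15).to_bytes(1, "big"))  # True

# The choice is independent of CISC vs. RISC: x86 (CISC) is little-endian,
# while RISC families include both big-endian and little-endian (or switchable) designs.
```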
 

Well, I tried to simplify the issue, but you too are wrong!

The level is at the machine-code level, not at the word level. The bit order (not the numeric value) is the issue here. How the CPU digests the input at the nibble or byte level is the issue, even with newer 32-bit and 64-bit processors, which work off of words and double words. The issue is at the lower level.
 
  • Haha
Reactions: jdb8167