What’s the point of ARM?

Sydde · Jan 18, 2023

ADGrant said:
ARM was originally designed in the early 80s as a desktop CPU

from Ars, A History of ARM, Part 1, Part 2, Part 3

senttoschool · Jan 18, 2023

About me: Tech lead for a top Silicon Valley software company. I manage a team of software developers and a 7 figure cloud bill.

mr_roboto said:
- they're doing compute on large locally-generated datasets, and it is expensive and slow to upload them to the cloud (I have personally seen this)

What year is this? Simply no. Just no. This is not how modern big data is done.

I've met, chatted, and interviewed hundreds of data engineers and data scientists in Silicon Valley. None of them do things like you mentioned.

In modern big data, you don't generate a large dataset locally that is so large that it can't be reasonably uploaded to the cloud. No one does this. Hell. Forget modern. In my 20 years of software engineering experience, I've never seen someone dumb enough to get into this situation.

Instead, what modern teams do is generate, manipulate, and process data completely in the cloud. No data touches the local machine except maybe small sample data.

It's called ETL. It's all done via the cloud.

Local computers simply aren't used to store or generate data. That's insanity. Big data is done in a team. How are your teammates going to use your data if you can't even upload it to the cloud?

PS. Bandwidth is cheap. You're wrong that it's "expensive". It's the cheapest thing in this process.

mr_roboto said:
- they use lots of cloud compute and storage, get the bills, and over time realize that buying their own hardware would have saved them lots of money (I have personally seen this)

First, we're talking about "real" projects, right? Not hobby projects.

This is super funny. It's funny because this is the main argument someone who doesn't know much about running a high-availability software service makes.

Are there niche cases where buying local hardware is cheaper using the cloud? Yes. Maybe 1/1000 times.

Only extremely large companies or companies with special needs would ever build their own data centers.

mr_roboto said:
- science runs on grant money, and when they get a lot of cash to spend on a research program, lots of scientists love to buy themselves a flashy computer to run their simulations on, and what's flashier than Apple hardware? (I have personally seen this)

I'm sorry. This is not a valid argument. I don't even know how to respond to this.

mr_roboto said:
You're out of touch if you think cloud is guaranteed cheaper.

I never said the cloud is guaranteed cheaper.

@mr_roboto It seems like you're trying to appeal to an anecdotal evidence fallacy.

Xiao_Xi · Jan 18, 2023

Sydde said:
One example would be something like
EOR R8, R9, R7, ASR#34
To fully replicate that operation on x86, you would have to do something like
push r7
asr r7,#34
mov r8, r9
eor r8, r7
pop r7

I completely forgot that ARM uses load-store architecture and x86, register-memory architecture.

I suppose that in real examples this difference is not so big. In fact, in this example from "RISC-V Reader: An Open Architecture Atlas", the x86_64 code has one instruction less than the arm64 code.

senttoschool said:
I never said the cloud is guaranteed cheaper.

~~The cloud~~ On-premise is usually cheaper when computing needs are stable and predictable. In practice, very few companies have such computing needs. What company does not gain/lose customers? What company does not hire/fire personnel? The situation of companies can change very quickly and it is much easier to adapt using the cloud.

Update: @ADGrant has corrected me in a later comment.

TechnoMonk · Jan 19, 2023

senttoschool said:
This is what I've been arguing for the whole time.

It makes little economic sense to buy a 1TB RAM Mac Pro for local work that can be done faster and cheaper via the cloud.

I'm going to mark you as someone who agrees with me.

Cloud is more expensive for workstations, if the usage is more than 6 months. Security and access in cloud is still a big problem unless you have money to through around for private networks with cloud providers. Google has access to most of the high end GPU servers, I dont want some one snooping on my stuff.

TechnoMonk · Jan 19, 2023

senttoschool said:
About me: Tech lead for a top Silicon Valley software company. I manage a team of software developers and a 7 figure cloud bill.

What year is this? Simply no. Just no. This is not how modern big data is done.

I've met, chatted, and interviewed hundreds of data engineers and data scientists in Silicon Valley. None of them do things like you mentioned.

In modern big data, you don't generate a large dataset locally that is so large that it can't be reasonably uploaded to the cloud. No one does this. Hell. Forget modern. In my 20 years of software engineering experience, I've never seen someone dumb enough to get into this situation.

Instead, what modern teams do is generate, manipulate, and process data completely in the cloud. No data touches the local machine except maybe small sample data.

It's called ETL. It's all done via the cloud.

Local computers simply aren't used to store or generate data. That's insanity. Big data is done in a team. How are your teammates going to use your data if you can't even upload it to the cloud?

PS. Bandwidth is cheap. You're wrong that it's "expensive". It's the cheapest thing in this process.

First, we're talking about "real" projects, right? Not hobby projects.

This is super funny. It's funny because this is the main argument someone who doesn't know much about running a high-availability software service makes.

Are there niche cases where buying local hardware is cheaper using the cloud? Yes. Maybe 1/1000 times.

Only extremely large companies or companies with special needs would ever build their own data centers.

I'm sorry. This is not a valid argument. I don't even know how to respond to this.

I never said the cloud is guaranteed cheaper.

@mr_roboto It seems like you're trying to appeal to an anecdotal evidence fallacy.

Way to genralize stuff. There is lot more than ETL processing. When you look at CV, Speech, and other AI stuff. Training happens in cloud on big clusters of A100. Lot of the inference on trained model happens locally on workstations with A5000 or 4090. Try renting a GPU for A 5000 or 4090 in cloud vs a work station. There is no one size fits solution, each approach has pros and cons.

Xiao_Xi · Jan 19, 2023

TechnoMonk said:
Security and access in cloud is still a big problem unless you have money to through around for private networks with cloud providers. Google has access to most of the high end GPU servers, I dont want some one snooping on my stuff.

Are you worried about anything in particular? How do you think CSP can know what you have on their computers?

TechnoMonk said:
Lot of the inference on trained model happens locally on workstations with A5000 or 4090. Try renting a GPU for A 5000 or 4090 in cloud vs a work station. There is no one size fits solution, each approach has pros and cons.

Inference often involves unpredictable computing needs, and the cloud is much cheaper when the workload is unpredictable. Can you elaborate a bit?

TechnoMonk · Jan 19, 2023

Xiao_Xi said:
Are you worried about anything in particular? How do you think CSP can know what you have on their computers?

Inference often involves unpredictable computing needs, and the cloud is much cheaper when the workload is unpredictable. Can you elaborate a bit?

Cloud makes more sense for training but can bevery expensive with anything GPU for inference. Look at the Google Cloud pricing; the V100 cloud cost is around 2K per month. A comparable A 5000 GPU costs around 1.5 to 2K for purchase and allows adding multiple A5000S in a single workstation. Google terminates some of the GPU instances if trying to pause/stop. Azure and AWS are priced much higher and worse than Google regarding GPU. The cheaper options are usually in Eastern Europe in some guy's basement, renting the GPUs.
A workstation with two A 5000 will cost around 10K, and the cloud with a similar GPU will be 4k monthly on GCP.

Xiao_Xi · Jan 19, 2023

TechnoMonk said:
Cloud makes more sense for training but can bevery expensive with anything GPU for inference. Look at the Google Cloud pricing; the V100 cloud cost is around 2K per month. A comparable A 5000 GPU costs around 1.5 to 2K for purchase and allows adding multiple A5000S in a single workstation. Google terminates some of the GPU instances if trying to pause/stop. Azure and AWS are priced much higher and worse than Google regarding GPU. The cheaper options are usually in Eastern Europe in some guy's basement, renting the GPUs.
A workstation with two A 5000 will cost around 10K, and the cloud with a similar GPU will be 4k monthly on GCP.

What kind of inference workload is so computationally intensive and predictable?

By the way, what type of pricing do you use: reserved instance or on-demand instance? Sorry for using AWS nomenclature, I'm not sure if there is a standard nomenclature for this.

senttoschool · Jan 19, 2023

TechnoMonk said:
Way to genralize stuff. There is lot more than ETL processing. When you look at CV, Speech, and other AI stuff. Training happens in cloud on big clusters of A100. Lot of the inference on trained model happens locally on workstations with A5000 or 4090. Try renting a GPU for A 5000 or 4090 in cloud vs a work station. There is no one size fits solution, each approach has pros and cons.

What we're talking about here is $40k Mac pro workstations vs renting in the cloud.

I don't see how generalizing is wrong if the vast majority of big data is done this way.

Everything has pros and cons. Even the best solution has them.

senttoschool · Jan 19, 2023

TechnoMonk said:
Cloud makes more sense for training but can bevery expensive with anything GPU for inference. Look at the Google Cloud pricing; the V100 cloud cost is around 2K per month. A comparable A 5000 GPU costs around 1.5 to 2K for purchase and allows adding multiple A5000S in a single workstation. Google terminates some of the GPU instances if trying to pause/stop. Azure and AWS are priced much higher and worse than Google regarding GPU. The cheaper options are usually in Eastern Europe in some guy's basement, renting the GPUs.
A workstation with two A 5000 will cost around 10K, and the cloud with a similar GPU will be 4k monthly on GCP.

You're talking about apples vs oranges.

You can buy a celeron on eBay for $10 and it'll match the speed of a $30/month EC2 instance. So what?

What about setup, maintenance, automated backups, electricity, cooling, auto-scaling, clustering, bandwidth, SLA, the ability to access this instance while you're working from the beach, and the ability to instantly switch to faster graphics cards/CPUs/SSDs/RAM without losing your initial investment cost?

If one of your requirements is providing inference as a service to your customers, how are your customers going to use your desktop? Are you going to ship a computer to each of your customers?

And what happens if you buy a $40k workstation and then one year later, it's completely obsolete because data is growing exponentially? What are you going to do with your shiny $40k workstation that is now completely useless for its intended purpose? Invest in an $80k workstation?

Also, you're literally calculating the cost assuming that the cloud instances are running 24/7. Anyone worth his salt would program the instance to shut down when it's not in use so the true cost for a project is probably vastly closer than you think.

senttoschool · Jan 19, 2023

Gentlemen, if you're working on a small-scale project, amateur project, or hobby, you're going to save money by using the hardware you already own or buying some cheap consumer/prosumer-grade products over renting from the cloud. Yes. No one is going to argue against this.

And yes, there are cases where businesses could save money by owning. For example, my company owns a few DGX systems from Nvidia.

But the original goalpost was buying a Mac Pro with 1TB of RAM for "science simulations". This Mac Pro would cost $40k if you buy it directly from Apple.

I can't see how it's smart to do that. And if you don't agree with me, make another thread and invite me.

Let's not hijack this thread any longer.

TechnoMonk · Jan 19, 2023

senttoschool said:
What we're talking about here is $40k Mac pro workstations vs renting in the cloud.

I don't see how generalizing is wrong if the vast majority of big data is done this way.

Everything has pros and cons. Even the best solution has them.

And its going to cost around 6K per month for a similar 40K Mac Pro in cloud with 1 TB RAM. If the compute is predictable for a local workstation, its usually cheaper than Cloud.

TechnoMonk · Jan 19, 2023

senttoschool said:
What about setup, maintenance, automated backups, electricity, cooling, auto-scaling, clustering, bandwidth, SLA, the ability to access this instance while you're working from the beach, and all the other benefits?

And what happens if you buy a $40k workstation and then one year later, it's completely obsolete because data is growing exponentially? What are you going to do with your shiny $40k workstation that is now completely useless for its intended purpose?

Who is doing all that with a workstation? Last I checked, a Mac Pro doesn't do any of the stuff you mentioned. If the company is spending 6-8K per month on a workstation for a guy to work from the beach, there is something fundamentally wrong.
It makes sense if you are talking about servers with SLA, Autoscaling, deployment, and maintenance. I hope you know the difference between a workstation and a server. Only some things run on clusters and servers. Large scale with local compute-intensive workflows has workhorses to get the job done. The full cloud makes sense if you are using mobile laptops and small clients as consumption devices with compute on cloud servers.

TechnoMonk · Jan 19, 2023

Xiao_Xi said:
What kind of inference workload is so computationally intensive and predictable?

By the way, what type of pricing do you use: reserved instance or on-demand instance? Sorry for using AWS nomenclature, I'm not sure if there is a standard nomenclature for this.

Speech, CV, sensors, and graphics. Spot pricing is 50-60% cheaper, but your instance will be stopped under load. On demand Pricing can get very expensive. Reserved is long term commitment, which defeats the purpose of the adaptability. I know folks who have commited to V100 for 3 years, which is worse than a workstation GPU currently.

Cloud isn't one size fits all, do what is needed to get work done at a reasonable cost.

JouniS · Jan 19, 2023

senttoschool said:
In modern big data, you don't generate a large dataset locally that is so large that it can't be reasonably uploaded to the cloud. No one does this.

That is common with scientific data. The data comes from local instruments, and because the internet is slow and/or expensive, it must be processed locally.

senttoschool said:
PS. Bandwidth is cheap. You're wrong that it's "expensive". It's the cheapest thing in this process.

Bandwidth is expensive when someone is downloading your data from the cloud. That's a deliberate choice by cloud providers in an attempt to lock the customers in. Which doesn't work well with scientific work, which is often a collaboration between multiple organizations. Each organization has their own infrastructure, so even moderate-sized projects may have to deal with multiple cloud providers and local/national clusters/supercomputers.

senttoschool · Jan 19, 2023

JouniS said:
That is common with scientific data. The data comes from local instruments, and because the internet is slow and/or expensive, it must be processed locally.

What local instrument produces so much data that it can't be reasonably uploaded to the cloud and must be processed locally? Examples?

And is a $40k Mac Pro the right machine to process this data?

JouniS said:
Bandwidth is expensive when someone is downloading your data from the cloud. That's a deliberate choice by cloud providers in an attempt to lock the customers in. Which doesn't work well with scientific work, which is often a collaboration between multiple organizations. Each organization has their own infrastructure, so even moderate-sized projects may have to deal with multiple cloud providers and local/national clusters/supercomputers.

Wait what?

Bandwidth is cheap as dirt.

AWS Cloudfront gives you 1TB of transfer for free. After that, it costs as little as $0.02/GB. If you use S3 and you're transferring between different AWS services, it can cost as little as $0.01/GB. I assume that the chances of different organizations using AWS is high since well, practically everyone uses AWS.

If you're working with TBs of data, bandwidth budget is the lowest cost during the process.

leman · Jan 19, 2023

Philip Turner said:
Apple also has humongous instruction caches (192 KB) because ARM instructions have greater binary size.

Experimental results show that Aarch64 is actually more dense and uses fewer instructions on average than x86-64

Debunking CISC vs RISC code density – Bits'n'Bites

www.bitsnbites.eu

JouniS · Jan 19, 2023

senttoschool said:
What local instrument produces so much data that it can't be reasonably uploaded to the cloud and must be processed locally? Examples?

In my field, a sequencing machine may produce a burst of a few terabytes every couple of days. A single facility may operate tens of such machines. While it's possible to upload the data to the cloud (I think the Broad Institute does that), local processing is quite attractive with such data.

In astronomy, the instrument may be at a remote location where fast internet connections are not available. And in marine sciences, the instrument may be on a ship.

senttoschool said:
And is a $40k Mac Pro the right machine to process this data?

It's a $15k to $20k Mac Pro. And it may be appropriate for the people who develop methods and tools for processing the data.

senttoschool said:
AWS Cloudfront gives you 1TB of transfer for free. After that, it costs as little as $0.02/GB. If you use S3 and you're transferring between different AWS services, it can cost as little as $0.01/GB. I assume that the chances of different organizations using AWS is high since well, practically everyone uses AWS.

The price can be several times higher for downloads outside Europe and North America or if the transfer volume is less than 5 TB/month. For popular data resources, transfer fees are generally higher than storage costs.

The lab I work at has used three major cloud providers and several local and remote HPC clusters. It depends on a variety of things, including collaborators, available funding, and regulations. Science is generally chaotic like that. Instead of big centrally planned organizations, there are thousands of small labs trying to do their own things.

Sydde · Jan 19, 2023

Xiao_Xi said:
I completely forgot that ARM uses load-store architecture and x86, register-memory architecture.

Most likely you forgot that because x86-64 has register-memory architecture but mostly does not use it in programs. The feature is available, but its use in typical code is, I suspect, comparatively low, because register-register math is more efficient, and the 64-bit achitecture with its 8 additional registers makes it more practical. Most x86 object code is probably structurally similar to ARM code, because that way is more efficient. Adding a register value directly to a memory location made a lot of sense in 1980, today not so much.

So, realistically, many x86-64 programs will look a lot like ARMv8 programs in object code because that is the more efficient way to run. I have seen the argument that x86 offer large (32 or 64 bit) memory address offsets that require extra work in ARM, but, so what? ARM supports 12 bit offsets, which looks very small by comparison, but very, very few data structures are heterogenous even up to 1K, so the advantage to large offsets is all but non-existent.

Then there is the issue of large immediates, which ARM does not support. But I see that as a positive: if you have large constants, they should live in a code-adjacent table where they are easier to observe and maintain, which ARM does support, rather than embedded in code.

The advantages of ARM are tiny, but on aggregate they add up. And the trend lines suggest that they will keep adding up.

Unregistered 4U · Jan 19, 2023

Sydde said:
The advantages of ARM are tiny, but on aggregate they add up. And the trend lines suggest that they will keep adding up.

And I read somewhere that Apple’s license only requires they support the full ARM instruction set. It doesn’t say that they can’t add specific instructions that would just be for Apple’s internal use.

Sydde · Jan 19, 2023

Unregistered 4U said:
And I read somewhere that Apple’s license only requires they support the full ARM instruction set. It doesn’t say that they can’t add specific instructions that would just be for Apple’s internal use.

It may go beyond that. There was a piece that tried to reverse-engineer the M1. One of the things they found was that registers showed what would seem like unusual behavior: if the program had not previously used a register, accessing its value would take an extra clock cycle. This strongly suggests that there is not a fixed register file but that register only exist as objects in the register rename pool. Which makes a lot of sense, because most registers are nominally transient values. This could also mean that they have developed some kind of mechanism that allows a thread swap context frame to bypass saving/restoring registers that have not been used, which could add up (there are 32 FP/Vector register 128 bits wide – if they are not all in use, why save/load all the unused ones).

Then there is Cocoa/Swift: most likely the M-series has features that effectively erase the cost of object-method call boundaries. Apple knows what is inside their processors, unlike everybody else.

ADGrant · Jan 19, 2023

senttoschool said:
Take for example, all the features of Apple's T2 chip is now entirely inside an Apple Silicon chip. The T2 chip is now way out of date for a modern macOS experience. This means if Apple were to use AMD + Nvidia chips, they'd have to engineer and build a brand new T3 chip just to provide basic macOS features to the Mac Pro. No way.

The T2 chip is basically just a modified A10. Apple could just take the A13 and use that instead (which I believe they did in the Studio monitor).

mr_roboto · Jan 20, 2023

senttoschool said:
What year is this? Simply no. Just no. This is not how modern big data is done.

I've met, chatted, and interviewed hundreds of data engineers and data scientists in Silicon Valley. None of them do things like you mentioned.

In modern big data, you don't generate a large dataset locally that is so large that it can't be reasonably uploaded to the cloud. No one does this. Hell. Forget modern. In my 20 years of software engineering experience, I've never seen someone dumb enough to get into this situation.

Instead, what modern teams do is generate, manipulate, and process data completely in the cloud. No data touches the local machine except maybe small sample data.

It's called ETL. It's all done via the cloud.

If you think data sets can always just be poofed into existence on a cloud compute server, well, sure, I guess I see your point.

In the sciences we often have to deal with data collected locally. In some fields, these datasets are simply far too large for it to be practical, economical, or timely to upload them to cloud compute servers for processing.

senttoschool said:
Local computers simply aren't used to store or generate data. That's insanity. Big data is done in a team. How are your teammates going to use your data if you can't even upload it to the cloud?

Your hyperfocus on "big data" is telling. That's a buzzword from a certain segment of the tech industry. Makes me think your experience base is narrow, and you aren't fully aware of it.

senttoschool said:
PS. Bandwidth is cheap. You're wrong that it's "expensive". It's the cheapest thing in this process.

If you start needing bandwidth on the scale of 100 Gbps, you will run into some serious bills, and as @JouniS mentioned, sometimes you have to deploy to locations where there is no practical way to get that level.

senttoschool said:
First, we're talking about "real" projects, right? Not hobby projects.

This is super funny. It's funny because this is the main argument someone who doesn't know much about running a high-availability software service makes.

Are there niche cases where buying local hardware is cheaper using the cloud? Yes. Maybe 1/1000 times.

Only extremely large companies or companies with special needs would ever build their own data centers.

I'm sorry. This is not a valid argument. I don't even know how to respond to this.

I never said the cloud is guaranteed cheaper.

@mr_roboto It seems like you're trying to appeal to an anecdotal evidence fallacy.

I assure you that my employer is not paying me as a hobby. I'd rather not give specifics, so I'm going to leave it at that.

You mention "special needs". I get the impression that, to you, "special" is anything which doesn't match your narrow domain expertise, and you've fallaciously decided that everything outside of this experience base must be tiny and insignificant, so there's no possible reason why a scientist would buy a powerful workstation when they could just do it in the cloud.

The real world doesn't work that way. Cloud compute is not a panacea.

ADGrant · Jan 20, 2023

Xiao_Xi said:
The cloud is usually cheaper when computing needs are stable and predictable. In practice, very few companies have such computing needs. What company does not gain/lose customers? What company does not hire/fire personnel? The situation of companies can change very quickly and it is much easier to adapt using the cloud.

You have it the wrong way round. Cloud is cheaper when computing needs are not stable and predictable because cloud usage can easily scaled up and down with workload.

TinyMito · Jan 20, 2023

Zest28 said:
For the Mac Pro, why not simply put a 96-core EPYC AMD CPU in it with a RTX 4090 (if Apple can solve their politics with NVIDIA) while retaining user expandability and repairability for the Mac Pro. Since Mac Pro usually supports dual chips, Apple could even put a 192-core AMD CPU in it even.

Does Apple really believe the M2 Extreme would beat a 192-core AMD CPU and a RTX 4090? Heck, you can probably put multiple RTX 4090 in the Mac Pro even (if Apple solves their politics with NVIDIA).

For laptops, I get it. ARM offers nice battery life, but a Mac Pro has no battery life.

Just get a PC.

ARM has a benefit, you can run iOS apps on it. It's more power efficiency.

What’s the point of ARM?

macrumors 68030

macrumors 68030

macrumors 68000

macrumors 68040

macrumors 68040

macrumors 68000

macrumors 68040

macrumors 68000

macrumors 68030

macrumors 68030

macrumors 68030

macrumors 68040

macrumors 68040

macrumors 68040

macrumors 6502a

macrumors 68030

macrumors Core

macrumors 6502a

macrumors 68030

macrumors G4

macrumors 68030

macrumors 68000

macrumors 6502a

macrumors 68000

macrumors 6502a

Our Staff