
leman

macrumors Core
Original poster
Oct 14, 2008
19,548
19,734
Just a quick heads up for those few of you that are doing data science in Python or R.

It seems that a big roadblock to running relevant software natively on M1 is the Fortran compiler: many classic scientific algorithms are implemented in Fortran, and there is no way to resolve this issue until a compiler capable of targeting ARM64 Darwin is available. The LLVM Fortran compiler is currently at a very early stage and non-functional, and the GCC suite has to be patched (it's not clear when this will happen). Unfortunately, Fortran is niche enough that it didn't land on Apple's radar for open source patches. The maintainers of R have stated that they expect to release a native version in April 2021.

Good news, however: this stuff works under Rosetta 2, and it is fast enough. Brian Ripley (one of the core R developers) reported that running a full test of an R installation on an M1 MBA takes 454 seconds. In comparison, the same operation takes 604 seconds on my i9 16" machine. So if you are using an Intel 13" MacBook Pro and you do data science, M1 could be a considerable upgrade, and it should get another big boost in spring, when the native version of R is expected.
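For reference, here is what those two timings work out to. This is only arithmetic on the 454 s and 604 s figures quoted above; the helper names are mine and purely illustrative.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """How many times faster the new machine finishes the same job."""
    return baseline_seconds / new_seconds


def time_saved_fraction(baseline_seconds: float, new_seconds: float) -> float:
    """Fraction of the baseline wall-clock time that is saved."""
    return (baseline_seconds - new_seconds) / baseline_seconds


I9_SECONDS = 604.0  # i9 16" machine, figure quoted in the post
M1_SECONDS = 454.0  # M1 MBA, figure quoted in the post

print(f"speedup: {speedup(I9_SECONDS, M1_SECONDS):.2f}x")                # 1.33x
print(f"time saved: {time_saved_fraction(I9_SECONDS, M1_SECONDS):.0%}")  # 25%
```

So the M1 finishes the same check in roughly three quarters of the time the i9 needs.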
 
Hey @leman, great post, thanks. I came here specifically to start a thread to ask about this 😅

Do you know if anyone has tried Stan under Rosetta 2 on it yet? Stan works under Big Sur, but so far I don't know about Apple Silicon.
 
I will try Stan when my MBP arrives, most likely on Monday. I'd expect it to run natively out of the box since it works on ARM Linux.
 
Great, thanks. I'm slightly tempted to get the Mac mini one as a test rig for this, but I need to sell one of my existing machines first to justify such a purchase 😅
 
There are Fortran compilers able to target Arm64. A considerable number of supercomputers are Arm64-based, including the current number 1 of the TOP500.

It’s not just about targeting ARM, it’s about targeting Apple platforms. There are some ABI differences etc.
 
Are Apple CPUs Arm64 or not?

Of course they are. But architecture and platform ABI are different things. GCC is perfectly able to emit ARM64 code, but it does not know how to package that code into a valid application. @Gnattu has posted a link to a GCC fork targeting Apple ARM64. Here is some relevant discussion: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96168

On the other hand, LLVM has all the machinery necessary to support Apple's ARM64, but its Fortran compiler is an early work in progress. Frankly, I'd love to see more progress here because that would allow me to just remove GCC from my setup altogether.
 
@Gnattu I don't really need a stable GCC; just a functioning Fortran compiler would be nice :)

You would think that ARM would actually be interested in porting theirs. :)



This is one of those cases where, if X Windows were available and the "unix" subsystem were up to snuff, they'd get more solutions faster on macOS 11. But Apple isn't as big an open source contributor as they should be. (At their peak, AT&T had Bell Labs contributing back, IBM had Watson (and other R&D labs) contributing back, and Xerox had PARC (shucks, Apple wouldn't even have the Mac...). Meanwhile, Apple borrows tens of billions to pay dividends in a concerted effort to duck taxes.) Apple contributes almost only what they have to.


P.S. The other curious oddball is PGI... (owned by Nvidia, which is trying to buy ARM). So that is two compilers on track to be owned by the same overall company, and neither one is in the loop here.
 
Thanks for the heads up. Apparently a Fortran compiler is needed by R for the BLAS and LAPACK libraries (plus other things, I suppose). But you may also use the ones provided by Accelerate.framework.

I think on Apple Silicon it will make more sense to use the Accelerate framework anyway as it might give you access to hardware that Apple doesn't expose directly (like the matrix multiplies on the CPU or the Neural Engine). But there are some R packages that rely on Fortran being present. As much as one would want to eliminate Fortran, it's not viable at this point :)
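Since the thread is also about Python: a quick way to see whether a NumPy build is linked against Accelerate is to look at its build configuration. The sketch below is only a heuristic. It captures what `numpy.show_config()` prints and searches for the word "accelerate"; the exact output format differs between NumPy versions, and the helper names are mine. If NumPy is not installed, it returns `None`.

```python
import contextlib
import io
from typing import Optional


def mentions_accelerate(config_text: str) -> bool:
    """Heuristic: does the build configuration text name Accelerate?"""
    return "accelerate" in config_text.lower()


def numpy_uses_accelerate() -> Optional[bool]:
    """True/False if NumPy is installed, None if it is not."""
    try:
        import numpy as np
    except ImportError:
        return None
    # show_config() prints the build configuration; capture it as text.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return mentions_accelerate(buffer.getvalue())


print(numpy_uses_accelerate())
```

On the R side, `sessionInfo()` lists the BLAS and LAPACK libraries the running session is using.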
 
Do you know if Accelerate uses the special hardware of Apple SoCs? Apple's website isn't very explicit.
 
Pretty explicit what they are going to do.

"...Accelerate provides high-performance, energy-efficient computation on the CPU by leveraging its vector-processing capability. The following Accelerate libraries abstract that capability so that code written for them executes appropriate instructions for the processor available at runtime: .."

The label "appropriate instructions" is pretty explicit. If there is a faster proprietary way of doing the calculation ... they will use it in situ for that context. If the Apple Matrix extensions are present and they aren't using them, that would be a deviation from what is stated there.

They aren't going to provide a very detailed blow by blow breakdown because they "abstract that capability".

An important issue is the one raised in that article about the use of Accelerate with R: Apple may or may not be primarily concerned about numerical stability, and is perhaps more concerned about faster "answers".
 
I think @deconstruct60 sums it up very well. I would just like to add that Apple's numeric implementations are usually of excellent quality. In my experience at least their code is ridiculously fast while also providing very accurate results.
 
@leman - did your machine ever arrive? I'm curious how it stacks up vs your i9!

It's here and I did some preliminary testing, but these last two days I was a bit busy with my day job :) I'll do more testing on the weekend.

The initial impression is that these new Macs are going to be absolutely brilliant for data science. Even under Rosetta, the M1 is about 10% faster than my i9 in a small selection of R scripts I tried out (some regression, data wrangling and simple probabilistic simulations). With a native version of R, the gap goes up to 30-40%, which is absolutely insane for this thin and light machine.

The software support is not quite there yet, but I was able to build most of the things I need after a few small tweaks here and there. Rosetta works well enough for now, however. CmdStan (patched git version) builds without problems. I haven't looked at Stan performance yet, but the few quick tests I did suggest that it's very good.
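If you are not sure whether a given interpreter is running natively or through Rosetta, here is a minimal sketch for checking from Python. It assumes the `sysctl.proc_translated` key behaves as commonly described (1 for a translated process, 0 for a native one on Apple Silicon, and absent on Intel Macs as far as I know); the function names are mine.

```python
import platform
import subprocess
from typing import Optional


def classify(machine: str, proc_translated: Optional[str]) -> str:
    """Interpret platform.machine() plus the sysctl.proc_translated value."""
    if proc_translated == "1":
        return f"{machine} code translated by Rosetta 2"
    if proc_translated == "0":
        return f"native {machine} on Apple Silicon"
    # Assumption: the key could not be read (Intel Mac or not macOS at all).
    return f"native {machine} (no Rosetta flag available)"


def read_proc_translated() -> Optional[str]:
    """Return the sysctl value as a string, or None if it cannot be read."""
    if platform.system() != "Darwin":
        return None
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


print(classify(platform.machine(), read_proc_translated()))
```

A translated process typically reports its machine type as `x86_64`, so `platform.machine()` alone is not enough to tell an Intel Mac from an M1 running under Rosetta.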
 