
chars1ub0w

macrumors regular
Original poster
Jun 5, 2017
164
70
Here, there and over there
Sometimes I'm not so impressed with my M1 Max MBP 16" (64GB RAM). Sometimes it seems barely faster than the Core i9 15" (2018) MBP (32 GB RAM) that I also have. I leave one at work and keep one at home, so I rarely compare them side by side. Sure, some things are perceptibly faster, e.g. some Photoshop tasks, but for general tasks, I'm not so sure. So I ran some code that I use for teaching non-determinism. And I'm surprised. The M1 Max and the i9 MBP ran neck and neck on this Perl one-liner. I checked: the Perl executable on the Max is arm64, and Perl on the i9 is x86. You can easily run the same code yourself.
perl.jpg
 
I think that was an Intel 8th gen, about the same speed or very slightly slower than later generations through 11th.

Anyway, I'm not terribly surprised. Single-core performance of the M1 is somewhat better than an i9-8950HK on average but I imagine you can always find something that the Intel CPU happens to like better than the M1. It might also be related to compilers and how that particular loop in perl was compiled.
 
Update: on an actual real-world Perl regex benchmark, the M1 is 60-70% faster than the i9 Coffee Lake (single-threaded performance). Details in post #55.



Difficult to tell what's going on here, since you could be measuring pretty much anything. The M1 runs circles around i9 in any R or Python script I've tried so far, it's also significantly faster for compiling code. I don't use perl, am I correct in assuming that /(a?){$n}a{$n}/ expands to a regex matching (a?) n times followed by a n times? When I try to run this script on my machine it does nothing.

FWIW, matching the regex for the sequence of length 100 using R regex takes under 3ms. If I try to use the PCRE backend it crashes with a "match limit exceeded" warning. I think you are just running into a pathological case with the perl regex engine.

1666704842289.png
 
Not all software is going to be made to take proper advantage of an M1/M2. And if the worst you can find is just as good as Intel, with results this close, it's almost as if it's being bottlenecked by something else.
 
Perl probably uses PCRE or ICU to run those regexes, so it could easily be that these libraries are not as optimized on arm64 as on Intel. There are still so many optimisation opportunities on arm64; it will take time.

But the Core i9 in your MacBook Pro is still a nice and fast processor, it just generates too much heat for that kind of laptop.
 
Perl probably uses PCRE or ICU to run those regexes, so it could easily be that these libraries are not as optimized on arm64 as on Intel. There are still so many optimisation opportunities on arm64; it will take time.

PCRE is fairly well optimised for ARM, including a regex JIT compiler that makes use of ARM Neon. That is hardly the problem here. This is a pathological regex on pathological data; the only thing it shows is that PCRE has a problem with this case.
 
Wow, this is a strange regex.

time perl -e 'print "aaaaaaaaaaaaaaaaaaaaaaaaa" =~ /(a?){25}a{25}/'

does indeed take 5 seconds on my 2019 x86 mbp with perl v5.34.1.

The more "a"s you feed it the faster it gets...

python3 -m timeit -r 1 -n 1 'import re; print(re.match("(a?){25}a{25}", "aaaaaaaaaaaaaaaaaaaaaaaaa"))'

takes 1.3 seconds with Python 3.9.6, which is also a lot.

time node -e 'console.log("aaaaaaaaaaaaaaaaaaaaaaaaa".match(/(a?){25}a{25}/));'

with Node v16.17.1 is so fast I cannot measure it with this one-liner (!).

What I also don't get: it looks like the perl one doesn't match (it prints nothing?), whereas Python and Node do print a match. AFAIK (zero times "a") times 25 followed by 25 "a"s should match, right?

Do you know why the Perl one doesn't match?

Anyway, it looks like a big performance regression on some of the RE implementations and not the others.

-- Chris
 
As the OP noted, I too don't see a notable performance difference between my 2019 16" MBP with Core i9 and 16 GB of RAM (which I no longer own) and my 16" M1 Max 32 GB system. What I do notice (and enjoy) is the ability to have full performance away from the AC adapter and not worry about the battery. I also almost never hear the fans in the M1 Max, and the fans on my Intel were on most of the time, as most of my workload engaged the dGPU (even when it wasn't needed). I never actually timed anything, and the only thing I actually expected to notice a big improvement in was rendering videos in iMovie. However, the difference here between the M1 Max and i9 was rather low.

Don't get me wrong, I like my M1 Max system (as noted above, I no longer have the i9) and have no desire to go back to my Intel-based system. I just wanted to offer another opinion/viewpoint that jibes with the OP's.

Rich S.
 
Difficult to tell what's going on here, since you could be measuring pretty much anything. The M1 runs circles around i9 in any R or Python script I've tried so far, it's also significantly faster for compiling code. I don't use perl, am I correct in assuming that /(a?){$n}a{$n}/ expands to a regex matching (a?) n times followed by a n times? When I try to run this script on my machine it does nothing.

FWIW, matching the regex for the sequence of length 100 using R regex takes under 3ms. If I try to use the PCRE backend it crashes with a "match limit exceeded" warning. I think you are just running into a pathological case with the perl regex engine.

View attachment 2101681
Well, I just posted evidence that the M1 doesn't always run circles around the i9. PCRE or PCRE2 obviously isn't the Perl regex engine.

PCRE aborts backtracking after a (probably hardcoded) maximum number of steps. The Perl regex engine doesn't have that limit.
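
To make the "gives up after a maximum number of steps" idea concrete, here is a rough Python sketch of a toy backtracking matcher for this one pattern, (a?){n}a{n}, with an optional step budget. Everything in it (backtrack_match, MatchLimitExceeded, the limit parameter) is a name I made up for illustration. It is not how Perl's engine or PCRE is actually implemented, only a way to see how the number of attempts grows and what a step limit does.

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher uses up its step budget (hypothetical name)."""


def backtrack_match(n, text, limit=None):
    """Try to match (a?){n}a{n} at the start of text by naive backtracking.

    Returns (matched, steps). If limit is given and more than that many
    steps are needed, MatchLimitExceeded is raised instead.
    """
    steps = 0

    def optional(k, pos):
        # k = how many (a?) groups have been handled, pos = position in text
        nonlocal steps
        steps += 1
        if limit is not None and steps > limit:
            raise MatchLimitExceeded(f"gave up after {limit} steps")
        if k == n:
            # All optional groups done; the rest must be exactly n literal 'a's.
            return text[pos:pos + n] == "a" * n
        # Greedy: first try to consume an 'a', then fall back to matching empty.
        if pos < len(text) and text[pos] == "a" and optional(k + 1, pos + 1):
            return True
        return optional(k + 1, pos)

    return optional(0, 0), steps


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, backtrack_match(n, "a" * n))
    try:
        backtrack_match(15, "a" * 15, limit=1000)
    except MatchLimitExceeded as err:
        print("limited:", err)
```

On a string of exactly n "a"s, the only way to succeed is for every (a?) to match empty, which this greedy toy tries last, so it makes 2^(n+1) - 1 calls before it finds the match. With a limit set, it stops early instead, which is roughly the behaviour behind a "match limit exceeded" message.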
 
What I also don't get: it looks like the perl one doesn't match (it prints nothing?), whereas Python and Node do print a match. AFAIK (zero times "a") times 25 followed by 25 "a"s should match, right?

Do you know why the Perl one doesn't match?
Good question! It matches. You can do $k = ($na =~ /...pattern/); print $k to verify. However, I think there's a (weird to me) interaction between list context for print and the result of =~ when there's a (..). Example: perl -e '($a, $b) = "AA" =~ /(A?)(A?)/; print "$a $b"'
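
A rough Python analogue, for anyone who doesn't use Perl. Python has no list/scalar context, so this only mirrors the underlying idea: the whole match and the contents of the (a?) capture group are two different things. In Perl, a match with a capture group in list context hands back the captures, and here the capture ends up empty. I use a small n, and the variable names are just mine.

```python
import re

n = 3  # small on purpose; larger n gets slow quickly
m = re.match("(a?){%d}a{%d}" % (n, n), "a" * n)

whole_match = m.group(0)   # the text matched by the entire pattern
last_capture = m.group(1)  # what the (a?) group held on its final repetition

print(repr(whole_match), repr(last_capture))
```

So the match succeeds, but printing only the capture shows an empty string, which looks like "nothing was printed".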
 
I've been wondering about this question myself—except for me it's how much more responsive an M1 or M2 would feel than my 2019 i9 iMac. Some posters, in comparing late-model high-end Intel Macs with AS Macs for general responsiveness, say they notice a significant difference, while others say they don't notice much change.

At this point, an upgrade doesn't seem to make sense for me—it would cost $4600 to replace my iMac with a Max Studio + ASD, for what would probably be a relatively small increase in performance. When spending that kind of money, I'd want to have a "wow, this is so much faster!" reaction, and it will probably require a couple more years of technological advancement before AS feels like that relative to my iMac.
 
Well, I just posted evidence that the M1 doesn't always run circles around the i9. PCRE or PCRE2 obviously isn't the Perl regex engine.

PCRE aborts backtracking after a (probably hardcoded) maximum number of steps. The Perl regex engine doesn't have that limit.

I don’t really see how this qualifies as evidence. First, I can’t replicate your results on my machine (as I wrote, running your script produces no output and exits immediately). Second, it’s a pathological regex that popular engines have problems with, so it’s really unclear what you are measuring. I mean, I can have a Ferrari compete against a bus on muddy terrain and conclude that neither can move an inch, therefore both have the same speed. I hope you’d agree that this would be quite silly.

There are certainly cases where M1 doesn’t outperform the older i9 model - heavily optimized SIMD throughput code comes to mind as a prime example. But I think it’s crucial that one can explain what one sees and what’s going on. With pretty much any published benchmark, we can explain where the performance delta comes from. We can explain why M1 is generally faster in integer benchmarks and scientific computing than comparable x86 CPUs, and why M1 is slower in benchmarks like Cinebench and Stockfish. With your test, we can’t explain anything because, well, I can’t even replicate your results…
 
I don’t really see how this qualifies as evidence. First, I can’t replicate your results on my machine (as I wrote, running your script produces no output and exits immediately).
Other people have been able to replicate it. Let me provide a brief tutorial. In Terminal, type
which perl
you should have some version of Perl on any Mac.
Then use exactly my one-liner, e.g.:
time perl -e '$n = shift; $na = "a" x $n; print $na =~ /(a?){$n}a{$n}/' 25
There you go.
 
Other people have been able to replicate it. Let me provide a brief tutorial. In Terminal, type
which perl
you should have some version of Perl on any Mac.
Then use exactly my one-liner, e.g.:
time perl -e '$n = shift; $na = "a" x $n; print $na =~ /(a?){$n}a{$n}/' 25
There you go.

Thanks, was able to replicate it now. I also modified the script to consistently produce output to make sure it actually works (for some reason print doesn't print anything on my machine, you have to save the result to a variable and print that variable for the result to appear).

Anyway, my statement still stands. You are running into some algorithmic scalability issue with the perl regex matcher on this particular pathological regex. One would need to profile the code to understand what's going on. It is indeed puzzling that the results are practically identical for both machines, which leads me to suspect that there is some sort of external bottleneck (maybe memory allocation for backtracking, no idea, I don't know how perl's regex engine is implemented).
 
I also modified the script to consistently produce output to make sure it actually works (for some reason print doesn't print anything on my machine, you have to save the result to a variable and print that variable for the result to appear).

Anyway, my statement still stands. ... It is indeed puzzling that the results are practically identical for both machines, which leads me to suspect that there is some sort of external bottleneck (maybe memory allocation for backtracking, no idea, I don't know how perl's regex engine is implemented).
Yes. save the value and print. But it works whether the print does anything or not.

The whole point of the exercise is to deliberately force the worst case backtracking for the non-deterministic regex engine.
The puzzling fact that I thought I was pointing out is that they're indeed nearly identical!
 
Yes. save the value and print. But it works whether the print does anything or not.

The whole point of the exercise is to deliberately force the worst case backtracking for the non-deterministic regex engine.
The puzzling fact that I thought I was pointing out is that they're indeed nearly identical!

Oh, absolutely. The only issue I see here is that you are putting this discussion in the context of general-purpose performance of M1 vs Coffee Lake i9, whereas the thread is really about a weird corner case in the perl regex engine. It's not "MBP M1 Max vs. Core i9", it's "MBP M1 Max and Core i9 have same performance matching a pathological regex using perl".
 
Good question! It matches. You can do $k = ($na =~ /...pattern/); print $k to verify. However, I think there's a (weird to me) interaction between list context for print and the result of =~ when there's a (..). Example: perl -e '($a, $b) = "AA" =~ /(A?)(A?)/; print "$a $b"'
Oh, I see. Thank you!

> The puzzling fact that I thought I was pointing out is that they're indeed nearly identical!

Yes, I agree. Most people who replied (including me) jumped on the regexp weirdness, but the point stands: you found something to compute (as pathological as it might be) that takes the same time on both machines, scaling-up included, and that's interesting.

-- Chris
 
It's not "MBP M1 Max vs. Core i9", it's "MBP M1 Max and Core i9 have same performance matching a pathological regex using perl".
Well, it's not just Perl. Let's look at the same regex matching using Python. Again, I give a one-liner that anyone can run from the Terminal. There is a small gap, but again the M1 Max isn't much different from a 4 year-old 2018 Intel Core i9 (i9-8950HK).
python comparison.png
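
Since the attachment doesn't come through as text, here is a minimal Python sketch of this kind of measurement for anyone who wants to try it on their own machines. It is not necessarily the exact one-liner from the screenshot; time_match and the range of n values are my own choices, and the timings you get will depend on your Python version.

```python
import re
import time


def time_match(n):
    """Time one match of (a?){n}a{n} against a string of n 'a's."""
    pattern = re.compile("(a?){%d}a{%d}" % (n, n))
    text = "a" * n
    start = time.perf_counter()
    match = pattern.match(text)
    elapsed = time.perf_counter() - start
    return match is not None, elapsed


if __name__ == "__main__":
    # Keep n modest: run time grows very quickly as n increases.
    for n in range(10, 23, 2):
        matched, seconds = time_match(n)
        print(f"n={n:2d} matched={matched} {seconds:.4f}s")
```

Running the same script on both machines and comparing the times for each n gives a like-for-like comparison of how the two scale, rather than a single data point.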
 
Core i9 and even Core i7 chips that are 14nm or larger have a history of thermal throttling.

So a M1 Max 5nm would be a better option.

I'd only go with an Intel Mac if you are using native Intel apps like Apple Aperture.
 
Here's another comparison of Coffee Lake with M1 (specifically an M1 Max Mac Studio vs. a 2019 i9 iMac) for scientific computing. Here I tested a variety of single-threaded symbolic computation tasks in Mathematica, and found a wide range of differences in performance. For instance, look at the first two rows, which show the results for 35 different symbolic integrations. The M1 averages about 5% faster, with individual differences ranging from 0% to 61%.

There are at least three take-home messages:
1) Obviously, you want to look at a variety of tasks.
2) Performance deltas can vary widely even within the same class of task (e.g., symbolic integration). In these tests, integrations that had about the same run time on the M1 and iMac usually (but not always) were those whose solutions had lots of terms (100's to 1000's) (though this wasn't because of the time it took to display the solutions, since the outputs were suppressed).
3) It's not uncommon to find tasks that show no significant performance difference between the M1 and a late-model Intel Mac.

1666842772203.png
 
Core i9 and even Core i7 chips that are 14nm or larger have a history of thermal throttling.

So a M1 Max 5nm would be a better option.

I'd only go with an Intel Mac if you are using native Intel apps like Apple Aperture.
You're thinking of laptops. Thermal throttling does happen on the 27" Intel i9 iMac, but I've found it's modest.
 
Well, even if the i9 laptop thermally throttles in my test, which it does since the test is sufficiently long, it still isn't slower than the M1 Max.
When I say "better" it isn't limited to the measure of "slower".

Consider

- performance per watt
- power consumption
- battery life
- thermals
 