Anyone who believes that incrementing a counter for four seconds is a good real-world performance test simply doesn't know what they're doing.
Oops, a lot of critique. No problem. Right, what can I do? Perhaps something very simple. Simple regarding Rosetta2, which is given a bit too much credit in this thread. Fine, but I hesitate to agree 100%.
Testing the results from 2 different development systems for the Mac.
Results in app1 and app2. Both bump a counter as fast as they can for 4 seconds within reasonable constraints. Similar interrupt masks. Just 1 thread. No GUI interaction. All in a terminal shell. Each app returns the final value of its counter after 4 seconds.
Originally it was used to test code generation for Intel processors, between two similar but different dev systems. So two different code generators. Later it was used again to see the result wrt Rosetta2.
Since you didn't provide source code for this bad benchmark, I wrote my own based on your description.
Higher number is better.
tested on 2020 M1 MBP Ventura 13.2 with Rosetta2
app1 76956
app2 5264808
tested on 2018 i7 Mini Ventura 13.2 (without Rosetta2)
app1 15988000
app2 16352180
Brilliant, quite a result for Rosetta2...
This is the real world, can't help it.
Sure, AS native versions would certainly give us superior results, but that was not what we tested or wanted to see.
Apple never said Rosetta2 was flawless. Certainly don't blame them. It's just that we create code which doesn't go well with Rosetta2.
C:
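#include <stdio.h>
#include <sys/time.h>
#include <signal.h>
#include <stdlib.h>

/* Counter bumped by the main loop and read by the signal handler. */
unsigned long x;

/* Runs when the timer fires (or on Ctrl-C): print the count and quit. */
void signal_handler(int sig) {
    (void)sig;
    printf("%lu increments\n", x);
    exit(0);
}

int main (int argc, char *argv[]) {
    (void)argc;
    (void)argv;
    signal(SIGALRM, &signal_handler);
    signal(SIGINT, &signal_handler);

    /* Deliver SIGALRM after 4 seconds of wall-clock time. */
    struct itimerval fourseconds;
    fourseconds.it_value.tv_sec = 4;
    fourseconds.it_value.tv_usec = 0;
    fourseconds.it_interval = fourseconds.it_value;
    setitimer(ITIMER_REAL, &fourseconds, NULL);

    /* Increment as fast as possible until the handler exits the process. */
    x = 0;
    while (1) x++;
    return 0;
}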
Save this code as 'badbench.c'. Compile with:
clang badbench.c -target arm64-apple-darwin -o badbench.arm64
clang badbench.c -target x86_64-apple-darwin -o badbench.x86_64
If anyone wants to take part, the only prerequisite is that you must have Xcode installed so that clang (Apple's C compiler) is there.
M4 running macOS Sequoia 15.5:
% ./badbench.arm64
16304148129 increments
% ./badbench.x86_64
15657018371 increments
2013 Retina MacBook Pro 15" i7-4850HQ running macOS Big Sur 11.7.10 (last version that can run on a 2013 without OCLP assistance):
% ./badbench.x86_64
2294131307 increments
The M4's native and Rosetta results are within about 4% of each other, which is inside the margin of error from run to run. This is not a surprising result, since a loop that does nothing but increment one integer variable should be an extremely easy thing for Rosetta to translate, with no instruction count expansion.
The old Intel MBP is less than a factor of 2 slower in clock speed (2.3 GHz with a turbo limit of 3.5 GHz, vs M4's ~4.4 GHz), but it runs the benchmark at about 1/7 the speed. I doubt this ratio would stay the same in real programs; I'd expect the i7 to be closer. But this really is a bad artificial benchmark.
It's still not clear to me what your complaint about Rosetta is, and your attempt at criticizing its performance is very suspect as far as I'm concerned. Like @dmccloud says: if you're going to claim well-known things about Rosetta are all wrong, post code we can run to independently verify what you say.