
ArkSingularity

macrumors 6502a
Original poster
Mar 5, 2022
This is very much one of those rabbit trails driven more by sheer curiosity than actual utility, but I figured I'd check out QEMU and run a quick benchmark for anyone who was curious. Of course, the performance is not great (it's quite literally emulating an x86 architecture in real time, which is not analogous to Rosetta's ahead-of-time translation). It's far from what I'd call even remotely usable for anyone who needs to run anything serious, but I wanted to see exactly HOW slow QEMU was at emulating x86 on the unbinned M2 Pro.

I ran Geekbench 5 on Ubuntu with 4GB of RAM, 3GB JIT cache and 12 cores. The results were:

- 112 single core
- 805 multicore
Full results on geekbench browser

(For comparison, Geekbench 5 typically gets around 1950 single core and 15,000 multicore on bare metal. I would have run Geekbench 6 instead, but Geekbench 5 already took around half an hour in the VM, so Geekbench 6 would likely have taken several times longer.)
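To put the gap in perspective, here is a quick back-of-the-envelope sketch using only the scores quoted above. The bare-metal figures are the rough "typical" numbers from this post, not something measured in this test, so treat the output as a ballpark.

```python
# Geekbench 5 scores from the emulated x86 VM in this post.
EMULATED = {"single": 112, "multi": 805}
# Approximate bare-metal scores quoted above (assumed typical, not measured here).
NATIVE = {"single": 1950, "multi": 15000}


def slowdown(native: float, emulated: float) -> float:
    """How many times slower the emulated score is than the native one."""
    return native / emulated


for kind in ("single", "multi"):
    factor = slowdown(NATIVE[kind], EMULATED[kind])
    percent = 100 * EMULATED[kind] / NATIVE[kind]
    print(f"{kind}: {factor:.1f}x slower ({percent:.1f}% of native)")
```

With these inputs the emulated VM lands somewhere around 17x to 19x slower than bare metal, or a bit over 5% of native performance.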

Much to my surprise, I was actually able to play YouTube in a browser as well, with full software decoding and only a handful of dropped frames (granted, I didn't test past 360p; maybe I'll do a 720p test and see how much of a portable oven this thing turns into).

Is this a viable solution for anybody? Absolutely not (no surprise). But am I impressed that this thing managed to complete the benchmark at all (much less play YouTube with software decoding)? Actually, I gotta give QEMU some credit on this one. For what it has to do to emulate an entirely different architecture in real time, it really didn't do as badly as I expected.
 

I paid $200 for a 2015 iMac recently and the GB5 multicore is about 2,900 on it. It's a much cheaper way to run x86 if you need it. It came with 32 GB of RAM and Apple keyboard and mouse too.

I run a Windows 11 ARM UTM VM on my M1 Pro MacBook Pro and performance for x86 programs is decent, I think similar to an i7-10700. Of course I think that not all programs will run under Microsoft's version of Rosetta 2.

I'd love it if QEMU did a better job but emulation is just a lot slower than translation.
 
Oh, of course. I would never use it over real x86 hardware. This was much more of a curiosity-driven experiment to see how it stacks up against real x86 hardware (the answer is that it performs about like a Pentium 3 in single core, and about like a Core 2 Quad in multicore).
 
My favorite gimmick solution is similar, use a ‘09 iMac with target display mode, and remote into it with screen sharing.

I’ve also been experimenting with UTM running Mac OS 9 and XP. It’s decent, but for some reason I can’t get it to recognize .iso files as discs on XP (it works on OS 9, for some bizarre reason).
 

How much real physical RAM is this virtual machine running inside of? Why so many virtual cores for just 4GB?
If the host starts paging any of the virtual machine infrastructure, you are no longer directly measuring just the virtual machine's performance. An 'old school' x86 system with just 4GB of RAM probably didn't have 12 cores back in the day. 4-6 years ago, mainstream Intel systems with 4GB would mostly have had just 4 cores.


P.S. from the UTM notes. The JIT cache is a little odd also.

" ...

JIT Cache​

This applies only for emulation. The JIT cache allows for translated code to be cached for faster execution. It is analogous to L2 cache in hardware. Although a larger cache size is usually better, there is diminishing returns when it is too large. The default value is 1/4 of the memory size configured above. The JIT cache is allocated separately from the guest memory and the size configured above does not include the JIT cache size.

..."

3GB is 75% of 4GB, far above the 25% default. That is cranked so high it is almost trying to subsume the whole guest memory allocation into the cache.
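For reference, the arithmetic behind that comparison can be sketched as follows. The 1/4 rule is taken from the UTM note quoted above; the function name is made up for illustration and is not UTM's actual code.

```python
def default_jit_cache_mb(guest_ram_mb: int) -> int:
    """Default JIT cache size per the quoted UTM note: 1/4 of guest memory.

    Hypothetical helper for illustration, not a UTM or QEMU API.
    """
    return guest_ram_mb // 4


guest_ram_mb = 4 * 1024        # 4GB guest, as described in the first post
configured_jit_mb = 3 * 1024   # 3GB JIT cache, as described in the first post

default_mb = default_jit_cache_mb(guest_ram_mb)
print(f"default JIT cache: {default_mb}MB")
print(f"configured JIT cache: {configured_jit_mb}MB "
      f"({configured_jit_mb / default_mb:.0f}x the default, "
      f"{100 * configured_jit_mb / guest_ram_mb:.0f}% of guest RAM)")
```

So the configured cache is three times what the default would have been for a 4GB guest.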
 
It's running on a 16GB system. I actually assigned closer to 4.25GB, and had to make sure I had enough spare memory for a generous JIT cache for QEMU as well. I was very careful to ensure that it wasn't paging out or having its performance skewed by memory compression (I monitored page-ins and page-outs during the test as well). Everything else on the host was closed to keep CPU and RAM usage by host applications to a minimum.

4GB of RAM is much more than an old school system would have had, but given that it was running a modern Ubuntu x86_64 OS and a full Gnome 3 desktop, I wanted to make sure it had enough to run the test. RAM usage peaked at around 2GB within the VM, so there was plenty of memory to spare.

As for why it was run with 12 cores, I wanted to get a sense of the absolute full potential of the M2 Pro when emulating x86 inside of QEMU. I actually ran another test with just 8 cores as well, and it scored around 700 multicore (single core scores were the same). This whole test was much more an experiment to satisfy curiosity; unsurprisingly, it isn't really viable for any serious use case.
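As a rough sketch of how the multicore score scaled with virtual core count, using only the approximate scores mentioned in this thread (the helper names are just for illustration):

```python
SINGLE_CORE = 112                 # single core score, same in both runs
MULTI_CORE = {8: 700, 12: 805}    # vCPU count -> approximate multicore score


def speedup(multi: float, single: float) -> float:
    """Multicore score expressed as a multiple of the single core score."""
    return multi / single


def efficiency(multi: float, single: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved across `cores` vCPUs."""
    return speedup(multi, single) / cores


for cores, score in MULTI_CORE.items():
    print(f"{cores} vCPUs: {speedup(score, SINGLE_CORE):.2f}x single core, "
          f"{100 * efficiency(score, SINGLE_CORE, cores):.0f}% scaling efficiency")
```

Going from 8 to 12 vCPUs is 50% more cores for about 15% more score. One plausible factor (an assumption, not something measured here) is that the 12-core M2 Pro has 8 performance cores and 4 efficiency cores, so the last 4 vCPUs would land on slower cores.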

Believe it or not, it was not completely unusable either (much to my surprise; I would have considered it a success even if all it had managed to do was boot in under half an hour). Would it be pleasant to use for anything serious? Absolutely not (even opening LibreOffice took several seconds longer than usual, and YouTube at 360p ran the SoC up to 102°C). But it exceeded my expectations and was a fun little experiment.

(Edit: I also ran this test on a base model 13" M1 MBP (8GB of RAM) and the performance numbers were considerably worse. It scored around 71 in single core and 270 in multicore with 3GB of RAM, 1GB JIT cache, and 4 cores. This was likely skewed by running it on a host system that only had 8GB of RAM, so I don't think it was quite as fair a benchmark as the M2 Pro run.)
 
Responding to your edit. The JIT cache is in addition to the memory assigned to the VM and, the way I configured it, does not take away from the 4GB that was assigned (the total JIT + RAM combined was about 7.25GB). It was most certainly excessive for the use case I was testing, but I had the memory to spare so I figured I might as well (I specifically wanted to test CPU performance and wasn't as concerned about RAM, since I was careful to ensure that the system wasn't paging in/out during the test).
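The memory budget described above can be sketched like this. It assumes the JIT cache is simply added on top of guest RAM (as the UTM note says) and ignores any extra overhead from QEMU, UTM, or macOS itself, so the real headroom would be somewhat smaller. The M1 row also assumes its 1GB JIT cache was allocated the same way.

```python
def footprint_gb(guest_ram_gb: float, jit_cache_gb: float) -> float:
    """Guest RAM plus JIT cache, which is allocated separately from guest RAM."""
    return guest_ram_gb + jit_cache_gb


# (host RAM, guest RAM, JIT cache) in GB, taken from the posts in this thread.
RUNS = {
    "M2 Pro, 16GB host": (16.0, 4.25, 3.0),
    "M1, 8GB host": (8.0, 3.0, 1.0),
}

for name, (host_gb, guest_gb, jit_gb) in RUNS.items():
    total = footprint_gb(guest_gb, jit_gb)
    print(f"{name}: {total:.2f}GB for the VM, "
          f"{host_gb - total:.2f}GB left for the host "
          f"({100 * total / host_gb:.0f}% of host RAM)")
```

On these numbers the M2 Pro run leaves about 8.75GB for the host, while the M1 run leaves about 4GB, which fits the suspicion that the 8GB machine was the more memory-constrained test.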

I might run some more tests down the road and share just to see how it performs on different cache sizes.
 