
theorist9

macrumors 601
Original poster
Suppose I open a folder of, say, 50–200 photos in Preview (see screenshot as an example), and wish to quickly down-arrow through the thumbnails to find a specific picture.

I'm currently using a 16" M1 Pro MacBook Pro, and I find it's far too slow to do that efficiently. Specifically, I can comfortably view those at a rate of ≈4/second, but my M1 can't display them much faster than ≈1/second.

I'm wondering what specific piece of hardware (or combination of hardware) is bottlenecking this process. Here are the possibilities that come to mind:

1) GPU.

2) Display engine.

3) CPU.

SSD speed isn't a factor, since when I opened this entire folder (which Finder says is 440 MB), Activity Monitor said that Preview's RAM usage increased from 13 MB to ~400 MB, indicating they were loaded into RAM. And RAM speed certainly shouldn't be a factor, given that each of these pictures is only 2-6 MB.

If it is, for instance, the GPU, that would be useful to know, since it means when I buy my next Mac (likely an M5 Max Studio), that will tell me I'll get better performance for this and similar tasks by getting the 40-core GPU rather than the 32-core. [Whether I decide the difference in price is worth it is another matter.]

1775929630384.png
 
There are two factors that come to my mind:

- Filesystem/SSD — although you mention that the memory is allocated, it does not actually mean that the data has been read. It is likely that images are processed as memory-mapped files, where the virtual memory region is reserved by the system, and the data is read lazily, on demand

- Downsampling — while generating thumbnails is a straightforward operation, it does require the entire image to be read in and downsampled; and if you want high-quality thumbnails you might want to downsample recursively or use more expensive filters. GPUs are very good at image processing of this type, but if you are streaming images from the disk to create thumbnails, the latency will be such that there is just not enough parallelism to make good use of the GPU. Now, there are ways to mitigate it (e.g. with caching or by exploiting the image file format).
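To make the memory-mapping point concrete, here is a minimal Python sketch of the general mechanism. It is only an illustration of lazy, on-demand access through a mapping; whether Preview actually loads images this way is a guess, and the helper names (`make_demo_file`, `read_slice_lazily`) are made up for this example.

```python
import mmap
import os
import tempfile


def make_demo_file(size=8 * 1024 * 1024):
    """Create a throwaway file of `size` bytes that ends in b"TAIL" (hypothetical helper)."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x00" * (size - 4) + b"TAIL")
    return path


def read_slice_lazily(path, offset, length):
    """Map the whole file, but touch only one slice of it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # The mapping reserves address space for the entire file.
            # The OS typically pages data in only for the ranges that
            # are actually accessed, such as the slice below.
            return len(mapped), mapped[offset:offset + length]
```

A process that maps a 440 MB folder this way can show a large memory footprint in a monitoring tool even though most of the bytes have not been read yet. Note that a freshly written demo file will already sit in the OS file cache, so this shows the API behavior and is not a disk benchmark.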
 
I generated a bunch of HEICs and opened them all up in Preview, then attached Instruments to it while I held the down arrow to scroll through the entire list. (I'm guessing this is where you're seeing the lag?) Now, I did do this on my M4 Pro, and it was able to do more than 1 image/s, probably more than 4 even.

With the Instruments run, I see that each image seems like it takes 80ms or so, about 12.5 images per second. Most of that time is spent, in order of time spent:
  • Scaling the image, done by either the GPU or the video decoder, not completely sure. (VideoToolbox has its own ability to scale video.) Also, transforming the image's channels so that it can be displayed, also accelerated.
  • Memory copying
  • Updating views
  • Decoding the image, done by the chip's video decoding engine (HEIC is just a single frame of an HEVC/H.265 video)
So, probably a combination of memory bandwidth, GPU speed, and CPU speed.
And RAM speed certainly shouldn't be a factor, given that each of these pictures is only 2-6 MB.
They are quite a bit larger when decompressed: RGBA (32 bits per pixel) * pixel height * pixel width = total size per image, and that buffer is copied multiple times.
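As a rough worked example of that formula, here is a small Python sketch. The 4032 x 3024 (12 MP) image size and the number of copies are assumptions picked for illustration, not measured values from Preview.

```python
BYTES_PER_PIXEL = 4  # RGBA with 8 bits per channel = 32 bits per pixel


def decoded_size_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one fully decoded bitmap in bytes."""
    return width * height * bytes_per_pixel


def traffic_per_second(width, height, images_per_second, copies):
    """Bytes moved per second if each decoded image is copied `copies` times.

    `copies` is a hypothetical parameter; the real number of copies in
    the display pipeline is not known here.
    """
    return decoded_size_bytes(width, height) * images_per_second * copies
```

For an assumed 12 MP photo, `decoded_size_bytes(4032, 3024)` comes to 48,771,072 bytes (roughly 46.5 MiB), so a 2-6 MB file on disk can be around ten times larger once decoded.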

Edit: additional thought: the M4 may have more advanced scaling and decoding hardware. I know the M4 is the first to have newer vector instructions, which may be useful for scaling. Of course, Apple does not really talk about this much.
 
There are two factors that come to my mind:

- Filesystem/SSD — although you mention that the memory is allocated, it does not actually mean that the data has been read. It is likely that images are processed as memory-mapped files, where the virtual memory region is reserved by the system, and the data is read lazily, on demand.
To test this, I switched to my 2019 iMac, which has a similar issue, and has a RAM disk app configured. The iMac has 128 GB of RAM, and IIRC I allocated 64 GB to the RAM disk--far more than is needed to hold the entire picture folder.

I copied the picture folder over to the RAM disk, and opened it from there. There was no significant difference in speed. I believe that should confirm that SSD access speed isn't the issue.

- Downsampling — while generating thumbnails is a straightforward operation...
Just to clarify, I'm measuring the time it takes to open the main pic (to the right of the sidebar), rather than the thumbnails in the sidebar.
 
On the M1 Pro MBP, I compared a folder of HEIC+MOV files (taken on an iPhone; as you likely know, even when taking a picture, iPhone sometimes creates MOV files instead of stills; the feature is called "Live", and I haven't yet bothered to figure out how to turn it off) with a folder of JPEGs (taken on a compact digital camera).

When opening the HEIC/MOV folder, VTDecoder ran at about 150% CPU, and Preview at about 70%. The pics in the JPEG folder displayed twice as fast, likely because they didn't need VTDecoder to convert them to stills. And, without the VTDecoder bottleneck, Preview was able to operate at about 160%.


And this illustrates the effect of image scaling: On my iMac, I compared display rates of the JPEGs when opening at full screen on the iMac's 5k display vs. at full screen on a WUXGA external monitor (1920 x 1200 ≈ 16% of the pixels that are on the 5k). They displayed ≈50% faster on the WUXGA, with Preview averaging ≈270% CPU, vs. ≈170% on the 5k. So there's something other than Preview's CPU time that is limiting the display speed on that computer.
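For reference, the pixel counts behind this comparison can be checked with a few lines of Python. The resolutions below are the standard native panel sizes and are an assumption about the hardware; macOS may render at a scaled backing resolution, so the effective pixel counts could differ.

```python
# Standard native panel resolutions (width, height) in physical pixels.
PANELS = {
    "5K": (5120, 2880),
    "4K": (3840, 2160),
    "WUXGA": (1920, 1200),
}


def pixel_count(name):
    """Total number of pixels on the named panel."""
    width, height = PANELS[name]
    return width * height


def pixel_ratio(small, large):
    """Fraction of the larger panel's pixels that the smaller panel has."""
    return pixel_count(small) / pixel_count(large)
```

With these numbers, `pixel_ratio("WUXGA", "5K")` is about 0.16 and `pixel_ratio("WUXGA", "4K")` is about 0.28. A drop to roughly one sixth of the pixels giving only a ≈50% speed-up suggests that per-pixel scaling work is not the only cost per image.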
 
Very interesting results, and many things surprised me. In particular, I would not have expected that the display resolution would make such a significant difference. Is that also the case if you connect the WUXGA display to your M1 laptop?

I must say, I don't have a reasonable explanation for what you are reporting. Could be a particularity of an algorithm, deliberate throttling to reduce power consumption, or something else weird. My gut feeling is to resist the notion of a hardware bottleneck (beyond decoding).
 
Yeah, the display effect was interesting. I can't use my 5k iMac as an external (damn you, Apple!). But, while the pixel count differential isn't as great, I was able to compare the M1 Pro MBP connected to a WUXGA external vs. a 4k external.

With that, a time difference to scroll through 235 JPEGs was there, but marginal: 10% faster with the WUXGA (27 s vs. 30 s; I repeated each to ensure the times were consistent). [I chose JPEGs to eliminate heavy decoding as a variable, which it would have been with the HEIC/MOV files.]
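Converting those timings into per-image figures makes them easier to compare with the other numbers in this thread. A minimal sketch, using only the 235-image count and the 27 s / 30 s times quoted above:

```python
def scroll_stats(image_count, seconds):
    """Return (images per second, milliseconds per image) for one scroll run."""
    rate = image_count / seconds
    return rate, 1000.0 / rate


wuxga_rate, wuxga_ms = scroll_stats(235, 27)    # WUXGA external
uhd_rate, uhd_ms = scroll_stats(235, 30)        # 4k external
```

That works out to about 8.7 images/s (≈115 ms each) on the WUXGA and about 7.8 images/s (≈128 ms each) on the 4k, assuming every image was actually drawn during the run.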

However, visually, it seemed more than just 10% faster on the WUXGA. So I'm wondering if it's skipping images (which I've noticed it does on my iMac) when displaying on the 4k. I'd need to create about 50 JPEGs whose images were just consecutive numbers (1, 2, 3...) (one huge-font number/image) to determine this, which is more work than I'm willing to do 😉.
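For anyone who wants to run that skipped-image test, here is a rough Python sketch that generates the numbered images using only the standard library. It is a sketch under assumptions: it writes grayscale PNGs rather than JPEGs (so the decoding cost will not match the JPEG runs), it draws the numbers as blocky seven-segment digits, and all function names are made up for this example.

```python
import struct
import zlib
from pathlib import Path

# Which seven-segment segments (a-g) are lit for each digit.
SEGMENTS = {
    "0": "abcdef", "1": "bc", "2": "abdeg", "3": "abcdg", "4": "bcfg",
    "5": "acdfg", "6": "acdefg", "7": "abc", "8": "abcdefg", "9": "abcdfg",
}

# Segment rectangles (x0, y0, x1, y1) on a 4 x 7 unit grid per digit.
RECTS = {
    "a": (0, 0, 4, 1), "b": (3, 0, 4, 4), "c": (3, 3, 4, 7),
    "d": (0, 6, 4, 7), "e": (0, 3, 1, 7), "f": (0, 0, 1, 4),
    "g": (0, 3, 4, 4),
}


def render_number(n, scale=60):
    """Return rows of 8-bit gray pixels showing n as black digits on white."""
    text = str(n)
    width = (1 + 5 * len(text)) * scale   # 1-unit margin, 5 units per digit
    height = 9 * scale                    # 7-unit digit plus 1-unit margins
    rows = [bytearray([255]) * width for _ in range(height)]
    for i, digit in enumerate(text):
        left = 1 + 5 * i
        for seg in SEGMENTS[digit]:
            x0, y0, x1, y1 = RECTS[seg]
            start, end = (left + x0) * scale, (left + x1) * scale
            for y in range((1 + y0) * scale, (1 + y1) * scale):
                rows[y][start:end] = bytes(end - start)  # zero bytes = black
    return rows


def png_bytes(rows):
    """Encode rows of gray pixels as an 8-bit grayscale PNG."""
    height, width = len(rows), len(rows[0])

    def chunk(tag, data):
        crc = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))


def write_numbered_images(out_dir, count=50, scale=60):
    """Write 001.png .. NNN.png into out_dir and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in range(1, count + 1):
        path = out_dir / f"{n:03d}.png"
        path.write_bytes(png_bytes(render_number(n, scale)))
        paths.append(path)
    return paths
```

Calling `write_numbered_images("numbered", 50)` would create a `numbered` folder with 50 files. Opening that folder in Preview and holding the down arrow (ideally while screen recording) should show whether any numbers are skipped. These images are small and simple, so they may display faster than real photos; a larger `scale` makes them bigger.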
 
The software plays its part too. I tried opening with Preview a folder containing a few hundred images (mixed jpg and raw files) and scrolling was close to hopeless.
Opened the same folder with FastRaw Viewer and everything was instant, no issues with downscaling, etc.
Used M1 Max 16" MBP.
 