macOS very much does this on purpose. Part of it comes down to the page cache and similar mechanisms, which hold a lot more than just speculatively cached files from the filesystem that might be needed in the future. The page cache (not sure if that's the right technical term on macOS) also holds pages containing code from running binaries, libraries in use, and all sorts of other data that executing applications actively need.
When you, say, load up Chrome, the OS isn't going to just straight up load the entire Chromium binary into RAM and execute it. Instead, it loads a tiny portion to bootstrap execution, then loads additional pages as they are actually needed. This avoids wasting a ton of RAM on pages containing code that might not need to run right now. (Technically, when it does load pages, it loads several at a time to reduce how often it has to fetch, so some pages are still loaded in a read-ahead manner, but it's not the entire binary.)
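To make the "load pages only as needed" idea concrete, here's a minimal sketch using Python's `mmap` module. It maps a file and touches a single page of it, leaving it to the OS to decide what actually gets read in. The helper names (`make_demo_file`, `read_one_page`) and the demo file are made up for illustration; this is not how the macOS loader itself is implemented, just the same general mechanism seen from user space.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # page size reported by the OS


def make_demo_file(num_pages):
    """Create a temp file where page i is filled with the byte value i % 256."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * PAGE)
    return path


def read_one_page(path, page_index):
    """Map the whole file, but only touch one page of it.

    Creating the mapping does not copy the file into RAM. The OS faults
    pages in when they are first accessed (possibly with some read-ahead).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = page_index * PAGE
            return m[start:start + PAGE]
```

Calling `read_one_page(path, 3)` on an 8-page demo file returns just the fourth page's bytes, without the program ever asking for the other seven.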
(Some of this, by the way, very much gets loaded into memory that isn't always labeled as cached in Activity Monitor. I'm unsure why macOS does this differently than Windows or Linux, but this does appear to be the case from the testing I've performed.)
Anyway, what macOS does here is really smart: it will keep these pages in memory a lot longer than Windows or Linux might, and is generally much more hesitant to purge them before it actually needs to. The reason macOS does this (as I found out recently) is that it's actually quite a lot faster to compress these pages and decompress them when they're needed than it is to straight up purge them and refetch them from disk.
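As a rough illustration of the compress-instead-of-purge idea, here's a small Python sketch that compresses a page-sized buffer and restores it entirely in memory. Big caveats: `zlib` is only a stand-in and is not the compressor macOS uses, the 4096-byte page size is an assumption for the example, and the sample data is deliberately repetitive so it compresses well. Real pages will vary.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size, for illustration only


def make_sample_page():
    """Build a repetitive, page-sized buffer (hypothetical data)."""
    pattern = b"struct { int id; char name[28]; } "
    return (pattern * (PAGE_SIZE // len(pattern) + 1))[:PAGE_SIZE]


def compress_page(page):
    """Shrink a page in memory instead of throwing it away."""
    return zlib.compress(page, 1)  # level 1: favor speed over ratio


def restore_page(blob):
    """Bring the page back without touching the disk."""
    return zlib.decompress(blob)
```

The point is that `restore_page(compress_page(page))` gives back the original bytes using only CPU and RAM, whereas a purged page has to be read from storage again.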
This is why macOS memory usage tends to balloon on machines with more RAM. A lot of it is this sort of memory that macOS can take advantage of to reduce disk IO for running applications. The extra RAM does help performance, but with diminishing returns. Just because a workload uses 13GB on a 16GB system doesn't mean it won't run just fine on an 8GB system too.
I actually ran a test recently to see how far this could be pushed on systems with less than 8GB. It isn't a perfect test (I had to use a VM, and I also ran decidedly light workloads, namely around 5 Chrome tabs with a handful of other light applications like the App Store and Calendar open at once), but the results were better than expected.