That's just standard boilerplate for a new version of a benchmark. What we're doing, comparing score ratios, is fine. What they're asking people not to do is the thing that should be obvious not to do (but you know people ...), which is directly compare the 14700K score in CB26 with its score in CB24 - like wondering why it scored 10,000 in one and 3,000 in the other. Obviously, with a different baseline and possibly engine changes, the scores are indeed not directly comparable. Comparing the rank or ratio of the 14700K's score to another chip's, like the 14900K, in one benchmark against the same rank or ratio in another benchmark is, again, totally fine, regardless of the relationship between the two benchmarks.
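To make the ratio idea concrete, here's a minimal sketch. The scores are illustrative only: the 14700K numbers echo the 10,000 vs 3,000 example above (which benchmark gets which number is arbitrary here), the 14900K numbers are made up, and the function names `relative_score` and `ratio_shift` are just names picked for this sketch.

```python
# Illustrative scores only, NOT measured results.
# The 14700K values echo the example in the post; the 14900K values are invented.
scores = {
    "CB24": {"14700K": 3000, "14900K": 3300},
    "CB26": {"14700K": 10000, "14900K": 11000},
}

def relative_score(bench, chip, reference):
    """Score of `chip` as a fraction of `reference` within ONE benchmark."""
    return scores[bench][chip] / scores[bench][reference]

def ratio_shift(chip, reference, old="CB24", new="CB26"):
    """Fractional change in the chip-to-reference ratio going from `old` to `new`."""
    return relative_score(new, chip, reference) / relative_score(old, chip, reference) - 1.0

# Raw scores differ by more than 3x between benchmarks, which says nothing useful.
# The within-benchmark ratios are the quantities that can be compared.
print(f"CB24 ratio: {relative_score('CB24', '14700K', '14900K'):.3f}")
print(f"CB26 ratio: {relative_score('CB26', '14700K', '14900K'):.3f}")
print(f"shift in ratio: {ratio_shift('14700K', '14900K'):.1%}")
```

With these made-up numbers the ratio is the same in both benchmarks, so the shift is essentially zero even though the raw scores look wildly different. A nonzero shift is the thing worth asking questions about.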
I did above - at first glance AMD looks similar to Intel (the 9950X3D got basically the same score as the 285K in both benchmarks, so both newer x86 chips gained relative to M4/M5 chips and stayed static relative to M3s). But I haven't delved deeply into AMD's scores to confirm whether older AMD chips catch up to newer ones. It's possible it's compiler differences, it's possible that the way the scene is being rendered has changed (more rays), etc ... The difference between new x86 chips and new Apple Silicon chips isn't huge (<10%), so it's nothing to be terribly bothered by if that were the only difference.
EDIT: while newer AMD chips appear to maintain their relative scores to newer Intel chips between the two benchmarks, at first glance older AMD chips do not appear to catch up to newer ones in CB26 relative to CB24:
AMD Ryzen 9 5950X vs AMD Ryzen 9 9950X3D (www.cpu-monkey.com)
As far as I can tell, no, these are all just user generated benchmarks (well, I'm assuming so for CPU Monkey, but I don't actually know where they get their data) with all the variance that entails. Given how extreme the 14700K differences are, something appears to be off. I would be cautious about assigning fault to the benchmark itself without more data.
The 14700K results are the only ones I've seen that are egregiously different and, as @diamond.g said, we'd need to make sure the original results and the new ones are being run on the exact same systems - that people weren't quoting results for an overclocked model vs a stock/underclocked model, etc ... That's the problem with user generated benchmarks, unless, like 3DMark's website, they let you filter by the full range of specs (and even then it can be dicey, simply because sometimes people benchmark because their system isn't working right). I mean, on the flip side, the 14900K in CPU Monkey gains like 10% on the 285K, but looking at other user generated data from computerbase, the possible range of 14900K results is ginormous.
That's why we would need a reviewer with the exact same system running both sets to make sure the 14700K results we're seeing aren't aberrant.
Overall though, I'd be less bothered by changes of ~10% individually, as heck, even running the same benchmark on the exact same system can yield a few percent difference. So a change in the way the benchmark runs resulting in about 10% differences in relative scores would not in and of itself be that unusual. It's the fact that there are at least 2 such changes that push older x86 CPUs to be more like 20% more performant relative to newer Apple Silicon chips that makes whatever changed a little more ... interesting. But that's just my threshold for curiosity, others may have a different one! However, again, wrt the x86 chips, only your test on the older Intel chip is, as far as I know, being run on the exact same system. So the variation there has to be taken with a big grain of salt. We need more people to upload that kind of data where we know the results from the two benchmarks came from the same system.
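For anyone wondering how two ~10% changes end up as "more like 20%": relative shifts multiply rather than add. A quick sketch, assuming two shifts of exactly 10% each (round numbers taken from the rough figures above, not measurements; `compound_shift` is just a name picked for the sketch):

```python
from math import prod

def compound_shift(shifts):
    """Combine fractional shifts multiplicatively, e.g. [0.10, 0.10] -> ~0.21."""
    return prod(1.0 + s for s in shifts) - 1.0

# Assumed round numbers: newer x86 gains ~10% on newer Apple Silicon,
# and older x86 gains another ~10% on newer x86.
combined = compound_shift([0.10, 0.10])
print(f"{combined:.0%}")  # prints 21%
```

Given that run-to-run noise on a single system is already a few percent, each 10% on its own is easy to shrug off; the stacked ~21% is harder to wave away, assuming the underlying numbers hold up.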