Thank you for these...I have a question though. It seems like per core, the Intel C2D is much faster than the G5 Quad, why exactly?
There are many, many reasons for this. I'll list a sampling of them, but there are hundreds (likely many thousands) of small ones.
* Manufacturing process: the transistors and wires in the Core 2 Duo are smaller (due to being made on newer manufacturing equipment), allowing reduced power usage and/or increased speed. Intel has also recently made some innovations in the materials used (google "high-k dielectric").
* Load/store reordering: the C2D has a lot of flexibility in what order it loads data from memory, allowing it to reduce the amount it has to wait. It can even do things like start loading some data ahead of a store, then cancel and restart the load if it turns out that the store was to the same address (overwriting the data that would have been loaded).
* The G5's instruction issue limitations: The G5 is very picky about how instructions are grouped, and programs that don't take this into account sometimes end up missing out on a good bit of its speed.
* Bigger caches: another benefit of smaller transistors is just that you can use the extra space freed up by them to add things like gigantic caches. Intel also seems to be able to design caches that are both very fast and large. I don't know how they manage that.
* The G5's rather high memory latency: While the G5 had tons of memory bandwidth, its latency was pretty awful.
* Branch prediction: the C2D has a very advanced branch predictor, including a loop end predictor and assorted other clever ideas.
* The G5's slow integer units: Even the simplest integer operations take two cycles to complete on a G5, making it difficult to schedule some code for optimum throughput (you have to interleave dependent integer math with other operations).
* Memory prefetching: Both chips do pattern recognition in order to predict loads, but the C2D implementation of it is apparently extremely good.
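To make the point about integer latency a bit more concrete, here is a minimal sketch in C of what "interleaving dependent integer math" looks like. The function names are made up for illustration, and a modern compiler may well do this transformation (or vectorize the loop) on its own, so treat it as a picture of the idea rather than a tuning recipe.

```c
#include <stddef.h>
#include <stdint.h>

/* One long dependency chain: every add has to wait for the result of
   the previous add before it can start. */
uint64_t sum_single_chain(const uint32_t *a, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Two independent chains: the adds into s0 and s1 do not depend on each
   other, so a core whose integer ops take more than one cycle can
   overlap them instead of stalling. */
uint64_t sum_two_chains(const uint32_t *a, size_t n)
{
    uint64_t s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)              /* odd element left over */
        s0 += a[i];
    return s0 + s1;
}
```

Both functions return the same value. The second one just gives the scheduler something else to do while an add is still in flight, which matters more on a chip like the G5 than on one where simple integer ops finish in a single cycle.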
Overall, my expectation would be:
* The C2D will absolutely destroy the G5 for tasks like compiling or running JavaScript. These are memory latency sensitive, integer/branch sensitive, and use caches well (strong temporal locality).
* The G5 will compete well on things that emphasize bandwidth over latency, straight-line code over branches, and floating point math over integer. In particular, vector code (AltiVec for the G5, SSE 1-4 for the C2D) should be competitive.
* Most day to day applications either don't use the CPU heavily (common) or fall somewhere in between these extremes, but significantly more towards what the C2D is good at.
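The latency versus bandwidth split above can be sketched with two tiny C loops. The names are hypothetical and this is only an illustration of the two access patterns, not a benchmark.

```c
#include <stddef.h>

/* Latency-bound: the index of each load comes out of the previous load,
   so the next memory access cannot start until the current one returns.
   Linked lists, trees and hash tables tend to behave like this. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t i = start;
    while (steps--)
        i = next[i];
    return i;
}

/* Bandwidth-bound: every address is known in advance, so the prefetcher
   and the out-of-order core can keep many loads in flight at once. */
double stream_sum(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

With a working set much larger than the cache, I'd expect the first pattern to be where the G5's memory latency hurts most, and the second to be where its bandwidth lets it keep up.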
Sadly for the G5, most of the things it's good at happen to also be the things GPUs are good at. The things it's bad at tend to be the day to day tasks of using a computer. Not unexpected for a cut-down server chip, I suppose.