More analogies, anyone?
Imagine going to an amusement park with a stunning roller coaster that is getting rave reviews. You stand in line, get on the coaster, zip through, have fun, and come out exhausted. The park realizes that the coaster gets so overcrowded that it could open a duplicate coaster to take the overflow and keep people pouring in. So it does. More people can be handled at once, yet a single ride still takes just as long. The park can do more because it has a larger "width" (that is, how many whatevers you can cram through whatever it is you are measuring at the same time).
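To put rough numbers on the "width" idea, here is a small sketch. The seat count and ride length are made up for illustration, and `riders_per_hour` is just a name I picked, not anything from a real library.

```python
def riders_per_hour(coasters, seats_per_ride, ride_minutes):
    """Throughput of the park: how many riders get through in an hour."""
    rides_per_hour = 60 / ride_minutes  # each coaster runs back to back
    return coasters * seats_per_ride * rides_per_hour

# Hypothetical park: 24 seats per train, 3 minutes per ride.
RIDE_MINUTES = 3
one_coaster = riders_per_hour(1, 24, RIDE_MINUTES)
two_coasters = riders_per_hour(2, 24, RIDE_MINUTES)

print(one_coaster)   # 480.0 riders per hour
print(two_coasters)  # 960.0 riders per hour
print(RIDE_MINUTES)  # the ride itself is still 3 minutes either way
```

Twice the coasters means twice the riders per hour, but nobody's individual ride got any shorter.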
A dual G4 is quite similar to the amusement park. Now, there are certain things that can't be run simultaneously because of data dependencies and other things like that. Say you want to add five to a number and then multiply the result by 6: you can't do the multiply and the add at the same time, because the computer won't know what to multiply by 6 until the add is done! The "people" in the "roller coaster" (processor) are the equivalent of "threads". Not exactly, but pretty close.
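The add-then-multiply example looks something like this in code. This is only a sketch of the structure: the function names are mine, and Python threads are used just to show which pieces are allowed to overlap, not to measure any real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def add_five(x):
    return x + 5

def times_six(x):
    return x * 6

def dependent_chain(x):
    # The multiply needs the result of the add, so these two steps
    # have to happen one after the other no matter how many CPUs you have.
    return times_six(add_five(x))

def independent_chains(values):
    # Separate numbers don't depend on each other, so each chain can be
    # handed to a different worker (think: a different roller coaster).
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(dependent_chain, values))

print(dependent_chain(1))                # 36
print(independent_chains([1, 2, 3, 4]))  # [36, 42, 48, 54]
```

Inside one chain the order is forced; across chains there is nothing stopping two processors from working at once.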
Now, to get the full picture, add a guy at the front of the queue lines who sends you to one line or the other to keep the lines as close to the same length as possible. This keeps the roller coasters running with closer to equal-sized loads, so that nobody has to wait as long for a coaster to be ready again.
That's what MacOS X does. And, from what I understand, it does a very good job of it. This allows a dual processor system to get much higher performance than a single processor system in certain areas (in some areas equal to almost 150% - sorry about the "200%", that was a typo! And that comment was not meant to apply to all areas... sorry about that!). You won't get a 2GHz system from having two 1GHz processors; you'll get something better: higher FLOPS, which is the REAL indication of power. (My Athlon 1.2, running a no-hands-held number crunching routine I wrote in Linux, came up with... 400 MEGAflops! Now, the 11.8 Gigaflops advertised by Apple is not with MacOS X running, but I imagine it is still much better than my Athlon's rating.) But that in and of itself is another thread.
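For the curious, the guy at the front of the queue lines can be sketched in a few lines. This is a toy "send to the shortest line" rule that I made up to match the analogy; I am not claiming it is how the MacOS X scheduler is actually written, and `dispatch` is a hypothetical name.

```python
def dispatch(riders, num_coasters=2):
    """Send each arriving rider to whichever line is currently shortest."""
    queues = [[] for _ in range(num_coasters)]
    for rider in riders:
        # min() picks the first line on a tie, which is fine for a toy example.
        shortest = min(queues, key=len)
        shortest.append(rider)
    return queues

lines = dispatch(["r1", "r2", "r3", "r4", "r5"])
print([len(q) for q in lines])  # [3, 2]
```

With five riders and two coasters the lines end up at 3 and 2, which is as even as they can get.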
(Windows NT (and XP, which is just NT 5.1 with an ugly face and a name change on the outside) fills up one processor to full capacity and then goes on to the next processor, which really hurts dual processor systems that foolishly choose to use Windows instead of *nix.)
Anyway, I hope my rant was helpful on the performance issue.