The Intel 5000X chipset (the one used by the Mac Pro) has two memory 'branches', each with two channels.
The chipset is capable of operating with only a single module, in what is basically just a diagnostic mode. Normally, you need to install modules in pairs, filling one slot of each channel in a branch. (Apart from the single-module case, you cannot have an odd number of modules.)
If you have a valid configuration with two memory modules in each branch, it will operate in quad-channel mode, even if the memory in each branch is mismatched.
This means that as long as you have two modules on each memory riser of the Mac Pro, it will operate in quad-channel mode. Using both risers means using both branches, so it has to be in quad-channel mode. (Similarly, as long as you have two valid modules in a riser, that riser is operating in dual-channel mode.)
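If it helps to see those rules spelled out, here is a rough Python sketch of them. The function name `channel_mode` and the strings it returns are made up for illustration; it only looks at how many modules sit on each riser and encodes the rules as described in this post, not anything taken from the datasheet directly.

```python
def channel_mode(riser_a: int, riser_b: int) -> str:
    """Return the memory mode implied by the module count on each riser.

    Hypothetical helper: it only encodes the rules described in this post.
    """
    total = riser_a + riser_b
    if total == 0:
        return "invalid: no memory installed"
    if total == 1:
        # The lone-module case, basically a diagnostic mode.
        return "single-module (diagnostic)"
    if riser_a % 2 or riser_b % 2:
        # Beyond one module, every riser must hold whole pairs.
        return "invalid: modules must be installed in pairs"
    if riser_a and riser_b:
        # Both risers (both branches) populated with pairs.
        return "quad-channel"
    # Only one riser populated with pairs.
    return "dual-channel"


if __name__ == "__main__":
    for counts in [(1, 0), (2, 0), (2, 2), (4, 2), (3, 2)]:
        print(counts, "->", channel_mode(*counts))
```

So four modules all on one riser would come out as dual-channel, while the same four split two and two would come out as quad-channel.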
For more information than you can possibly digest, check out Intel's 5000X datasheet; the memory section starts on page 305.
I would suggest putting your existing 512 MB modules all on one riser, and the new modules on the other riser. Then, if you later get more 2 GB modules, divide them up so you have two 2 GB modules and two 512 MB modules on each riser.
The only requirement for matching DIMMs is that each pair has to be matched. You can have mismatched pairs on a single riser, as long as the two modules within each pair match each other, and you can have mismatched pairs on separate risers, as well as mismatched riser totals. As long as each pair of modules matches, it will operate just fine. (There will be a SLIGHT performance hit for having risers of mismatched capacity, or mismatched pairs on a single riser, but the increase from 2 GB to 6 GB total memory will effectively cancel out any such hit. Also, having more than four modules introduces some extra memory delay, but again, moving from 2 GB to 6 GB will be enough of an improvement to offset it.)
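As a quick sanity check on the pairing rule, here is a small Python sketch that walks a proposed layout and flags any pair whose two modules differ. The function name `check_layout` and the data layout (riser name mapped to a list of pairs, sizes in MB) are just my own illustration, and it compares capacity only, which is an assumption; it does not look at speed or any other module property.

```python
def check_layout(risers):
    """Check a proposed layout for mismatched pairs.

    risers: dict mapping a riser name to a list of (size_mb, size_mb) pairs.
    Returns (problems, total_mb). Hypothetical helper that compares
    capacity only.
    """
    problems = []
    total_mb = 0
    for name, pairs in risers.items():
        for i, (a, b) in enumerate(pairs, start=1):
            if a != b:
                problems.append(f"{name} pair {i}: {a} MB vs {b} MB")
            total_mb += a + b
    return problems, total_mb


if __name__ == "__main__":
    # Four 512 MB modules on one riser, two 2 GB modules on the other.
    layout = {
        "riser A": [(512, 512), (512, 512)],
        "riser B": [(2048, 2048)],
    }
    problems, total_mb = check_layout(layout)
    print(problems or "all pairs match", "-", total_mb // 1024, "GB total")
```

The sample layout is the one suggested above: every pair matches even though the riser totals differ (2 GB and 4 GB), and the total comes to 6 GB.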
Edit: Oh, and if you read the Intel datasheet, you will see that the 5000X chipset is just insane in its memory capabilities. It can actually put memory in a RAID-1-style mode, so that the two branches are mirrored, to make your memory ridiculously error-proof. As if ECC wasn't good enough, I guess.