I want everyone to take a look at this article, which compares two otherwise identical systems: one with a 256MB 7800GT and the other with a 512MB 7800GT.
http://www.pureoverclock.com/article33.html
First off, we need to see the difference between the 7800 GT and the 8600M GT. The 7800GT has a core clock relatively similar to the later-released 8600M, but it also has the ever-important 256-bit bus width. (Note that I am not using the 9600M because reliable performance numbers are not available at this time. Also note that Apple UNDERCLOCKED the 8600M, so if they do the same with the 9600M GT you will get even less performance than reported.)
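For anyone who wants to see why bus width matters so much, here is a rough sketch of the usual peak-bandwidth arithmetic (bytes per transfer times transfers per second). The 128-bit figure for the narrower card and the 1000 MT/s memory rate are my own illustrative assumptions, not numbers from the article, and real cards will differ.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_mt_s: float) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8          # bus width in bytes
    return bytes_per_transfer * effective_mt_s * 1e6 / 1e9

# ASSUMPTION: same effective memory rate for both cards, picked only to
# isolate the effect of bus width. Not a measured or published clock.
ASSUMED_RATE_MT_S = 1000.0

wide_bus = memory_bandwidth_gb_s(256, ASSUMED_RATE_MT_S)    # 256-bit, as on the 7800GT
narrow_bus = memory_bandwidth_gb_s(128, ASSUMED_RATE_MT_S)  # ASSUMED 128-bit mobile part

print(f"256-bit: {wide_bus:.1f} GB/s, 128-bit: {narrow_bus:.1f} GB/s")
```

At equal memory clocks, halving the bus width halves the peak bandwidth, no matter how much VRAM sits behind it.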
Here are your results:
And yes, I did leave out the charts that didn't support my point.

And remember, the charts WITH performance gains show them because of the bus width, IMO. This was simply the best I could find to directly compare two identical systems with only the card being swapped out.
Even with those advantages, even WITH the 256-bit bus width, you only see gains at around 2048x1536 with max settings, and the framerate in most games was not playable anyway. To top it off, the article notes that even with a massive 100MB dump to system memory, performance didn't take a noticeable hit until afterward. What does that tell me?
1) Even with the 256-bit interface, clock speed is a major factor when it comes to VRAM utilization.
2) Dumping to system memory isn't going to ruin your numbers until you get above 100MB.
3) The TYPE of memory used (512MB of GDDR2 vs 256MB of GDDR3) has an even bigger effect on the card.
4) In order to see the benefits, your other hardware (CPU, RAM, etc.) must exceed what a notebook can provide (without bursting into flames).
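To put the 2048x1536 figure in perspective, here is a back-of-the-envelope estimate of how much VRAM the render targets alone might take. This is a simplified model of my own (one resolved front buffer, plus a multisampled back buffer and depth buffer, all at 4 bytes per pixel); real drivers allocate differently, and textures come on top of this.

```python
def render_target_mib(width: int, height: int, msaa_samples: int = 1,
                      bytes_per_pixel: int = 4) -> float:
    """Rough render-target footprint in MiB under a simplified buffer model."""
    pixels = width * height
    front = pixels * bytes_per_pixel                  # resolved front buffer
    back = pixels * bytes_per_pixel * msaa_samples    # multisampled back buffer
    depth = pixels * bytes_per_pixel * msaa_samples   # multisampled depth/stencil
    return (front + back + depth) / (1024 * 1024)

# Resolutions chosen for illustration; 1280x800 is a hypothetical notebook panel.
for w, h in [(1280, 800), (2048, 1536)]:
    print(f"{w}x{h}: no AA {render_target_mib(w, h):.1f} MiB, "
          f"4x AA {render_target_mib(w, h, 4):.1f} MiB")
```

Under these assumptions the buffers only get big at the very top resolution with AA cranked up, which fits with the gains showing up only there.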
With that in mind, I can honestly tell you that the extra RAM in the 9600 will do next to nothing, thanks to the bus width and the clock speed. (This is slightly dependent on Apple's habit of underclocking, but I'd venture to say you wouldn't see a boost even if they didn't.) To answer you directly: unless you are looking at a higher-clocked card coupled with a fast CPU and good RAM, the extra memory will give you next to nothing (<1%). It simply shifts the bottleneck.