I checked with a 9600GT and an 8800GT, which are pretty much the same hardware. I don't have an ATI-based Mac to test, though.
For what it's worth, here are the details for my MacBook system:
Platform: Mac OS X x86 (32-bit)
Compiler: GCC 4.0.1 (Apple Inc. build 5490)
Operating System: Mac OS X 10.6 (Build 10A432)
Model: MacBook (Early 2008)
Motherboard: Apple Inc. Mac-F22788A9 PVT
Processor: Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz
Processor ID: GenuineIntel Family 6 Model 23 Stepping 6
Logical Processors: 2
Physical Processors: 1
Processor Frequency: 2.40 GHz
L1 Instruction Cache: 32.0 KB
L1 Data Cache: 32.0 KB
L2 Cache: 3.00 MB
L3 Cache: 0.00 B
Bus Frequency: 800 MHz
Memory: 4.00 GB
Memory Type: 667 MHz DDR2 SDRAM
SIMD: 1
BIOS: Apple Inc. MB41.88Z.00C1.B00.0802091535
Processor Model: Intel Core 2 Duo T8300
Processor Cores: 2
Integer (Score: 2688)
Blowfish single-threaded scalar -- 1669, , 73.3 MB/sec
Blowfish multi-threaded scalar -- 3390, , 139.0 MB/sec
Text Compress single-threaded scalar -- 1819, , 5.82 MB/sec
Text Compress multi-threaded scalar -- 3389, , 11.1 MB/sec
Text Decompress single-threaded scalar -- 1623, , 6.67 MB/sec
Text Decompress multi-threaded scalar -- 3259, , 13.0 MB/sec
Image Compress single-threaded scalar -- 1706, , 14.1 Mpixels/sec
Image Compress multi-threaded scalar -- 3143, , 26.4 Mpixels/sec
Image Decompress single-threaded scalar -- 1496, , 25.1 Mpixels/sec
Image Decompress multi-threaded scalar -- 2865, , 46.7 Mpixels/sec
Lua single-threaded scalar -- 2669, , 1.03 Mnodes/sec
Lua multi-threaded scalar -- 5237, , 2.01 Mnodes/sec
Floating Point (Score: 4862)
Mandelbrot single-threaded scalar -- 1777, , 1.18 Gflops
Mandelbrot multi-threaded scalar -- 3513, , 2.30 Gflops
Dot Product single-threaded scalar -- 3214, , 1.55 Gflops
Dot Product multi-threaded scalar -- 6621, , 3.02 Gflops
Dot Product single-threaded vector -- 2557, , 3.06 Gflops
Dot Product multi-threaded vector -- 5737, , 5.97 Gflops
LU Decomposition single-threaded scalar -- 691, , 615.5 Mflops
LU Decomposition multi-threaded scalar -- 1370, , 1.20 Gflops
Primality Test single-threaded scalar -- 3834, , 572.6 Mflops
Primality Test multi-threaded scalar -- 5821, , 1.08 Gflops
Sharpen Image single-threaded scalar -- 4959, , 11.6 Mpixels/sec
Sharpen Image multi-threaded scalar -- 9668, , 22.3 Mpixels/sec
Blur Image single-threaded scalar -- 6279, , 4.97 Mpixels/sec
Blur Image multi-threaded scalar -- 12029, , 9.46 Mpixels/sec
Memory (Score: 2128)
Read Sequential single-threaded scalar -- 2987, , 3.66 GB/sec
Write Sequential single-threaded scalar -- 1873, , 1.28 GB/sec
Stdlib Allocate single-threaded scalar -- 1948, , 7.27 Mallocs/sec
Stdlib Write single-threaded scalar -- 1851, , 3.83 GB/sec
Stdlib Copy single-threaded scalar -- 1981, , 2.04 GB/sec
Stream (Score: 1738)
Stream Copy single-threaded scalar -- 1704, , 2.33 GB/sec
Stream Copy single-threaded vector -- 1831, , 2.38 GB/sec
Stream Scale single-threaded scalar -- 1771, , 2.30 GB/sec
Stream Scale single-threaded vector -- 1838, , 2.48 GB/sec
Stream Add single-threaded scalar -- 1528, , 2.31 GB/sec
Stream Add single-threaded vector -- 1960, , 2.73 GB/sec
Stream Triad single-threaded scalar -- 1790, , 2.47 GB/sec
Stream Triad single-threaded vector -- 1489, , 2.79 GB/sec
I would say that the issues relate to the video drivers rather than the operating system itself, given the number of people who seem to be happy and have GMA 950, GMA X3100 or ATI-based Macs.