OpenCL Benchmarks and Support for Both MacBook Pro GPUs




MacRumors
Aug 30, 2009, 01:20 PM
With the release of Mac OS X Snow Leopard, we're getting our first look at the possibilities behind Snow Leopard's OpenCL technology. OpenCL (http://en.wikipedia.org/wiki/OpenCL) is a framework that allows applications to more easily harness the power of multiple GPUs and CPUs found in your computer. This would allow powerful graphics cards (GPUs) to do more general processing and could improve application performance substantially.

Unfortunately, there have been few apps that have been released that properly demonstrate the potential of OpenCL. Forum user J the Ninja (http://forums.macrumors.com/showthread.php?t=775848), however, points to a recently released OpenCL Benchmark application (http://www.macupdate.com/info.php/id/32266/opencl-benchmark) that tests the speed of the various OpenCL-capable devices in your Mac. This includes both CPUs and GPUs. The current list of OpenCL-supported GPUs includes:

- NVIDIA GeForce 9400M, GeForce 9600M GT, GeForce 8600M GT, GeForce GT 120, GeForce GT 130, GeForce GTX 285, GeForce 8800 GT, GeForce 8800 GS, Quadro FX 4800, Quadro FX 5600
- ATI Radeon 4850, Radeon 4870

The benchmark runs on each device, showing the relative performance. Most interesting is that for owners of high-end MacBook Pros, which contain both 9400M and 9600M GT graphics processors, both GPUs can be used at any time by OpenCL. In contrast, the two GPUs cannot be used together (http://www.macrumors.com/2008/10/16/macbook-pro-does-not-support-both-gpus-simultaneously/) for general graphics processing, and switching from one to the other requires a Mac OS X logout.

In this particular example, the benchmark performance of the user's MacBook Pro CPU and two GPUs was as follows (smaller numbers are faster):

GeForce 9600M GT: 2.805 seconds
GeForce 9400M: 3.081 seconds
Intel Core 2 Duo @ 2.40GHz: 15.459 seconds

Combining all three processors at once could theoretically deliver substantial performance improvements to the right application.
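
As a rough illustration of that theoretical gain, the sketch below estimates the best-case time if all devices shared one job perfectly. It assumes zero scheduling and data-transfer overhead, which no real system achieves, and the function name `ideal_combined_seconds` is hypothetical rather than part of the benchmark or of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Best-case time when several devices split one job perfectly.
 * A device that needs times[i] seconds alone works at a rate of
 * 1/times[i] jobs per second. Rates add up, and the combined time
 * is the inverse of the summed rate.
 * Assumption: no scheduling or data-transfer overhead at all. */
double ideal_combined_seconds(const double *times, size_t count)
{
    double rate = 0.0;

    assert(times != NULL && count > 0);
    for (size_t i = 0; i < count; i++) {
        assert(times[i] > 0.0);
        rate += 1.0 / times[i];
    }
    return 1.0 / rate;
}
```

Fed the three times listed above (2.805, 3.081 and 15.459 seconds), this gives roughly 1.34 seconds, about twice as fast as the 9600M GT on its own.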

Finally, another floating point benchmark application called AO Bench (http://lucille.atso-net.jp/aobench/) has also been ported (http://kioku.sys-k.net/archives/2009/08/opencl_ao_bench.html) to OpenCL and can also show the difference between CPU and GPU performance in some configurations.

Article Link: OpenCL Benchmarks and Support for Both MacBook Pro GPUs (http://www.macrumors.com/2009/08/30/opencl-benchmarks-and-support-for-both-macbook-pro-gpus/)



Creibold
Aug 30, 2009, 01:25 PM
My Results, for comparison:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.929 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7800 @ 2.60GHz
Device 1 is an: CPU with max. 2600 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.840 seconds

flopticalcube
Aug 30, 2009, 01:29 PM
Booo! More ATI GPUs please. Yes, I'm talking to you, AMD.

PurrBall
Aug 30, 2009, 01:32 PM
Desktop hardware seems to get about 1 second difference vs. MBPs:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GT 120
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.034 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU E8335 @ 2.93GHz
Device 1 is an: CPU with max. 2930 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.820 seconds

markm49uk
Aug 30, 2009, 01:35 PM
Booo! More ATI GPUs please. Yes, I'm talking to you, AMD.

I agree - hopefully support for my iMac 24" ATI Radeon HD2600 is coming ?

2002cbr600f4i
Aug 30, 2009, 01:36 PM
OK, this was just run on my 2009 Mac Pro, 2.93 GHz Quad w/HT ON + 4870 GPU:

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.244 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3540 @ 2.93GHz
Device 1 is an: CPU with max. 2925 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.834 seconds

And from my old Macbook 2,1:

OpenCL Device # 0 = Intel(R) Core(TM)2 CPU T7400 @ 2.16 GHz
Device 0 is an: CPU with max. 2160 MHz and 2 units/cores
Now computing - please be patient....
time used: 17.149 seconds


Interesting that the CPU beats the 4870 on the Pro!

I also tried the AOBench one.... Unfortunately when I try to run it against the GPU, it gives the following error:



CL_DEVICE_NAME: Radeon HD 4870
CL_DEVICE_VENDOR: AMD
Error: Failed to build program executable
cvmsErrorCompilerFailure: LLVM compiler has failed to compile a function.
logout


On the plus side, they include the source for the AOBench stuff, and the Xcode project, so you can fiddle with it and see how OpenCL code is written.

THX1139
Aug 30, 2009, 01:36 PM
Does this mean that if I flash a Radeon 4870 to run in my 2006 Mac Pro, it will have OpenCL? If anyone has done this, please run a test to see if it works.

freiheit
Aug 30, 2009, 01:37 PM
Booo! More ATI GPUs please. Yes, I'm talking to you, AMD.

You need to be talking to Apple, not AMD. The Radeon HD 2000 series had the hardware necessary for this kind of stuff. As did the 3000 series. Apple have decided not to support them for one reason or another.

Shawn Parr
Aug 30, 2009, 01:39 PM
My 2008 unibody MBP with 2 GPUs only uses one at a time it appears:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400M
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 3.497 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
Device 1 is an: CPU with max. 2530 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.734 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


I wonder if this is like the 6GB RAM limit on these machines? The 9400M is the one I happen to be switched to using for graphics. When I first heard about OpenCL I always kind of assumed it would be more likely to use the one you weren't using for graphics at the time. Silly assumption in hindsight.

2002cbr600f4i
Aug 30, 2009, 01:43 PM
You need to be talking to Apple, not AMD. The Radeon HD 2000 series had the hardware necessary for this kind of stuff. As did the 3000 series. Apple have decided not to support them for one reason or another.

If you look on the compatibility page on ATI's site, you'll see in the footnotes that the 2600 series does not support double precision floating point operations...

Now if you're Apple, and you're encouraging people to use this technology, are you going to potentially support something that isn't going to give exactly the same results for calculations that you'd get from the CPU? - NO.

For example:
(dp FP vs sp FP)
4.546677E10 != 4.566E10

If you're doing scientific calculations, and you're expecting double precision and the app is giving you back single precision because you ran it on the GPU instead of the CPU, you're going to be pissed.
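
To make the single vs. double precision point concrete, here is a minimal sketch that adds the same small value many times, once with a float accumulator and once with a double accumulator. The function names `sum_single` and `sum_double`, the step of 0.1 and the count of ten million are all assumptions chosen for illustration; none of them come from the benchmark.

```c
#include <assert.h>

/* Add `step` to a single-precision accumulator n times.
 * The volatile qualifier forces each partial sum to be rounded
 * to float, as it would be on hardware without double support. */
float sum_single(float step, long n)
{
    volatile float acc = 0.0f;

    assert(n >= 0);
    for (long i = 0; i < n; i++)
        acc = acc + step;
    return acc;
}

/* Same loop, but the running total is kept in double precision. */
double sum_double(float step, long n)
{
    double acc = 0.0;

    assert(n >= 0);
    for (long i = 0; i < n; i++)
        acc += (double)step;
    return acc;
}
```

Calling both with `0.1f` and `10000000L` should give a double result very close to 1,000,000, while the float result lands visibly far away from it, because every addition is rounded to the nearest value a float can hold.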

doctoree
Aug 30, 2009, 01:44 PM
Wow, this is incredible.
This means small, inexpensive Laptops with SL installed can suddenly beat big,fat MacPros without SL.

AidenShaw
Aug 30, 2009, 01:44 PM
Unfortunately, there have been few apps that have been released that properly demonstrate the potential of OpenCL.

OpenCL will only become interesting when you can actually use it in shipping applications - synthetic benchmarks that show "the potential" of OpenCL are just a tease.

They'll come, but it will take time for vendors to recode their apps to use OpenCL.

It also doesn't help that Apple didn't support OpenCL on more GPUs - with the relatively few OpenCL capable systems, there's less incentive for vendors to port.


...the 2600 series does not support double precision floating point operations...

If you're doing scientific calculations, and you're expecting double precision and the app is giving you back single precision because you ran it on the GPU instead of the CPU, you're going to be pissed.

Also note that some earlier GPUs didn't do full IEEE floating point support (support for NaNs, ±Infinity) , and didn't do accurate floating point. (A small rounding error would not be noticeable on a pixel in a display, but would kill a compute job.)

2002cbr600f4i
Aug 30, 2009, 01:45 PM
My 2008 unibody MBP with 2 GPUs only uses one at a time it appears:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400M
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 3.497 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
Device 1 is an: CPU with max. 2530 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.734 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


I wonder if this is like the 6GB RAM limit on these machines? The 9400M is the one I happen to be switched to using for graphics. When I first heard about OpenCL I always kind of assumed it would be more likely to use the one you weren't using for graphics at the time. Silly assumption in hindsight.

Did you not even bother to read the article? It stated clearly that only 1 GPU is active at a time. With the unibody MBP's only 1 GPU can be accessed at a time, and you have to log out to switch between them. I don't get why people think that wouldn't be the case with OpenCL. The OS only sees 1 of the GPUs at a time.

jlasoon
Aug 30, 2009, 01:45 PM
2009 MacPro 2.26

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GT 120
Device 0 is an: GPU with max. 1400 MHz and 32 units/cores
Now computing - please be patient....
time used: 1.589 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
Device 1 is an: CPU with max. 2260 MHz and 16 units/cores
Now computing - please be patient....
time used: 1.161 seconds

TheIguana
Aug 30, 2009, 01:46 PM
My 2008 unibody MBP with 2 GPUs only uses one at a time it appears:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.791 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 3.009 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.261 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

I just ran that with my 2008 unibody MBP. Granted I was running with 9600m and not the 9400m, but in that mode it did run the benchmark on both GPUs.

Update: I just switched over to the 9400M on my 2008 MBP and tried it again. As you can see, it only ran the test on the 9400M and not the 9600M. So it looks like OpenCL only works with both GPUs on the 2008 models when you are running the 9600M GPU.
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400M
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 3.555 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.620 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

HLdan
Aug 30, 2009, 01:46 PM
I agree - hopefully support for my iMac 24" ATI Radeon HD2600 is coming ?

Agreed, thank you. My iMac isn't that old to not be supported by OpenCL.

Aaargh!
Aug 30, 2009, 01:49 PM
These results are totally useless without knowing what exactly it is that is benchmarked.

damieng
Aug 30, 2009, 01:49 PM
This is interesting - my 2007 MacBook Pro 17" 2.6GHz (MacBookPro3,1) produces the following results:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.368 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7800 @ 2.60GHz
Device 1 is an: CPU with max. 2600 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.080 seconds

Which seems to be better performance than some of the newer similar-spec models. Perhaps caused by the 32 unit/cores on the 8600 vs the 9400?

[)amien

kresh
Aug 30, 2009, 01:51 PM
Wow, this is incredible.
This means small, inexpensive Laptops with SL installed can suddenly beat big,fat MacPros without SL.

The most intensive task that I do is video conversion and dvd ripping. It appears that a Mac Mini w/9400 will soon outperform my iMac w/2600HD on these tasks. I am seriously considering putting my iMac on eBay and building a Hackintosh (that I can upgrade) plus a quality monitor, and as a side benefit put some cash back into my pocket.

I used to support the all-in-ones and poo-poo the naysayers when they complained about not being able to upgrade video cards. Now that this has bit me on my butt I see their point. I will never purchase a desktop all-in-one again.

THX1139
Aug 30, 2009, 01:53 PM
Wow, this is incredible.
This means small, inexpensive Laptops with SL installed can suddenly beat big,fat MacPros without SL.

Well, not small, inexpensive laptops... (do those even exist?), but my Macbook Pro 2.53 with Snow Leopard renders Final Cut Pro footage faster than my 2006 Mac Pro 2.66 that is running on 10.5.8. Of course, if I add Snow Leopard to my Mac Pro, it should even out or be faster. But yeah, this is a good time to have a laptop- it's going to be even better when the apps are coded to optimize Snow Leopard. I never thought I'd see the day a laptop would run faster than my desktop, but it looks like it's here. I might be able to dump the ol' ball and chain pretty soon. :cool:

mmendoza27
Aug 30, 2009, 01:53 PM
2009 MacPro 2.26

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GT 120
Device 0 is an: GPU with max. 1400 MHz and 32 units/cores
Now computing - please be patient....
time used: 4.113 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
Device 1 is an: CPU with max. 2260 MHz and 16 units/cores
Now computing - please be patient....
time used: 1.140 seconds

Now that doesn't make too much sense.... the 9600M GT has a better time than the GT120? Is that true? Well I guess it's a rebranded 9xxx series right? So maybe... anyways, does anyone have a ATI 4870 to test?

AND LOOOK at the score of the 8-core with hyper-threading! Looks like it bests everything so far!

Prynce
Aug 30, 2009, 01:58 PM
Agreed, thank you. My iMac isn't that old to not be supported by OpenCL.

src: http://www.macupdate.com/info.php/id/32266/opencl-benchmark

OpenCL GPUs : Macs 2008+ with NVIDIA GPUs (Macbook /MBP/iMac GT120/MacPro)... Macs 2008+ with ATI GPUs - support in test phase

While this doesn't indicate that Apple will support OpenCL on ATI 2600 Pro, you will note that it is possible as per the last part of the note above of support for Macs 2008+ being in a test phase.

Sappharad
Aug 30, 2009, 02:00 PM
These results are totally useless without knowing what exactly it is that is benchmarked.
He includes the source code to his GPU calculation in the readme file.
It appears to just be testing floating point addition.

These results make me want to buy an Nvidia card for my 2008 Mac Pro.

PinkyMacGodess
Aug 30, 2009, 02:00 PM
Waiting for the 9th and ordering my new iMac...

anubis
Aug 30, 2009, 02:00 PM
This app confirms OpenCL support with the 2008 24" iMacs with "GeForce 8800GS". There has been a raging debate over the naming convention of this particular iMac graphic card, fueled by the fact that Apple lists a graphics card similar to what this iMac has in its SL requirements, but Apple has never shipped that graphic card in any Mac. Complicating the debate is the fact that apparently there are 3 or 4 different names for essentially the same graphic card.

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GS
Device 0 is an: GPU with max. 1250 MHz and 64 units/cores
Now computing - please be patient....
time used: 2.302 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU E8435 @ 3.06GHz
Device 1 is an: CPU with max. 3060 MHz and 2 units/cores
Now computing - please be patient....
time used: 12.132 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

flottenheimer
Aug 30, 2009, 02:01 PM
The day OpenCL meets Photoshop + the rest of the Adobe Creative Suite will be a very happy day indeed.

2002cbr600f4i
Aug 30, 2009, 02:04 PM
AND LOOOK at the score of the 8-core with hyper-threading! Looks like it bests everything so far!

Yup... Now think about that for a second... OpenCL will use ALL available devices if you tell it to do a set of calculations....

So, one of the 2009 Pro's along with a GT120 is just about a 2x improvement over just the CPU alone. Not bad!

I'm wondering if the # of cores, and the clockspeed of the GPUs is heavily influencing this benchmark. I think we all would agree that for gaming, the Radeon 4870 blows away a GT120. However, if you look at the specs that the benchmark is reporting:

GT120 - 32 cores @ 1275 MHz
R4870 - 4 cores @ 750 MHz

And the fact that the GT120 beats the radeon, performing the test in < 1/2 the time, you have to wonder what's going on.

I suspect that the kinds of calculations this benchmark is doing aren't that strenuous, and also probably don't load up the cache or memory or tax the memory bandwidth on the GPUs.

Shake 'n' Bake
Aug 30, 2009, 02:06 PM
Would it be possible to use Open CL on the Intel GMA-equipped Macs? That would really boost the speed of my mid-'07 Mac mini, which isn't really that old.

Compile 'em all
Aug 30, 2009, 02:06 PM
what a useless article. Synthetic benchmarks of the unknown. jeez.

Dimietriev
Aug 30, 2009, 02:08 PM
Would it be possible to use Open CL on the Intel GMA-equipped Macs? That would really boost the speed of my mid-'07 Mac mini, which isn't really that old.

Here is my '07 macbook with 950GMA, 2GB ram, 2.16Ghz
I think anything would help at this point, but it will never happen, my fellow intel graphics friend.

Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Core(TM)2 CPU T7400 @ 2.16GHz
Device 0 is an: CPU with max. 2160 MHz and 2 units/cores
Now computing - please be patient....
time used: 16.635 seconds

Shake 'n' Bake
Aug 30, 2009, 02:09 PM
Here is my '07 macbook with 950GMA, 2GB ram, 2.16Ghz

Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Core(TM)2 CPU T7400 @ 2.16GHz
Device 0 is an: CPU with max. 2160 MHz and 2 units/cores
Now computing - please be patient....
time used: 16.635 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

Being essentially the same as my Mac mini, that's a no for now. I'll test myself to be sure.

2002cbr600f4i
Aug 30, 2009, 02:09 PM
Would it be possible to use Open CL on the Intel GMA-equipped Macs? That would really boost the speed of my mid-'07 Mac mini, which isn't really that old.

If your Mini is presented with OpenCL code, it will only run it on the CPU since the GMA video cards aren't OpenCL supported.

In short, you won't see any performance increase, but you also won't see the programs crap out and die. They'll just work like they always have, running on your CPU.

PinkyMacGodess
Aug 30, 2009, 02:10 PM
Would it be possible to use Open CL on the Intel GMA-equipped Macs? That would really boost the speed of my mid-'07 Mac mini, which isn't really that old.

Is the Intel GMA-based graphics a separate GPU or something that is used by the main processor? If the memory is shared I'd think it's not but don't exactly know. I do know that GMA graphics aren't very fast/capable or recommended for high end graphics like CAD, etc...

Yep. Others have already said no, it's not...

akrantz
Aug 30, 2009, 02:10 PM
Here are my results for comparison:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.263 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU X5570 @ 2.93GHz
Device 1 is an: CPU with max. 2925 MHz and 16 units/cores
Now computing - please be patient....
time used: 0.861 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

jlasoon
Aug 30, 2009, 02:12 PM
Now that doesn't make too much sense.... the 9600M GT has a better time than the GT120? Is that true? Well I guess it's a rebranded 9xxx series right? So maybe... anyways, does anyone have a ATI 4870 to test?

AND LOOOK at the score of the 8-core with hyper-threading! Looks like it bests everything so far!

Ran it again, here you go. Getting conflicting results. It's all over the place.

MacPro 2.26 2009

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GT 120
Device 0 is an: GPU with max. 1400 MHz and 32 units/cores
Now computing - please be patient....
time used: 1.589 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
Device 1 is an: CPU with max. 2260 MHz and 16 units/cores
Now computing - please be patient....
time used: 1.161 seconds

Shake 'n' Bake
Aug 30, 2009, 02:13 PM
My results:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz
Device 0 is an: CPU with max. 2000 MHz and 2 units/cores
Now computing - please be patient....
time used: 23.028 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]


If your Mini is presented with OpenCL code, it will only run it on the CPU since the GMA video cards aren't OpenCL supported.

In short, you won't see any performance increase, but you also won't see the programs crap out and die. They'll just work like they always have, running on your CPU.

Yes, that's quite obvious. I mean if Apple were to write the necessary stuff, would it work.

Is the Intel GMA-based graphics a separate GPU or something that is used by the main processor? If the memory is shared I'd think it's not but don't exactly know. I do know that GMA graphics aren't very fast/capable or recommended for high end graphics like CAD, etc...

Yep. Others have already said no, it's not...

The VRAM is shared. I'll have to check Wikipedia for the other info.

QCassidy352
Aug 30, 2009, 02:14 PM
I agree - hopefully support for my iMac 24" ATI Radeon HD2600 is coming ?

agreed. pretty disappointed that my late 2008 imac isn't supported.

Shake 'n' Bake
Aug 30, 2009, 02:16 PM
Well, depending on the chipset, the clock speed can be up to 400 MHz, IMO, enough to make a difference.

I forgot, what chipset is present in the mid-'07 mini?

2002cbr600f4i
Aug 30, 2009, 02:18 PM
I agree - hopefully support for my iMac 24" ATI Radeon HD2600 is coming ?

See response #11 in this thread... Most likely not going to happen as the hardware doesn't support Double Precision Floating Point.

flopticalcube
Aug 30, 2009, 02:19 PM
Well, depending on the chipset, the clock speed can be up to 400 MHz, IMO, enough to make a difference.

I forgot, what chipset is present in the mid-'07 mini?
GMA950.

It's integrated with the other devices on the motherboard into a single chip but is separate from the CPU. It's not nearly as capable as any of the latest GPUs and lacks many of the hardware features necessary for OpenCL. Never happen.

2002cbr600f4i
Aug 30, 2009, 02:19 PM
Well, depending on the chipset, the clock speed can be up to 400 MHz, IMO, enough to make a difference.

I forgot, what chipset is present in the mid-'07 mini?

GMA950... AKA- JUNK (I have one of these that I've given to my folks)

kresh
Aug 30, 2009, 02:19 PM
Remember the Turbo.264 USB video encoder?

I wonder if it is possible to have a Firewire 400 or Firewire 800 OpenCL device to plug in for those Macs without a compatible video card?

2002cbr600f4i
Aug 30, 2009, 02:21 PM
Remember the Turbo.264 USB video encoder?

I wonder if it is possible to have a firewire 800 OpenCL device to plug in for those Macs without a compatible video card?

Hmmm... Now THAT is an interesting idea, but I don't know how feasible it would be. Maybe as an external box that talked over FW, with its own memory and power supply and such. I just don't know if even FW800 would have enough bandwidth to support the memory operations. It would suck to have a fast external OpenCL processor, but be crippled by a slow interface bottleneck.

Maybe with USB3.0?
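
A back-of-the-envelope way to look at the bandwidth worry: the sketch below converts a payload size and a nominal link rate into transfer time. The function name `transfer_seconds` is hypothetical, the rates used in the example (800 Mbit/s for FireWire 800, 5 Gbit/s for USB 3.0) are nominal signaling rates, and protocol overhead and latency are ignored, so real transfers would be slower.

```c
#include <assert.h>

/* Seconds needed to move `bytes` over a link with the given nominal
 * rate in megabits per second (1 Mbit = 1,000,000 bits).
 * Assumption: no protocol overhead and no latency. */
double transfer_seconds(double bytes, double mbit_per_s)
{
    assert(bytes >= 0.0 && mbit_per_s > 0.0);
    return (bytes * 8.0) / (mbit_per_s * 1.0e6);
}
```

For an illustrative 100 MB payload, `transfer_seconds(100.0e6, 800.0)` comes to about 1 second and `transfer_seconds(100.0e6, 5000.0)` to about 0.16 seconds, which is already in the same range as the faster compute times posted in this thread.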

PinkyMacGodess
Aug 30, 2009, 02:21 PM
These integrated graphics products allow a computer to be built without a separate graphics card, which can reduce cost, power consumption and noise. They are commonly found on low-priced notebook and desktop computers as well as business computers, which do not need high levels of graphics capability. 90% of all PCs sold have integrated graphics.[1] They rely on the computer's main memory for storage, which imposes a performance penalty as both the CPU and GPU have to access memory over the same bus.

And

Mac OS X 10.4 supports the GMA 950, since it was used in previous revisions of the MacBook and 17-inch iMacs. It has been used in all Intel-based Mac minis (until Mac Mini released on March 3, 2009). Mac OS X 10.5 Leopard contains drivers for the GMA X3100, which were used in a recent revision of the MacBook range.

Late-release versions of Mac OS X 10.4 also support the GMA 900 due to its use in the Apple Developer Transition Kit, which was used in the PowerPC-to-Intel transition. However, special modifications to the kext file must be made to enable Core Image and Quartz Extreme.

Although the new MacBook line no longer uses the X3100, Mac OS X 10.5 (Leopard) ships with drivers supporting it that require no modifications to the kext file. Mac OS X 10.6 (Snow Leopard), which includes a new 64-bit kernel in addition to the 32-bit one, has not yet included any 64-bit X3100 drivers (as of beta build 10A394). This means that although the MacBooks with the X3100 have 64-bit capable processors, Mac OS X must load the 32-bit kernel to support the 32-bit X3100 drivers. The 32-bit kernel is loaded in tandem with the 64-bit version.

The newer MacBook and MacBook Pro notebooks instead ship with a far more powerful NVIDIA GeForce 9400M G, and the 15" and 17" MacBook Pro notebooks ship with an additional GeForce 9600M supporting hybrid power to switch between GPUs.

From Wikipedia.

I doubt that it would be worth it to write OpenCL drivers for the Intel GMA chips as the performance could likely be disappointing...

Shake 'n' Bake
Aug 30, 2009, 02:23 PM
GMA950.

It's integrated with the other devices on the motherboard into a single chip but is separate from the CPU. It's not nearly as capable as any of the latest GPUs and lacks many of the hardware features necessary for OpenCL. Never happen.

I'm talking about the logic board. If it's 945G, 945GC, or 945GZ, the clock speed is 400 MHz which makes a noticeable difference.

GMA950... AKA- JUNK (I have one of these that I've given to my folks)

I realize it's crap, but it's all that I could afford at the time. But even if I had gotten an iMac, I still wouldn't have Open CL support.

I'll have to wait for some real-world tests, but this is really making me want an iMac even more.

kresh
Aug 30, 2009, 02:25 PM
Hmmm... Now THAT is an interesting idea, but I don't know how feasible it would be. Maybe as an external box that talked over FW, with its own memory and power supply and such. I just don't know if even FW800 would have enough bandwidth to support the memory operations. It would suck to have a fast external OpenCL processor, but be crippled by a slow interface bottleneck.

Maybe with USB3.0?

hehe I regretted saying it even as I typed it. I normally hate bolt-on solutions and this would be nothing but a second rate OpenCL experience at best, and probably expensive to boot:)

MorphingDragon
Aug 30, 2009, 02:28 PM
OpenCL will only become interesting when you can actually use it in shipping applications - synthetic benchmarks that show "the potential" of OpenCL are just a tease.

They'll come, but it will take time for vendors to recode their apps to use OpenCL.

It also doesn't help that Apple didn't support OpenCL on more GPUs - with the relatively few OpenCL capable systems, there's less incentive for vendors to port.

Also note that some earlier GPUs didn't do full IEEE floating point support (support for NaNs, ±Infinity) , and didn't do accurate floating point. (A small rounding error would not be noticeable on a pixel in a display, but would kill a compute job.)

OpenCL has been released as an open standard. Just like OpenAL and OpenGL. It's up to the GPU vendors to provide the libraries for the other OSes. Just like, hmm, OpenGL/AL. Apple only chose to support a few on Mac OS X for some reason.

butang
Aug 30, 2009, 02:28 PM
Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.269 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3520 @ 2.67GHz
Device 1 is an: CPU with max. 2659 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.899 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

2002cbr600f4i
Aug 30, 2009, 02:32 PM
ok, I just looked at the readme, and I think this is very telling...

Here's the OpenCL code that is getting sent to the OpenCL devices:

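/* Kernel source, passed to the OpenCL runtime as a C string. */
const char * sProgramSource =
"__kernel void vectorAdd(\n"
"    __global const float * a,\n"
"    __global const float * b,\n"
"    __global float * c)\n"
"{\n"
"    // Vector element index: one work-item per element\n"
"    int loop;\n"
"    int nIndex = get_global_id(0);\n"
"    // Repeat the same add for loop = 1..4999\n"
"    for (loop = 1; loop < 5000; loop++)\n"
"    {\n"
"        c[nIndex] = a[nIndex] + b[nIndex];\n"
"    }\n"
"}\n";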


ie: It's just doing a simple vector add of 5000 items. This is NOT stressing the memory interfaces or anything like that. These are VERY simple calculations. That would explain why cards with more cores and higher clockspeeds beat out what would normally be considered substantially more capable GPUs.
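
For anyone who wants to see what that kernel computes without an OpenCL device, here is a rough serial C stand-in. It is a sketch, not the benchmark's host code (which the readme doesn't include): the function names are hypothetical, nIndex plays the role of get_global_id(0), and the array length n is an assumption, since the excerpt only shows the per-element loop count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU stand-in for one work-item of the vectorAdd kernel.
   nIndex takes the place of get_global_id(0). */
static void vector_add_item(const float *a, const float *b, float *c,
                            size_t nIndex)
{
    int loop;
    /* Same loop bounds as the posted kernel: the add is repeated 4999 times. */
    for (loop = 1; loop < 5000; loop++) {
        c[nIndex] = a[nIndex] + b[nIndex];
    }
}

/* Serial replacement for launching n work-items. The value of n is chosen
   by the caller; the readme excerpt does not show the real array length. */
static void vector_add_all(const float *a, const float *b, float *c, size_t n)
{
    size_t i;
    assert(a != NULL && b != NULL && c != NULL);
    for (i = 0; i < n; i++) {
        vector_add_item(a, b, c, i);
    }
}
```

Read this way, each element is a single float add repeated in a loop, with two reads and one write per pass and no dependence between elements. That fits the point that the test leans on core count and clock speed.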

I wish they had included the full XCode project + source so we could look at it more closely. I'd love to see what sort of performance you could get by telling the code to run on all available OpenCL devices at the same time.

jdm111
Aug 30, 2009, 02:32 PM
Alright, since Apple have screwed me over by not supporting my slightly older than a year MacBook, I'm going to build a much faster Hackintosh that HAS OpenCL support.

Why should I pay for Apple's poor component choice?

hal9000Mark2
Aug 30, 2009, 02:37 PM
MacPro 2008 model - 2x2.8GHz (quad core)

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.684 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.278 seconds

2002cbr600f4i
Aug 30, 2009, 02:39 PM
Alright, since Apple have screwed me over by not supporting my slightly older than a year MacBook, I'm going to build a much faster Hackintosh that HAS OpenCL support.

Why should I pay for Apple's poor component choice?

OMFG... WHY do people get so pissed about this?

Seriously - What does your MB NOT do today that you could do with it yesterday? If OpenCL is your ONLY issue, then I really gotta ask, what programs are you running that you think OpenCL is going to buy you some huge performance improvement with? OpenCL is not a "flip a switch and everything is faster" technology! It's not like a turbocharger that you bolt onto an engine and suddenly the whole car is faster.

Heck, I'm betting there won't even be any decent OpenCL apps out for a good 6 months at least. By that time, your "slightly more than a year old MacBook" will be around 2 years old... Most people I know don't even keep a laptop more than 3 years because they become so outdated in that time.

By the time you'd see any sort of improvements in the apps you run from OpenCL you're going to be due for a new machine anyhow!

CHILL. BREATHE. You didn't get screwed. You have a perfectly good, useful, viable Macbook that you've already gotten a year's worth of use out of, and it will continue to give you another couple years of use...

adammelancon
Aug 30, 2009, 02:39 PM
No support for my ATI 2600 :(
I'm still upset about that.

Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Core(TM)2 Duo CPU E8235 @ 2.80GHz
Device 0 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 13.075 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

padmedala9
Aug 30, 2009, 02:39 PM
Waiting for the 9th and ordering my new iMac...

Please don't. While the 9th is almost assuredly just iPods, new iMacs will quite possibly pop up in October or so. This next revision will probably have quad-core processors and you will kick yourself if you order an iMac now. Wait as long as you possibly can.

2002cbr600f4i
Aug 30, 2009, 02:41 PM
No support for my ATI 2600 :(
I'm still upset about that.

Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Core(TM)2 Duo CPU E8235 @ 2.80GHz
Device 0 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 13.075 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

Again, look at post #11 in this thread. There ARE reasons. You see that last line: "Validate test passed - " It wouldn't say passed on a 2600 because the 2600 can't do double precision FP math!

Apollo21
Aug 30, 2009, 02:41 PM
Did you not even bother to read the article? It stated clearly that only 1 GPU is active at a time. With the unibody MBP's only 1 GPU can be accessed at a time, and you have to log out to switch between them. I don't get why people think that wouldn't be the case with OpenCL. The OS only sees 1 of the GPUs at a time.

If YOU knew how to read, you would know the following:

1) The person you criticized unnecessarily said: "WHEN I FIRST HEARD ABOUT OpenCL I always kind of assumed..."

2) The article at the top of this thread says: "Most interesting is that for owners of high end MacBook Pros which contain both 9400M and 9600M GT graphics cards, BOTH GPUS CAN BE USED at any time by OpenCL. In contrast, both of these GPUs can not be used for general graphics processing and requires a Mac OS X logout to switch from one to another."

Next time take that @sshole energy and use it for comprehension.

Quu
Aug 30, 2009, 02:43 PM
I agree - hopefully support for my iMac 24" ATI Radeon HD2600 is coming ?

agreed. pretty disappointed that my late 2008 imac isn't supported.

You will never see OpenCL supported on the HD 2600 or any HD 2xxx series from ATI as they simply do not contain the capability to process this kind of GPGPU data. You need an ATi HD 3xxx, HD 4xxx, NVIDIA 8xxx, 9xxx or 2xx series to be able to process any kind of general code.

So to say it again (as it has been repeated many times), the HD 2600 does not have the capability within it to support any type of OpenCL implementation. It is not Apple artificially limiting which GPUs they allow OpenCL to run on; it is that the GPU itself cannot do it.

In ATi's defence, GPGPU was just arriving when they released the HD 2xxx series, and it's not easy to create hardware to run software that hasn't even been invented yet.

THX1139
Aug 30, 2009, 02:43 PM
Wait as long as you possibly can.

Yeah, that's a great idea! This way you'll be able to afford to live in a nicer old folks home.

2002cbr600f4i
Aug 30, 2009, 02:46 PM
If YOU knew how to read, you would know the following:

1) The person you criticized unnecessarily said: "WHEN I FIRST HEARD ABOUT OpenCL I always kind of assumed..."

2) The article at the top of this thread says: "Most interesting is that for owners of high end MacBook Pros which contain both 9400M and 9600M GT graphics cards, BOTH GPUS CAN BE USED at any time by OpenCL. In contrast, both of these GPUs can not be used for general graphics processing and requires a Mac OS X logout to switch from one to another."

Next time take that @sshole energy and use it for comprehension.


My bad, my brain must have put a "not" in the part about "BOTH GPUS CAN (not) BE USED". I'm man enough to admit when I'm wrong!

jlasoon
Aug 30, 2009, 02:47 PM
Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.269 seconds


I want that GTX 285. Now how do I convince the wife? :D

2002cbr600f4i
Aug 30, 2009, 02:48 PM
I want that GTX 285. Now how do I convince the wife? :D

"Honey, would you like that new (insert Coach, Kate Spade, Prada, whatever....) item? How about I get one of those for you and then I get a new video card for me?"

doctoree
Aug 30, 2009, 02:48 PM
GeForce 9600M GT: 2.805 seconds

My GPU:
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.374 seconds

So my "old" early '08 now wipes the floor with these brand new high-end MBPs!

And even cooler:
2008 Mac Pro CPU:
OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.278 seconds

It also wipes the floor with these guys. This is like xmas.

001
Aug 30, 2009, 02:48 PM
This is very cool and interesting. On battery power my late 2008 15" MBP's 9400M beats up on the 9600M. But once plugged in the 9600M trounces the 94, without regard or regret.

Check it:
Battery
OpenCL Device # 0 = GeForce 9600M GT
time used: 13.622 seconds

OpenCL Device # 1 = GeForce 9400M
time used: 9.022 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 2 is an: CPU with max. 2800 MHz and 2 units/cores
time used: 13.102 seconds

Plugged in

OpenCL Device # 0 = GeForce 9600M GT
time used: 2.788 seconds

OpenCL Device # 1 = GeForce 9400M
time used: 9.028 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
time used: 13.183 seconds
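
The battery versus plugged-in gap is easier to judge as a ratio. A small C sketch of the arithmetic, using only the times posted above; the helper name is made up for illustration:

```c
#include <assert.h>

/* Ratio of a baseline run time to another run time.
   Values above 1.0 mean the second run was faster. */
static double speedup(double baseline_seconds, double seconds)
{
    assert(seconds > 0.0);
    return baseline_seconds / seconds;
}
```

With the numbers above, speedup(13.622, 2.788) comes out a little under 4.9, so the 9600M GT is close to five times faster on the power adapter than on battery. speedup(9.028, 2.788) is a little over 3.2, which is the 9600M GT's lead over the 9400M when plugged in.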

FireArse
Aug 30, 2009, 02:53 PM
This looks somewhat interesting. This is all very arbitrary. I do look forward to H.264 encoding via OpenCL or iSquint / Handbrake / MPlayer making good use of OpenCL.

Must get round to buying an ATI 4870 1GB and flashing it for my first-gen Mac Pro.

marcosscriven
Aug 30, 2009, 02:55 PM
Just tried on my lowly late '07 MBP (8600M GT with 256MB)

CLBench_as_terminal_tool/OpenCL2_Bench_V025 ; exit;
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 4.579 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.328 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout


With power adapter attached:

/Users/marcosscriven/Downloads/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V025 ; exit;
Marcos-Scrivens-MacBook-Pro:~ marcosscriven$ /Users/marcosscriven/Downloads/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V025 ; exit;
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.362 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.657 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


The thing that's disappointing is that this is just a benchmark. I know this is new, but surely Apple had time to work on actually making something useful to showcase this tech?

I'm surprised Apple didn't build this into their latest Quicktime in Snow Leopard, and be able to show off transcoding (exporting) HD movie files to iphone files around 10 times faster! Now *that* would be useful...

Amethyst
Aug 30, 2009, 02:58 PM
What the hell on earth!!!!

4870 <<< GT120

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.179 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.200 seconds

doctoree
Aug 30, 2009, 03:03 PM
This looks somewhat interesting. This is all very arbitrary. I do look forward to H.264 encoding via OpenCL or iSquint / Handbrake / MPlayer making good use of OpenCL.

Must get round to buying an ATI 4870 1GB and flashing it for my first-gen Mac Pro.

Actually it looks like you should buy an '08 MBP with the 8600M GT on eBay for a couple of bucks.

Edit: Or even an MBP from '07 :), that's crazy

mmendoza27
Aug 30, 2009, 03:04 PM
Please don't. While the 9th is almost assuredly just iPods, new iMacs will quite possibly pop up in October or so. This next revision will probably have quad-core processors and you will kick yourself if you order an iMac now. Wait as long as you possibly can.

I don't think that new iMacs (if released this year) will have a quad-core processor. Intel roadmaps show that the TDP is way too high. However, if you get a Core i5 in a new iMac, you will have hyperthreading, which would be like 4 virtual cores.

And for PinkyMacGoddess, if you are waiting in hopes of getting a new iPod touch with your Mac due to the education discount, it won't happen. Apple only allows you to get the previous generation for that promotion.

mmendoza27
Aug 30, 2009, 03:06 PM
What the hell on earth!!!!

4870 <<< GT120

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.179 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.200 seconds

Keep in mind, this is a very simple test that doesn't stress memory; basically, the more cores and clock speed you have, the faster the test runs. That's why an 8-core Mac Pro with hyper-threading is doing better than some video cards.

DUSTmurph
Aug 30, 2009, 03:11 PM
I think since I've installed SL (about 40 mins ago), my GPU runs hotter. My guess would be from OpenCL.

mdriftmeyer
Aug 30, 2009, 03:13 PM
Ok, this was just run on my 2009 Mac Pro, 2.93 GHz Quad w/HT ON + 4870 GPU:

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.244 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3540 @ 2.93GHz
Device 1 is an: CPU with max. 2925 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.834 seconds

And from my old Macbook 2,1:

OpenCL Device # 0 = Intel(R) Core(TM)2 CPU T7400 @ 2.16 GHz
Device 0 is an: CPU with max. 2160 MHz and 2 units/cores
Now computing - please be patient....
time used: 17.149 seconds


Interesting that the CPU beats the 4870 on the Pro!

I also tried the AOBench one.... Unfortunately when I try to run it against the GPU, it gives the following error:



CL_DEVICE_NAME: Radeon HD 4870
CL_DEVICE_VENDOR: AMD
Error: Failed to build program executable
cvmsErrorCompilerFailure: LLVM compiler has failed to compile a function.
logout


On the plus side, they include the source for the AOBench stuff, and the XCode project, so you can fiddle with it and see how OpenCL code is written.

Something is considerably flawed with the ATi support and Apple's status of leveraging it with their code base.

The 4870 has 800 streams thus approximately 155 units/core.

LLVM 2.6 is in final stages of being released.

http://llvm.org/Users.html#Apple


Mac OS X 10.6 (and later): The OpenCL GPGPU implementation is built on Clang and LLVM compiler technology. This requires parsing an extended dialect of C at runtime and JIT compiling it to run on the CPU, GPU, or both at the same time. In addition, several performance sensitive pieces of Mac OS X 10.6 were built with llvm-gcc such as OpenSSL and Hotspot. Finally, the compiler_rt library has replaced libgcc and is now a part of libsystem.dylib.

Shake 'n' Bake
Aug 30, 2009, 03:14 PM
I think since I've installed SL (about 40 mins ago), my GPU runs hotter. My guess would be from OpenCL.

You wish. Nothing on your Mac supports OpenCL.

adammelancon
Aug 30, 2009, 03:14 PM
Again, look at post #11 in this thread. There ARE reasons. You see that last line: "Validate test passed - " It wouldn't say passed on a 2600 because the 2600 can't do double precision FP math!

Oh, I'm well aware of the technical reasons. It doesn't mean that I have to be happy about it. ;)

2002cbr600f4i
Aug 30, 2009, 03:16 PM
Keep in mind, this is a very simple test that doesn't stress memory; basically, the more cores and clock speed you have, the faster the test runs. That's why an 8-core Mac Pro with hyper-threading is doing better than some video cards.

Agreed. This is pretty much a worthless benchmark. There's nothing complex taking place, no hard memory thrashing, no difficult calculations.

It's just taking 2 arrays with 5000 numbers in them and adding them together into a new array of 5000 numbers. Simple atomic add operations of 2 numbers over and over and over. The more cores you have the more you can split the array up (4 cores = each core processes 1250 items, 32 cores = each core processes 156.25 items.) and the faster clock means that each item gets processed faster.
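
The split works out like this in C. This is a sketch of the arithmetic only, assuming the even per-core split described above; that is a mental model, not a description of how any OpenCL runtime actually schedules work-items, and the function names are made up for illustration.

```c
#include <assert.h>

/* Largest number of items any single core handles when n items are
   divided as evenly as possible across `cores` cores. */
static unsigned items_per_core(unsigned n, unsigned cores)
{
    assert(cores > 0);
    return (n + cores - 1) / cores;   /* ceiling division */
}

/* Number of cores that carry one item more than the rest. */
static unsigned cores_with_extra(unsigned n, unsigned cores)
{
    assert(cores > 0);
    return n % cores;
}
```

For 5000 items, 4 cores get exactly 1250 each. With 32 cores the 156.25 average can't be dealt out literally: 8 cores take 157 items and the other 24 take 156.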

This is NOT AT ALL indicative of a real OpenCL app that will be doing hundreds of thousands of difficult computations, with dependencies between the data and working across huge datasets.

I'd say ignore every result we get out of this app. It's not at all indicative of real-world performance in the least.

imm22
Aug 30, 2009, 03:17 PM
Think My results are a little higher than normal:

Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.786 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 2.936 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
Device 2 is an: CPU with max. 2530 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.239 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

Why?

Amdahl
Aug 30, 2009, 03:17 PM
OMFG... WHY do people get so pissed about this?

Seriously - What does your MB NOT do today that you could do with it yesterday?

What it doesn't do today is allow the owner to dream about how wonderful and better the MB is going to be tomorrow when High Priestess Steve dispenses the next received wisdom (via Software Update) that will manifest in their blessed and anointed Macbook.

Yesterday, it did that.

And the poster is right. Go Hackintosh. Apple isn't interested in you, unless you have cash in your wallet, and you're in the store right now. Tomorrow, you're not a customer.

MarcBook
Aug 30, 2009, 03:27 PM
Has anyone else noticed that the 8600M GT runs at two different clock speeds depending on which size MacBook Pro you've got?

At least that's the impression I'm getting. Mine runs at 940MHz, whereas some users have reported 1040MHz. Mine is a 15" 2007 MacBook Pro and I'm guessing the higher clock speed is from the 17"?

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.980 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.600 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]

djrod
Aug 30, 2009, 03:28 PM
This is very cool and interesting. On battery power my late 2008 15" MBP's 9400M beats up on the 9600M. But once plugged in the 9600M trounces the 94, without regard or regret.

Check it:
Battery
OpenCL Device # 0 = GeForce 9600M GT
time used: 13.622 seconds

OpenCL Device # 1 = GeForce 9400M
time used: 9.022 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 2 is an: CPU with max. 2800 MHz and 2 units/cores
time used: 13.102 seconds

Plugged in

OpenCL Device # 0 = GeForce 9600M GT
time used: 2.788 seconds

OpenCL Device # 1 = GeForce 9400M
time used: 9.028 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
time used: 13.183 seconds

I assume we have the same MacBook Pro, but why are my 9400M results so much better than yours?


Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400M
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 3.493 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 12.962 seconds


EDIT: If I turn the 9600M GT on, the OpenCL result of the 9400M falls dramatically and matches yours:

OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.785 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.022 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 2 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 13.376 seconds

J the Ninja
Aug 30, 2009, 03:28 PM
Agreed. This is pretty much a worthless benchmark. There's nothing complex taking place, no hard memory thrashing, no difficult calculations.

It's just taking 2 arrays with 5000 numbers in them and adding them together into a new array of 5000 numbers. Simple atomic add operations of 2 numbers over and over and over. The more cores you have the more you can split the array up (4 cores = each core processes 1250 items, 32 cores = each core processes 156.25 items.) and the faster clock means that each item gets processed faster.

This is NOT AT ALL indicative of a real OpenCL app that will be doing hundreds of thousands of difficult computations, with dependencies between the data and working across huge datasets.

I'd say ignore every result we get out of this app. It's not at all indicative of real-world performance in the least.

This "Galaxies" benchmark netkas posted is WAYY more fun. And a little more useful, check it out:

http://netkas.org/?p=164

macfan881
Aug 30, 2009, 03:32 PM
Waiting for the 9th and ordering my new iMac...

you know the event on the 9th is just an iPod event.

Bubba Satori
Aug 30, 2009, 03:34 PM
you know the event on the 9th is just an iPod event.

It's a shame the Constitution restricts a trillion dollar company to releasing one product at a time, three times a year. :rolleyes:

Aron Peterson
Aug 30, 2009, 03:38 PM
Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.978 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
Device 1 is an: CPU with max. 2200 MHz and 2 units/cores
Now computing - please be patient....
time used: 17.748 seconds

MacRumorUser
Aug 30, 2009, 03:41 PM
MacPro Rev 1.1 2006 2.66Ghz


Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.688 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
Device 1 is an: CPU with max. 2660 MHz and 4 units/cores
Now computing - please be patient....
time used: 6.802 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout


Wowsers :eek::)

Krafty
Aug 30, 2009, 03:48 PM
09 Mac Mini + 9400M
Last login: Sun Aug 30 00:01:17 on console

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 6.779 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU P7350 @ 2.00GHz
Device 1 is an: CPU with max. 2000 MHz and 2 units/cores
Now computing - please be patient....
time used: 19.269 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]

macfan881
Aug 30, 2009, 03:49 PM
It's a shame the Constitution restricts a trillion dollar company to releasing one product at a time, three times a year. :rolleyes:


ummm ok....

2002cbr600f4i
Aug 30, 2009, 03:54 PM
This "Galaxies" benchmark netkas posted is WAYY more fun. And a little more useful, check it out:

http://netkas.org/?p=164

Doesn't seem to support running on ATI video cards for the OpenCL stuff... It'll only use Radeons for rendering, not calculations.

kornyboy
Aug 30, 2009, 04:09 PM
The results look very promising. I look forward to seeing some more real world results.

Master Chief
Aug 30, 2009, 04:14 PM
The benchmark results will soon change and improve, because Apple is working on improvements for OpenCL, and a first update should be made available in the coming weeks!

andy721
Aug 30, 2009, 04:14 PM
C☣mp Specs:
• Mac Pro Two 2.8GHz Quad-Core Intel Xeon Harpertown processors (8 Cores)
• 12MB of L2 cache per processor (6MB shared per pair of cores)
• 1600MHz dual independent frontside buses
• 6GB memory (800MHz DDR2 fully-buffered DIMM ECC)
• NVIDIA GeForce 8800 GT with 512MB of GDDR3 video memory.
• 320GB Serial ATA 3Gb/s 7200-rpm hard drive
• 16x double-layer SuperDrive
• Apple 20" Cinema Display

VIDEO:
https://dl.getdropbox.com/u/1828591/Screen%20Recording%202.mov
https://dl.getdropbox.com/u/1828591/Screen%20Recording%203.mov

Howmanoid
Aug 30, 2009, 04:20 PM
OpenCL will only become interesting when you can actually use it in shipping applications - synthetic benchmarks that show "the potential" of OpenCL are just a tease.

They'll come, but it will take time for vendors to recode their apps to use OpenCL.



Isn't Grand Central supposed to be able to handle the scheduling of current tasks to any compute resource it can manage without the need for recoding? Sure you get greater benefits from writing specifically for OpenCL but I seem to remember Bertrand Serlet showing an animation of how tasks (code segments, not whole processes) were managed across all available compute resources based on priority and systems load. If that's true then it would imply that OpenCL should give benefits to current code as well as code written with OpenCL blocks. I thought this was supposed to be one of its big advantages over CUDA.

Anyone got any more info on this?

unit22
Aug 30, 2009, 04:20 PM
What it doesn't do today is allow the owner to dream about how wonderful and better the MB is going to be tomorrow when High Priestess Steve dispenses the next received wisdom (via Software Update) that will manifest in their blessed and anointed Macbook.

Yesterday, it did that.

And the poster is right. Go Hackintosh. Apple isn't interested in you, unless you have cash in your wallet, and you're in the store right now. Tomorrow, you're not a customer.


If you're not buying anything you're not a customer. Businesses don't tend to bend on this apart from as a sales tactic.

When you buy a computer you're buying a product, and apart from after sales support and warranty it really isn't an ongoing service.

However, turns out my G4 Mac Mini is still good, borderline great for editing movies. Score.

Brad Larson
Aug 30, 2009, 04:25 PM
This "Galaxies" benchmark netkas posted is WAYY more fun. And a little more useful, check it out:

http://netkas.org/?p=164

I believe this "Galaxies" demo is the sample code that Apple has posted for an N-body simulation. If you are a developer, you can grab the code and compile it yourself from here:

https://developer.apple.com/mac/library/samplecode/OpenCL_NBody_Simulation_Example/index.html

There's also the OpenCL Procedural Grass and Terrain example:

http://developer.apple.com/mac/library/samplecode/OpenCL_Procedural_Grass_and_Terrain_Example/index.html

If you're interested in more about OpenCL, Dr. Gohara is doing a video series on the topic at MacResearch:

http://www.macresearch.org/opencl

The demo near the end of the first video is worth watching.

Brad Larson
Aug 30, 2009, 04:36 PM
Isn't Grand Central supposed to be able to handle the scheduling of current tasks to any compute resource it can manage without the need for recoding? Sure you get greater benefits from writing specifically for OpenCL but I seem to remember Bertrand Serlet showing an animation of how tasks (code segments, not whole processes) were managed across all available compute resources based on priority and systems load. If that's true then it would imply that OpenCL should give benefits to current code as well as code written with OpenCL blocks. I thought this was supposed to be one of its big advantages over CUDA.

Anyone got any more info on this?

OpenCL and Grand Central Dispatch (GCD) are two different technologies. OpenCL is for taking relatively simple calculations and running them across large data sets over all available computing resources (GPU and / or CPU). GCD allows programmers to break up tasks like sorting arrays or handling multiple downloads, then lets the system load-balance them across CPU cores by creating and managing threads.

GCD is more about simplifying the code for taking advantage of multicore systems. It's behind many of the performance improvements you already see in shipping applications like Mail.app.

CUDA is a more device-specific implementation of GPU computing, and it was the template for the design of OpenCL. However, OpenCL supports far more devices than CUDA, and even lets you perform work on the CPU.

Rubberband Man
Aug 30, 2009, 04:37 PM
Hrm. Tried it on my MBP & looks fine. Tried it on my 2009 Mac mini and it's not quite right.

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
Error: clEnqueueReadBuffer for device # 0
ERROR NUMBER = -36



Tried it a bunch more times and it finally comes up with a number, but it's s-l-o-w.

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 15.325 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU P7350 @ 2.00GHz
Device 1 is an: CPU with max. 2000 MHz and 2 units/cores
Now computing - please be patient....
time used: 18.004 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
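For reference, the `ERROR NUMBER = -36` in the first run matches the value that the standard OpenCL header (`cl.h`) assigns to `CL_INVALID_COMMAND_QUEUE`. Why the benchmark hits it on this machine is not clear from the output. A small lookup sketch follows; the table is only a partial excerpt of the header, and `describe_cl_error` is a hypothetical helper, not part of any OpenCL binding.

```python
# Partial excerpt of the error codes defined in the OpenCL header cl.h.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -30: "CL_INVALID_VALUE",
    -34: "CL_INVALID_CONTEXT",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -38: "CL_INVALID_MEM_OBJECT",
}


def describe_cl_error(code):
    """Return the symbolic name for an OpenCL status code, if known."""
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL error {code}")


if __name__ == "__main__":
    print(describe_cl_error(-36))
```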

techfreak85
Aug 30, 2009, 04:38 PM
Booo! More ATI GPUs please. Yes, I'm talking to you, AMD.

If I recall correctly, you have the same iMac as me, no? First-gen aluminum iMac? ATI Radeon 2600?

I'm not that happy.... it's not a bad card, and it's certainly not obsolete. :mad:

netkas
Aug 30, 2009, 04:45 PM
MacRumors, why didn't you write that the Radeon HD has very poor support for OpenCL?

You can run only two (one of them is Hello World) of the many OpenCL samples from the OS X SDK on a Radeon 4870.

A much more interesting benchmark from the samples is Galaxies; find it at http://netkas.org/?p=164#comment-35793

The bottom-right button lets you select the OpenCL device.


Doesn't seem to support running on ATI video cards for the OpenCL stuff... It'll only use Radeons for rendering, not calculations.

That's because OpenCL on Radeons is very poor, so it can't run the calculations of the Galaxies app.

daveporter
Aug 30, 2009, 04:50 PM
I was hoping that my trusty old GeForce 7300GT that came with my first generation MacPro would be supported.

Does anyone know which of the supported video cards can be installed in a first generation MacPro? I know that some of the new cards that are made for MacPros are not compatible with the first generation unit.

Thanks,

Dave

2002cbr600f4i
Aug 30, 2009, 05:04 PM
MacRumors, why didn't you write that the Radeon HD has very poor support for OpenCL?

You can run only two (one of them is Hello World) of the many OpenCL samples from the OS X SDK on a Radeon 4870.

A much more interesting benchmark from the samples is Galaxies; find it at http://netkas.org/?p=164#comment-35793

The bottom-right button lets you select the OpenCL device.




That's because OpenCL on Radeons is very poor, so it can't run the calculations of the Galaxies app.

Yeah, geesh, I just DL'ed every one of the OpenCL examples from Apple's developer site.

Half failed to compile, and the other half wouldn't even run on my setup....

I hope they get this stuff fixed soon!

markm49uk
Aug 30, 2009, 05:06 PM
You will never see OpenCL supported on the HD 2600 or any HD 2xxx series from ATI as they simply do not contain the capability to process this kind of GPGPU data. You need an ATi HD 3xxx, HD 4xxx, NVIDIA 8xxx, 9xxx or 2xx series to be able to process any kind of general code.

So to say it again (as it has been repeated many times): the HD 2600 does not have the capability within it to support any type of OpenCL implementation. It is not that Apple is artificially limiting which GPUs they allow OpenCL to run on; it is that the GPU itself cannot do it.

In ATi's defence, GPGPU was just arriving when they released the HD 2xxx series, and it's not easy to create hardware to run software that hasn't even been invented yet.

Yes I understand now - obviously I was unaware of this when I made my post.

Looks like I will just have to rely on my MacBook Pro :D:

Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.793 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.030 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.142 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]

IEatApples
Aug 30, 2009, 05:12 PM
Here are my results for comparison:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.263 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU X5570 @ 2.93GHz
Device 1 is an: CPU with max. 2925 MHz and 16 units/cores
Now computing - please be patient....
time used: 0.861 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.269 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3520 @ 2.67GHz
Device 1 is an: CPU with max. 2659 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.899 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

MacPro 2008 model - 2x2.8GHz (quad core)

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.684 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.278 seconds

I want that GTX 285. Now how do I convince the wife? :D

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 3.549 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 4 units/cores
Now computing - please be patient....
time used: 6.426 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

………………………………………………………………………………………………

I don't understand why mine is so "slow" compared to the others with GTX 285? :confused:
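The results pasted in this thread all share the same text layout, so comparing them by hand can be replaced with a small parser. This is a sketch only: it assumes the exact line formats seen in the posts above, and `parse_results` and `cpu_over_gpu` are hypothetical helper names. The sample is the first GTX 285 result quoted above.

```python
import re

SAMPLE = """\
Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 285
Device 0 is an: GPU with max. 1476 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.263 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU X5570 @ 2.93GHz
Device 1 is an: CPU with max. 2925 MHz and 16 units/cores
Now computing - please be patient....
time used: 0.861 seconds
"""

DEVICE_RE = re.compile(r"^OpenCL Device # (\d+) = (.+)$")
KIND_RE = re.compile(r"^Device \d+ is an: (GPU|CPU) with max\. (\d+) MHz")
TIME_RE = re.compile(r"^time used: ([\d.]+) seconds$")


def parse_results(text):
    """Return one dict per device; 'seconds' stays None if a run failed."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = DEVICE_RE.match(line)
        if m:
            current = {"name": m.group(2), "kind": None, "seconds": None}
            devices.append(current)
            continue
        if current is None:
            continue  # banner lines before the first device
        m = KIND_RE.match(line)
        if m:
            current["kind"] = m.group(1)
            continue
        m = TIME_RE.match(line)
        if m:
            current["seconds"] = float(m.group(1))
    return devices


def cpu_over_gpu(devices):
    """How many times longer the CPU took than the fastest GPU."""
    def best(kind):
        return min(d["seconds"] for d in devices
                   if d["kind"] == kind and d["seconds"] is not None)
    return best("CPU") / best("GPU")


if __name__ == "__main__":
    print(round(cpu_over_gpu(parse_results(SAMPLE)), 2))
```

For the sample, the CPU takes roughly 3.3 times as long as the GPU. Runs that end in a segmentation fault simply leave `seconds` as `None` for that device.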

DUSTmurph
Aug 30, 2009, 05:21 PM
You wish. Nothing on your Mac supports Open CL.

You should check again. My Nvidia GeForce 8800 GS does support OpenCL. I even have this thing running the 64-bit kernel.

doctoree
Aug 30, 2009, 05:49 PM
You should check again. My Nvidia GeForce 8800 GS does support OpenCL. I even have this thing running the 64-bit kernel.

I think "Nothing" referred to software, not hardware.

*LTD*
Aug 30, 2009, 06:12 PM
OpenCL requires one of the following graphics cards or graphics processors:

NVIDIA

GeForce 9400M
GeForce 9600M GT
GeForce 8600M GT
GeForce GT 120
GeForce GT 130
GeForce GTX 285
GeForce 8800 GT
GeForce 8800 GS
Quadro FX 4800
Quadro FX5600

ATI

Radeon 4850
Radeon 4870

namethisfile
Aug 30, 2009, 06:20 PM
You will never see OpenCL supported on the HD 2600 or any HD 2xxx series from ATI as they simply do not contain the capability to process this kind of GPGPU data. You need an ATi HD 3xxx, HD 4xxx, NVIDIA 8xxx, 9xxx or 2xx series to be able to process any kind of general code.

So to say it again (as it has been repeated many times): the HD 2600 does not have the capability within it to support any type of OpenCL implementation. It is not that Apple is artificially limiting which GPUs they allow OpenCL to run on; it is that the GPU itself cannot do it.

In ATi's defence, GPGPU was just arriving when they released the HD 2xxx series, and it's not easy to create hardware to run software that hasn't even been invented yet.

someone mentioned the hd2600 not being capable of double precision fp. is this the missing capability hindering it from being open cl capable? aren't there only a handful of gpus (if that) w/ double precision floating point capability? if so, why would open cl only work on such a limited number of machines?

and lastly, how are you SO sure that hd2xxx will never, ever, evereverever be supported? where are you getting this info from? do you know something we don't?

also, you mentioned that "You need an ATi HD 3xxx, HD 4xxx, NVIDIA 8xxx, 9xxx or 2xx series to be able to process any kind of general code."

this is misleading. you say "general code" instead of something else. unless you explain yourself, i am regarding you as a charlatan and nothing more.

max-bear
Aug 30, 2009, 06:23 PM
Hi, I just tried this on my ageing Mac Pro.

The 4870 works, which I didn't expect in an unsupported machine, but my dual 2.0 GHz Xeons get a segmentation fault!!? :mad:

What does this mean?:confused:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.204 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
Device 1 is an: CPU with max. 2000 MHz and 4 units/cores
Now computing - please be patient....
Segmentation fault
logout

[Process completed]

fleshman03
Aug 30, 2009, 06:25 PM
My Results, for comparison:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.929 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7800 @ 2.60GHz
Device 1 is an: CPU with max. 2600 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.840 seconds

Sweet. Now all us 8600 owners have to do is pray that our cards don't burn up....

Seriously, I want to take advantage of this. Are there any apps that currently do? I'm imagining a version of HandBrake that can convert my .eyetv files in a few minutes instead of an hour....

akrantz
Aug 30, 2009, 06:26 PM
I don't understand why mine is so "slow" compared to the others with GTX 285? :confused:

You have fewer cores than the others? Still, I am also surprised the difference is that large, but I don't claim to fully understand how the technology works either.

IEatApples
Aug 30, 2009, 06:41 PM
You have fewer cores than the others? Still, I am also surprised the difference is that large, but I don't claim to fully understand how the technology works either.

Yes, I've noticed that, but here's an 8800 GT beating my GTX 285 by a large margin! :eek:

MacPro 2008 model - 2x2.8GHz (quad core)

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.684 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.278 seconds

I've also noticed a lot of other less powerful GPU's beating my GTX 285… something must be wrong… :(

dicklacara
Aug 30, 2009, 06:47 PM
Without reading through all the posts, I want to know if a dedicated h264 encoder/decoder processor is something that could be used by OpenCL?

hugodrax
Aug 30, 2009, 07:04 PM
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.195 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
Device 1 is an: CPU with max. 2660 MHz and 4 units/cores
Now computing - please be patient....
Segmentation fault


I got this on an '06 Mac Pro with the Apple 4870.

The 4870 seems to be a suckass OpenCL board

dicklacara
Aug 30, 2009, 07:15 PM
It's a shame the Constitution restricts a trillion dollar company to releasing one product at a time, three times a year. :rolleyes:

It's not the Constitution it's the Federal agencies, like:

FCC, BFD, LMAO, NASA, RTFM, IRS, GFU, CIA, OMFG, SOX, AFAIK....

hefeglass
Aug 30, 2009, 07:19 PM
REWRITE handbrake for openCL....my mini would HAUL on conversions..

right now, i use my i7 system because I can do a full movie in 12 min..
hope the mini will be faster on converting soon enough!

Chop69
Aug 30, 2009, 07:24 PM
REWRITE handbrake for openCL....my mini would HAUL on conversions..

right now, i use my i7 system because I can do a full movie in 12 min..
hope the mini will be faster on converting soon enough!

Judging by what the developers have said over on the Handbrake forums, they don't plan on using OpenCL anytime soon. Or Grand Central...

http://forum.handbrake.fr/viewtopic.php?f=5&t=11239

Chop69
Aug 30, 2009, 07:28 PM
I assume we have the same MacBook Pro, but why are my 9400M results so much better than yours?


Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400M
Device 0 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 3.493 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 12.962 seconds


EDIT: If I turn the 9600M GT on, the OpenCL result for the 9400M drops dramatically and matches yours:

OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.785 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.022 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 2 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 13.376 seconds


I was trying to figure this out as well, because I was getting results similar to 001's, but now I'm getting results similar to yours. The only thing that changed was when I got the lower 9400 score, my MBP was plugged in, but not fully charged. When the charge completed, I got a 9400 score in the 3 sec range.

VirtualRain
Aug 30, 2009, 07:30 PM
I wouldn't put a lot of faith in this benchmark.

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.228 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3540 @ 2.93GHz
Device 1 is an: CPU with max. 2925 MHz and 8 units/cores
Now computing - please be patient....
Segmentation fault

It thinks the 4870 only has 4 cores, and it can't even run without an error on my new Xeon.

ovrlrd
Aug 30, 2009, 07:32 PM
The more you are using your GPU for other things, the slower OpenCL runs. This is probably pretty obvious to some people, but it might explain why some of the benchmarks look odd. If you have a bunch of windows showing on your desktop, that uses your GPU more; if you then add a video playing in the background, that adds more. The best way to run this benchmark is to minimize the number of apps running in the background.

My GTX 285 was showing 3 seconds when I had a Remote Desktop session open, but after I closed it I saw it running at .2 seconds.

Ultimately, though, as many have pointed out, this benchmark doesn't really mean anything. I can't wait to see some real OpenCL apps released.
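Since a single run can swing from 3 seconds to .2 seconds depending on what else the GPU is doing, one way to get a steadier number is to repeat the measurement and look at the best and median times rather than one sample. A minimal sketch, assuming `workload` is any callable that performs one benchmark run (it is a hypothetical stand-in here; how the actual benchmark tool is launched is not shown):

```python
import statistics
import time


def time_runs(workload, repeats=5):
    """Call workload() several times and return the wall-clock timings."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Best, median and worst time of a list of timings (in seconds)."""
    return {
        "best": min(timings),
        "median": statistics.median(timings),
        "worst": max(timings),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a call that runs the real benchmark.
    print(summarize(time_runs(lambda: sum(range(100_000)))))
```

A large gap between the best and worst time is a hint that something else was competing for the device during some of the runs.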

*LTD*
Aug 30, 2009, 07:34 PM
Judging by what the developers have said over on the Handbrake forums, they don't plan on using OpenCL anytime soon. Or Grand Central...

http://forum.handbrake.fr/viewtopic.php?f=5&t=11239

Just leaves the door open for someone else to do it. OS X development has taken off. Even MS seems to be paying attention to its Mac offerings. Either developers evolve their apps or no one will bother using them.

Erasmus
Aug 30, 2009, 08:01 PM
These results lead me to the idea:

For many people now, the 8 core Mac Pro would be pointless. Simply go for the entry level quad, throw in four GT 120's, and you'll get far better performance using OpenCL than you would for the top of the line 8 core, for far less cost. Spend the money on RAM.

All we really need now is for pro applications like Matlab to use OpenCL, and the dreams of Scientists and Engineers like me across the globe will come true!

stevemiller
Aug 30, 2009, 08:03 PM
from the SL features page:

"Because it’s built into the heart of Snow Leopard, QuickTime X uses Mac OS X technologies such as Cocoa, Grand Central Dispatch, and 64-bit computing to deliver greatest-possible performance and enables QuickTime Player to launch up to 2.4x faster."

does anyone else find it strange that apple has added support for opencl in their OS but doesn't seem to have utilized it in any of their actual software? the new media encoding tools in quicktime X seem like a prime candidate for leveraging the gpu, but it seems that only h264 decoding is gpu accelerated.

it doesn't feel very encouraging for the adoption of the technology when they don't even use it themselves.

full-disclosure: i do media stuff for a living and i'm mostly just chomping at the bit for some real-world evidence that opencl might give my system an added boost!

eMagine
Aug 30, 2009, 08:19 PM
Here's mine:

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.726 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
Device 1 is an: CPU with max. 2660 MHz and 4 units/cores
Now computing - please be patient....
time used: 7.324 seconds

twoodcc
Aug 30, 2009, 08:31 PM
alright! my macbook pro 15" is one of the machines that i still have to install snow leopard on

itsthenewdc
Aug 30, 2009, 08:59 PM
Yes, I've noticed that, but here's an 8800 GT beating my GTX 285 by a large margin! :eek:



I've also noticed a lot of other less powerful GPU's beating my GTX 285… something must be wrong… :(

Mine is the same way.. not sure what's up.. Glad to see it's not just mine though.

DAMNiatx
Aug 30, 2009, 09:19 PM
Did you not even bother to read the article? It stated clearly that only 1 GPU is active at a time. With the unibody MBP's only 1 GPU can be accessed at a time, and you have to log out to switch between them. I don't get why people think that wouldn't be the case with OpenCL. The OS only sees 1 of the GPUs at a time.

Come on, you never tried this.
Snow Leopard can use BOTH GPUs at the same time. No need to log out anymore.
Just change to performance mode and you can use both GPUs with OpenCL.

pilotError
Aug 30, 2009, 09:30 PM
I wanted to join the party!


...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 260
Device 0 is an: GPU with max. 1242 MHz and 192 units/cores
Now computing - please be patient....
time used: 0.357 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
Device 1 is an: CPU with max. 3800 MHz and 4 units/cores
Now computing - please be patient....
time used: 6.403 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

bommai
Aug 30, 2009, 09:34 PM
I agree - hopefully support for my iMac 24" ATI Radeon HD2600 is coming ?

Me too.

Shawn Parr
Aug 30, 2009, 09:38 PM
Did you not even bother to read the article? ...
My bad, my brain must have put the "not" in there in the part about "BOTH GPUS CAN (not) BE USED". I'm man enough to admit when I'm wrong!

I just wanted to say, in the pseudo anonymity of the internet very few have the cojones to actually admit when they were wrong. I really appreciate it, and wanted to say thanks.

Update: I just switched over to the 9400M on my 2008 MBP and tried it again. As you can see, it only ran the test on the 9400M and not the 9600M. So it looks like OpenCL only works with both GPUs on the 2008 models when you are running the 9600M GPU.

Intriguing. I'll have to try this in a bit. I was a bit disappointed that I couldn't use both on this beast of a machine...

Shawn Parr
Aug 30, 2009, 09:50 PM
Come on, you never tried this.
Snow Leopard can use BOTH GPUs at the same time. No need to log out anymore.
Just change to performance mode and you can use both GPUs with OpenCL.

That's not true on my machine, I just had to log in and out both times when switching graphics chips.

I was trying to figure this out as well, because I was getting results similar to 001's, but now I'm getting results similar to yours. The only thing that changed was when I got the lower 9400 score, my MBP was plugged in, but not fully charged. When the charge completed, I got a 9400 score in the 3 sec range.

I saw a big difference in my 9600 scores plugged vs unplugged:

unplugged:

Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 13.587 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.019 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
Device 2 is an: CPU with max. 2530 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.623 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


plugged:

Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.788 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.025 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
Device 2 is an: CPU with max. 2530 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.564 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


But when using the 9600 it definitely can see/use both GPUs. I just wish it could do the same when using the 9400, or maybe give a preference for it. I usually only use the 9400 as I don't need the extra graphics performance, but I wouldn't mind using the 9600 from time to time for OpenCL tasks, especially if I'm plugged in.
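The plugged vs unplugged difference can be put into numbers with a few lines of arithmetic. The timings below are copied from the two runs above; `slowdown_on_battery` is a hypothetical helper name and the device labels are shortened.

```python
# Timings in seconds, copied from the two runs above.
unplugged = {
    "GeForce 9600M GT": 13.587,
    "GeForce 9400M": 9.019,
    "Core 2 Duo T9400": 14.623,
}
plugged = {
    "GeForce 9600M GT": 2.788,
    "GeForce 9400M": 9.025,
    "Core 2 Duo T9400": 14.564,
}


def slowdown_on_battery(unplugged, plugged):
    """Ratio of unplugged to plugged time per device (1.0 = no change)."""
    return {name: unplugged[name] / plugged[name] for name in plugged}


if __name__ == "__main__":
    for name, ratio in slowdown_on_battery(unplugged, plugged).items():
        print(f"{name}: {ratio:.2f}x")
```

On these numbers the 9600M GT takes nearly five times as long on battery, while the 9400M and the CPU stay within about 1% of their plugged-in times.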

ungraphic
Aug 30, 2009, 10:32 PM
They'd better drop some ATI 3870 support for OpenCL. First Apple couldn't get it right shipping the GeForce 8800 GT for 1,1 Mac Pros, and now they can't even get proper support for OpenCL-compatible cards. WTF?

How many of you guys are getting SHAFTED by apple with aftermarket products?

John.B
Aug 30, 2009, 10:53 PM
does anyone else find it strange that apple has added support for opencl in their OS but doesn't seem to have utilized it in any of their actual software? the new media encoding tools in quicktime X seem like a prime candidate for leveraging the gpu, but it seems that only h264 decoding is gpu accelerated.
Ummmm, because it's new and not very many people will be able to take advantage of it on day 1?

it doesn't feel very encouraging for the adoption of the technology when they don't even use it themselves.
I expect stuff in the Pro apps to begin optionally adopting this in their next version. I'm thinking applications like Compressor would be a good choice for this sort of thing.

FWIW, Wil Shipley wrote an interesting piece on this sort of decision making/prioritizing process (http://wilshipley.com/blog/2009/08/pimp-my-code-part-16-heuristics-and.html) recently that's worth the read if you have the time to invest in it. A good window into the (IMO) correct decision making process for software development.

full-disclosure: i do media stuff for a living and i'm mostly just chomping at the bit for some real-world evidence that opencl might give my system an added boost!
I do have a couple of your albums. Always loved that B3 part on Fly Like an Eagle. And many, many years ago I played in a band that did a great Space Cowboy->Space Truckin'->Space Cowboy medley. ;)

itsthenewdc
Aug 30, 2009, 10:55 PM
I've also noticed a lot of other less powerful GPU's beating my GTX 285… something must be wrong… :(

Found the reason behind my problem at least. Went to nVidia's site and put in my 285 on the drivers page. They have a separate CUDA driver that I downloaded and installed and now I get the .2xxx results :]

Scottsdale
Aug 30, 2009, 11:01 PM
I noticed something here.

Those with MBPs which have the Nvidia 9400m are showing 1100 MHz and 16 units/cores.

My MacBook Air 2.13 GHz with 9400m shows 800 MHz and 16 units/cores.

The weird thing is, mine is showing exactly same scores in 9.2 second range of time for the GPU to run test.

What exactly does this mean? The MBA rev B did show 4x GPU performance of original MBA. However, this newest model shows 6x GPU performance of original MBA. Is the MBA's GPU being throttled? It does show my CPU running at 2.13 GHz.

I haven't seen what some with the 9400m in an iMac, Mac mini, and MB are showing for both the clock speed and time for GPU to run test.

Interested in sharing data with others and trying to determine what Apple is doing with the same 9400m GPU in the MacBook Air.

ayeying
Aug 30, 2009, 11:34 PM
I noticed something here.

Those with MBPs which have the Nvidia 9400m are showing 1100 MHz and 16 units/cores.

My MacBook Air 2.13 GHz with 9400m shows 800 MHz and 16 units/cores.

The weird thing is, mine is showing exactly same scores in 9.2 second range of time for the GPU to run test.

What exactly does this mean? The MBA rev B did show 4x GPU performance of original MBA. However, this newest model shows 6x GPU performance of original MBA. Is the MBA's GPU being throttled? It does show my CPU running at 2.13 GHz.

I haven't seen what some with the 9400m in an iMac, Mac mini, and MB are showing for both the clock speed and time for GPU to run test.

Interested in sharing data with others and trying to determine what Apple is doing with the same 9400m GPU in the MacBook Air.

It's been proven that the GPU is throttled.

The other 9400M video cards should be 550MHz or 1100MHz (as read by the benchmark).

For some reason, we have 400MHz or 800MHz (as read by the benchmark),

but in reality, we should have a 300MHz or 350MHz video card (depending on the drivers in Windows), and the others (MB/iMac/Mini, etc.) should have a 450MHz core speed.

Here's what I got:

MBA-SLMac-JY:~ Jimmy$ /Users/Jimmy/Downloads/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V025
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 9400M
Device 0 is an: GPU with max. 800 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.692 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU L9600 @ 2.13GHz
Device 1 is an: CPU with max. 2130 MHz and 2 units/cores
Now computing - please be patient....
time used: 17.224 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

macintoshtoffy
Aug 30, 2009, 11:35 PM
Is the Intel GMA-based graphics a separate GPU or something that is used by the main processor? If the memory is shared I'd think it's not but don't exactly know. I do know that GMA graphics aren't very fast/capable or recommended for high end graphics like CAD, etc...

Yep. Others have already said no, it's not...

I'd say that in some cases, even if the GMA could support OpenCL, it would probably perform worse than the CPU itself.

Erasmus
Aug 30, 2009, 11:40 PM
I'd say that in some cases, even if the GMA could support OpenCL, it would probably perform worse than the CPU itself.

Possibly, but it's still extra resources that the computer won't use. Better to have something that barely works than nothing at all.

JFreak
Aug 31, 2009, 12:34 AM
Is this the June-2007 Santa Rosa model? Seems my old MBP gets the OpenCL love after all :)


Just tried on my lowly late '07 MBP (8600M GT with 256MB)

CLBench_as_terminal_tool/OpenCL2_Bench_V025 ; exit;
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 4.579 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.328 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout


With power adapter attached:

/Users/marcosscriven/Downloads/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V025 ; exit;
Marcos-Scrivens-MacBook-Pro:~ marcosscriven$ /Users/marcosscriven/Downloads/OpenCLBench_as_terminal_tool/OpenCL2_Bench_V025 ; exit;
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.362 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.657 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


The thing that's disappointing is that this is just a benchmark. I know this is new, but surely Apple had time to work on actually making something useful to showcase this tech?

I'm surprised Apple didn't build this into their latest QuickTime in Snow Leopard, to show off transcoding (exporting) HD movie files to iPhone files around 10 times faster! Now *that* would be useful...

JFreak
Aug 31, 2009, 12:38 AM
Sweet. Now all us 8600 owners have to do is pray that our cards don't burn up....

....or do so while under warranty :D

JFreak
Aug 31, 2009, 12:43 AM
The day OpenCL meets Photoshop + the rest of the Adobe Creative Suite will be a very happy day indeed.

Sadly, that day may never come. Adobe currently maintains cross-platform compatibility and does not use Apple/MS proprietary APIs to their fullest. They want the two versions to share code as much as possible.

They should just re-write much of their stuff to make the EXISTING features shine. Even if that means zero new features. I'd love it if the CS5 would finally be one worth buying and sticking to for some time...

(loved CS1 until it became obvious PPC would be obsoleted, reluctantly bought into CS3 only to recently find out it's not supported on SL.)

joelypolly
Aug 31, 2009, 12:44 AM
Possibly, but it's still extra resources that the computer won't use. Better to have something that barely works than nothing at all.
If something barely works it is not worth the time and effort to certify it. If you are going to do something badly I would prefer you not to do it at all.

Erasmus
Aug 31, 2009, 01:00 AM
If something barely works it is not worth the time and effort to certify it. If you are going to do something badly I would prefer you not to do it at all.

Obviously by "badly" I mean slowly, not wrong. And considering the 5 fold speed increase (in this crappy benchmark, granted) between the GPU and CPU in a Macbook Pro, I would think that even the GMA950 or whatever it is would bring performance approximately that of the CPU in a Macbook. Which of course would result in dramatic speed increases, once combined with the CPU.
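For what it's worth, the GPU-vs-CPU ratio can be read straight off the two 8600M GT runs pasted earlier in this thread. A quick sketch of the arithmetic follows; the run labels and the `speedup` helper are just names picked for illustration, and the timings are copied from those posts.

```python
# Timings copied from the two 8600M GT runs posted earlier in this thread
# (GeForce 8600M GT vs. Core 2 Duo T7700). Labels are only descriptive.
runs = {
    "first run": {"gpu": 4.579, "cpu": 15.328},
    "with power adapter": {"gpu": 2.362, "cpu": 15.657},
}

def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU finished than the CPU."""
    return cpu_seconds / gpu_seconds

for label, t in runs.items():
    ratio = speedup(t["cpu"], t["gpu"])
    print(f"{label}: GPU finished {ratio:.2f}x faster than the CPU")
```

So on that machine the ratio lands somewhere between about 3x and 7x depending on the run, which says as much about run-to-run variation as about the hardware.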

Another thought, this probably won't happen, because it would take much too much work, and won't be valid for new technology, but with a bit of software trickery, it should be possible to make single precision floating point operation units fake double precision by simply doing multiple calculations, and getting the CPU to stitch it all together at the end. I mean, us humans do it all the time with long division and multiplication. Wouldn't work with nonlinear functions though, so no exponentials. :p
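The "stitching" idea has a well-known form, usually called double-single (or float-float) arithmetic: a value is kept as a pair of single-precision numbers, and the rounding error of each operation is carried in the second one. The sketch below is only an illustration of that technique in plain Python, not anything OpenCL or Apple ships; `f32`, `two_sum`, `split` and `pair_add` are hypothetical helper names, and single precision is simulated by rounding through `struct`.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Single-precision sum of a and b, plus the rounding error that was lost."""
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err

def split(x):
    """Represent x as a (hi, lo) pair of single-precision values."""
    hi = f32(x)
    return hi, f32(x - hi)

def pair_add(x, y):
    """Add two (hi, lo) pairs using only single-precision operations."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))
    return hi, lo

plain = f32(f32(1.0) + f32(1e-8))          # the small term is rounded away
hi, lo = pair_add(split(1.0), split(1e-8))
print(plain, hi + lo)
```

The plain single-precision sum comes back as exactly 1.0, while the pair keeps the 1e-8 term in its low half. A pair like this carries roughly twice the significand bits of one float, which is still short of a true double, and every operation costs several single-precision operations.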

commander.data
Aug 31, 2009, 01:04 AM
You need to be talking to Apple, not AMD. The Radeon HD 2000 series had the hardware necessary for this kind of stuff. As did the 3000 series. Apple have decided not to support them for one reason or another.

If you look on the compatibility page on ATI's site, you'll see in the footnotes that the 2600 series does not support double precision floating point operations...

Now if you're Apple, and you're encouraging people to use this technology, are you going to potentially support something that isn't going to give exactly the same results for calculations that you'd get from the CPU? - NO.

For example:
(dp FP vs sp FP)
4.546677E10 != 4.566E10

If you're doing scientific calculations, and you're expecting double precision and the app is giving you back single precision because you ran it on the GPU instead of the CPU, you're going to be pissed.
I asked AMD's Stream Computing team about the possibility of HD2000 and HD3000 series support for OpenCL when the original Snow Leopard specs for OpenCL support came out and they said that the HD2000 and HD3000 do not and will not support OpenCL due to hardware limitations. They didn't say what the limitation is, but I'm almost certain it has nothing to do with double precision floating point support. For one thing, the nVidia 8000, 9000, and GT100 series don't support DP floats either. Only the ATI HD4000 and nVidia GTX200 series do, which was why I was hoping Apple would go with the HD4670 as the mid-range GPU in the last refresh instead of the 9600M GT and GT100 series. What's more, I'm pretty sure the current OpenCL 1.0 only requires single precision floats, and double precision floats are currently optional.

The more likely reason why the HD2000 and HD3000 series don't support OpenCL is that their memory structure is different. nVidia DX10 GPUs define local memory stores in each Streaming Multiprocessor (SM) allowing groups of 8 Stream Processors (SPs) to share data. If I'm not mistaken, the HD2000 and HD3000 series don't have this local memory store between small groups of SPs and instead have a data store shared by all SPs, which I guess may be less efficient. OpenCL is reportedly closer to nVidia's CUDA than ATI's CTM so it doesn't surprise me if OpenCL's memory model is closer to nVidia's GPU structure. In any case, ATI seems to agree that this is the way to go, since the HD4000 series has local memory stores between groups of 16 (5-way) SPs. This would explain why the HD4000 series is OpenCL compatible.

As well, the HD4000 series still being slower than nVidia GPUs in OpenCL doesn't surprise me. This has been the case in Folding@home, even when the ATI GPUs are running their own native CTM code. nVidia seems to have put more effort into designing their GPUs for GPGPU operations since they are increasingly promoting them in competition to CPUs, which nVidia lacks, whereas AMD already makes CPUs so they don't have the same motivation. The 9600M GT being faster than the GT120 is likely, since the GT120 is basically a rebranded 9500GT which is a budget GPU. Apple's $150 price for the GT120 is about 3 times more than the PC version is worth.

On another note, if the 9400M and 9600M GT can both do OpenCL in parallel, which I was hoping for, can the 9600M GT do OpenGL graphics while the 9400M does OpenCL physics? That'd be great for games, since you'd get more realistic physics without sacrificing anything except power and heat since the 9400M would have been doing nothing anyways. This type of parallel GPU usage would be more worthwhile for Apple to focus on since it's more flexible than SLI. Similarly, a Mac Pro could use say a HD4870 for graphics and a second GT120 for physics.

EDIT: I remembered the HD3000 supports double precision floats as well (like the HD4000 and GTX200), but again that doesn't appear to be the reason for the lack of OpenCL support.

lilyyin99
Aug 31, 2009, 01:16 AM
Interesting that the CPU beats the 4870 on the Pro!

iAlex
Aug 31, 2009, 01:17 AM
I'm holding out for Windows 7.................... HAHAHAHA NOT!!!!!!!

netkas
Aug 31, 2009, 01:32 AM
I asked AMD's Stream Computing team about the possibility of HD2000 and HD3000 series support for OpenCL when the original Snow Leopard specs for OpenCL support came out and they said that the HD2000 and HD3000 do not and will not support OpenCL due to hardware limitations. They didn't say what the limitation is, but I'm almost certain it has nothing to do with double precision floating point support. For one thing, the nVidia 8000, 9000, and GT100 series don't support DP floats either. Only the ATI HD4000 and nVidia GTX200 series do, which was why I was hoping Apple would go with the HD4670 as the mid-range GPU in the last refresh instead of the 9600M GT and GT100 series. What's more, I'm pretty sure the current OpenCL 1.0 only requires single precision floats, and double precision floats are currently optional.

The more likely reason why the HD2000 and HD3000 series don't support OpenCL is that their memory structure is different. nVidia DX10 GPUs define local memory stores in each Streaming Multiprocessor (SM) allowing groups of 8 Stream Processors (SPs) to share data. If I'm not mistaken, the HD2000 and HD3000 series don't have this local memory store between small groups of SPs and instead have a data store shared by all SPs, which I guess may be less efficient. OpenCL is reportedly closer to nVidia's CUDA than ATI's CTM so it doesn't surprise me if OpenCL's memory model is closer to nVidia's GPU structure. In any case, ATI seems to agree that this is the way to go, since the HD4000 series has local memory stores between groups of 16 (5-way) SPs. This would explain why the HD4000 series is OpenCL compatible.

As well, the HD4000 series still being slower than nVidia GPUs in OpenCL doesn't surprise me. This has been the case in Folding@home, even when the ATI GPUs are running their own native CTM code. nVidia seems to have put more effort into designing their GPUs for GPGPU operations since they are increasingly promoting them in competition to CPUs, which nVidia lacks, whereas AMD already makes CPUs so they don't have the same motivation. The 9600M GT being faster than the GT120 is likely, since the GT120 is basically a rebranded 9500GT which is a budget GPU. Apple's $150 price for the GT120 is about 3 times more than the PC version is worth.

On another note, if the 9400M and 9600M GT can both do OpenCL in parallel, which I was hoping for, can the 9600M GT do OpenGL graphics while the 9400M does OpenCL physics? That'd be great for games, since you'd get more realistic physics without sacrificing anything except power and heat since the 9400M would have been doing nothing anyways. This type of parallel GPU usage would be more worthwhile for Apple to focus on since it's more flexible than SLI. Similarly, a Mac Pro could use say a HD4870 for graphics and a second GT120 for physics.

EDIT: I remembered the HD3000 supports double precision floats as well (like the HD4000 and GTX200), but again that doesn't appear to be the reason for the lack of OpenCL support.


and the answer is..... compute shaders

http://forums.amd.com/forum/messageview.cfm?FTVAR_FORUMVIEWTMP=Linear&catid=328&threadid=116102

>Pixel Shader code (if not using some special stuff like double precision) runs on all cards. Compute shaders only on the HD4000 series.


Apple could implement OpenCL for older Radeons via pixel shaders and GLSL, but they didn't want to.

awulf
Aug 31, 2009, 01:36 AM
Here's the result from mine (using a Nvidia GTX 275):


...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GTX 275
Device 0 is an: GPU with max. 1460 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.333 seconds

OpenCL Device # 1 = Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
Device 1 is an: CPU with max. 2700 MHz and 8 units/cores
Now computing - please be patient....
time used: 4.297 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

John.B
Aug 31, 2009, 01:50 AM
Obviously by "badly" I mean slowly, not wrong. And considering the 5 fold speed increase (in this crappy benchmark, granted) between the GPU and CPU in a Macbook Pro, I would think that even the GMA950 or whatever it is would bring performance approximately that of the CPU in a Macbook.
No. The GMA950 really is that craptacular.

There is a reason that Apple went to all the trouble to break with Intel and its integrated graphics (if you can call it that) chip and finally transitioned to Nvidia graphics instead.

DAMNiatx
Aug 31, 2009, 01:52 AM
That's not true on my machine, I just had to log in and out both times when switching graphics chips.


we are not switching, we use BOTH graphics chips. :rolleyes:

Erasmus
Aug 31, 2009, 01:54 AM
No. The GMA950 really is that craptacular.

There is a reason that Apple went to all the trouble to break with Intel and its integrated graphics (if you can call it that) chip and finally transitioned to Nvidia graphics instead.

Personally, I don't believe that the GMA950 is bad enough to not bring noticeable speed gains over just the CPU using OpenCL. 10% is better than nothing. Unfortunately, as it is not supported, we will never know.

John.B
Aug 31, 2009, 02:02 AM
Personally, I don't believe that the GMA950 is bad enough to not bring noticeable speed gains over just the CPU using OpenCL. 10% is better than nothing. Unfortunately, as it is not supported, we will never know.
10% (assuming even that) is not better than nothing. I would rather see them devote effort to supporting newer graphics cards that will provide a more substantial boost than 2% or 5% or 10%. It's even possible the overhead could make the entire operation slower.

The Intel shared memory on-board graphics were once at least competitive, but they got lazy and the graphics chip industry passed them by. Even the newer GMA X3100 (which I have in my Santa Rosa blackbook) would be woefully inadequate for this type of operation, and that's nobody's fault but Intel's.

JFreak
Aug 31, 2009, 02:03 AM
Personally, I don't believe that the GMA950 is bad enough to not bring noticeable speed gains over just the CPU using OpenCL. 10% is better than nothing. Unfortunately, as it is not supported, we will never know.

I believe total performance would likely go DOWN if the GMA950 were used. Remember that it also takes time to arrange things for processing in multiple units, so scheduling for that crap can easily cost more than the payload --> what's the point? Plus, if the ATI HD2xxx series hardware is not capable of OpenCL, then how on earth would this integrated crap be better?
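The overhead argument can be put into rough numbers. The sketch below is an idealized model with made-up figures, not a measurement of any real chip: it assumes a helper device that adds a fixed fraction of the CPU's throughput (10%, as floated above) and a fixed coordination cost per job. `combined_time` and all the values fed into it are hypothetical.

```python
def combined_time(cpu_alone_s, extra_fraction, overhead_s):
    """Idealized time when a helper device adds `extra_fraction` of the
    CPU's throughput but costs a fixed `overhead_s` to coordinate."""
    return cpu_alone_s / (1.0 + extra_fraction) + overhead_s

cpu_alone = 15.0                     # hypothetical CPU-only time in seconds
for overhead in (0.5, 1.0, 2.0):     # hypothetical scheduling/transfer costs
    t = combined_time(cpu_alone, 0.10, overhead)
    verdict = "faster" if t < cpu_alone else "slower"
    print(f"overhead {overhead:.1f}s -> {t:.2f}s total ({verdict} than CPU alone)")
```

Under these assumptions a 10% helper only saves about 1.4 seconds out of 15, so any coordination cost above that makes the combined run slower than the CPU by itself.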

PurpleLogix
Aug 31, 2009, 02:05 AM
This is with my FLASHED 4870 1gb in a 2006 MP

Number of OpenCL devices found: 2
OpenCL Device # 0 = Radeon HD 4870
Device 0 is an: GPU with max. 750 MHz and 4 units/cores
Now computing - please be patient....
time used: 4.066 seconds

Drivers need to be updated, only 4 cores!!!!

mrtrilby
Aug 31, 2009, 02:11 AM
On my late 2008 unibody MBP, when it runs the test on the 9400 GPU the mouse becomes pretty unresponsive. I assumed this was because the 9400 can't handle updating the screen at the same time as handling other computer tasks.

However, when I change energy saver to "higher performance" to bring in the 9600 GPU, I assume that the graphics are now being handled by the 9600. However, when it runs the test on the 9400 GPU, the mouse still goes unresponsive and other screen updates pause / become less frequent. When it moves on to run the test on the 9600 GPU, the screen updates fine.

Does that imply that my slower 9400 GPU is handling screen updating all of the time regardless of energy saver settings? It would explain why I see no difference between the two settings other than how hot the machine runs and battery life. Has anyone else with a unibody MBP noticed this?

Evangelion
Aug 31, 2009, 02:15 AM
On my late 2008 unibody MBP, when it runs the test on the 9400 GPU the mouse becomes pretty unresponsive. I assumed this was because the 9400 can't handle updating the screen at the same time as handling other computer tasks.

However, when I change energy saver to "higher performance" to bring in the 9600 GPU, I assume that the graphics are now being handled by the 9600.

You would assume wrong. Changing the active GPU which handles graphics requires you to log out and back in again.

mixel
Aug 31, 2009, 02:22 AM
Could these performance increases be leveraged by virtualisation apps? I suspect not; i just thought it'd be funny being able to have a big performance increase in Windows (etc) that isn't present when they run natively.

Kaptajn Haddock
Aug 31, 2009, 02:42 AM
It's ridiculous that my one year old iMac with Radeon HD 2600 Pro is not supported.

Pobbit
Aug 31, 2009, 02:44 AM
OK. I'm not a technical person :eek:, but technologies like OpenCL and GC interest me. Actually, anything that promises to speed up my system interests me.

Now for a real world question...

I have a lot of bookmarks in Firefox. When I click on bookmarks on the menu, it takes about a second to drop down. And yes, I'll probably be switching to Safari. But if I were to continue to use Firefox, would this sort of thing speed up? Will the overall system interface become more fluid? Will the pretty icons in the dock change size faster and more smoothly?

I know it may sound silly, but one of the reasons I was so impressed by Macs was how smoothly the interface operates. That's what I'm going to be using almost every second I use my computer, so that's what I'm most interested in.

Thanks.

depulse
Aug 31, 2009, 03:23 AM
When will Apple release an update for Logic that allows Logic to offload some of the effects and synth plugins to the GPU?

No need for dedicated DSP solutions anymore (except for being used as a hardware dongle). UAudio on a Macbook Pro without the Expresscard.

chaosbunny
Aug 31, 2009, 03:25 AM
I too have a HD2600 iMac, bought 09/2007 and I don't believe my iMac to be obsolete just because ONE single feature of 10.6 is not supported. My machine is still plenty fast for my tasks (mostly graphic design). I can still install Snow Leopard, which even without OpenCL, should make the system a bit faster.

We don't even know OpenCL's full potential yet because there are no useful apps out there that use it. I'm not mad because I can't run some benchmarking tool. :rolleyes:

The fact that some systems may get speed gains in some apps in the future won't make my system run slower. And I'll happily continue to use it for the next 2-3 years for Adobe CS4(5/6 sometime), Final Cut Express, Cinema 4D, ...

jglavin
Aug 31, 2009, 03:26 AM
Does that imply that my slower 9400 GPU is handling screen updating all of the time regardless of energy saver settings? It would explain why I see no difference between the two settings other than how hot the machine runs and battery life. Has anyone else with a unibody MBP noticed this?
Not sure about why the benchmark is causing these effects, and I did notice the same type of lag on my 2008 uMBP; however, I am sure that there is a noticeable difference in performance when running a game such as X-Plane on the 9400 vs. 9600GT. It is able to render much more detail and at a higher framerate on the 9600.

Erasmus
Aug 31, 2009, 03:56 AM
Oops! Seems I forgot how much you guys hate the GMA950, and Intel graphics in general! Must be because I don't have one. :p

*LTD*
Aug 31, 2009, 04:00 AM
FWIW, my results. Early 2008 MBP.


Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 3.084 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 16.009 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

mixel
Aug 31, 2009, 04:07 AM
Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GS
Device 0 is an: GPU with max. 1250 MHz and 64 units/cores
Now computing - please be patient....
time used: 0.942 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU E8235 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 2 units/cores
Now computing - please be patient....
time used: 13.031 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

- My 2008 iMac.. I'm surprised how well this machine is measuring up against much more expensive machines.. ?!

After G
Aug 31, 2009, 04:07 AM
Oops! Seems I forgot how much you guys hate the GMA950, and Intel graphics in general! Must be because I don't have one. :p
No kidding, the GPU test quits after being unable to get the GPU string ...
And the CPU test fails with wrong processor type...

(Core Duo, GMA 950 Mac Mini) :p

CarlosG
Aug 31, 2009, 04:48 AM
I'm pleased that I waited for Apple to finally ship my Mac Pro with the 8800GT.

Bjohnson33
Aug 31, 2009, 05:22 AM
Wow, this is incredible.
This means small, inexpensive laptops with SL installed can suddenly beat big, fat Mac Pros without SL.

It will be interesting to see what comes about in terms of speed increases over the next 12 months. It sounds like Apple has laid a lot of groundwork - if the developers manage to harness that power, we could be getting some pretty amazing stuff.

mrtrilby
Aug 31, 2009, 05:56 AM
You would assume wrong. Changing the active GPU which handles graphics requires you to log out and back in again.

I realise that I need to log out and back in again to switch graphics adaptors. The point still stands though - screen updates are a bit screwy when the 9400 GPU is working on computation tasks, regardless of which GPU is supposed to be handling screen updates. It's a relief to see that someone else sees the same thing.

dernhelm
Aug 31, 2009, 06:18 AM
I'm a little bummed that my ATI x1600 isn't supportable. I would really like to play with this feature some. I've been working with GCD this weekend, and it is really awesome.

I know I can do OpenCL programming with just my CPU, but it isn't the same.

Definitely looking harder at a Macbook Pro for my next computer, if OpenCL can use both GPUs at once. While it won't make a difference to 97% of the programs out there, the other 3% interests me A LOT!

:D

Neodym
Aug 31, 2009, 06:44 AM
Wow, this is incredible.
This means small, inexpensive laptops with SL installed can suddenly beat big, fat Mac Pros without SL.

... and then the big, fat Mac Pro gets fully loaded with nVidia graphics cards and updated to SL and it will run circles around even expensive laptops!

commander.data
Aug 31, 2009, 06:54 AM
and the answer is..... compute shaders

http://forums.amd.com/forum/messageview.cfm?FTVAR_FORUMVIEWTMP=Linear&catid=328&threadid=116102

>Pixel Shader code (if not using some special stuff like double precision) runs on all cards. Compute shaders only on the HD4000 series.


Apple could implement OpenCL for older Radeons via pixel shaders and GLSL, but they didn't want to.
I knew that the original Brook compiled to Pixel Shaders allowing GPGPU work on DX9 level GPUs but I didn't know Brook+ still does it with DX10 GPUs. In any case, ATI being able to compile their own Brook+ language into Pixel Shaders doesn't necessarily mean that OpenCL could work. Especially if the problem is incompatible memory structure in the HD2000 and HD3000 series. Plus, if the HD4000 series doesn't have great performance in OpenCL when it's able to talk natively to the GPU, I doubt the performance of OpenCL emulated in Pixel Shaders on older generation GPUs would be stellar so it probably wouldn't be worthwhile even if it were technically possible.

Gray-Wolf
Aug 31, 2009, 07:05 AM
Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
Device 0 is an: CPU with max. 2200 MHz and 2 units/cores
Now computing - please be patient....
time used: 16.994 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

Not sure what it means for me.

PinkyMacGodess
Aug 31, 2009, 07:10 AM
Agreed. This is pretty much a worthless benchmark. There's nothing complex taking place, no hard memory thrashing, no difficult calculations.

It's just taking 2 arrays with 5000 numbers in them and adding them together into a new array of 5000 numbers. Simple atomic add operations of 2 numbers over and over and over. The more cores you have the more you can split the array up (4 cores = each core processes 1250 items, 32 cores = each core processes 156.25 items.) and the faster clock means that each item gets processed faster.

This is NOT AT ALL indicative of a real OpenCL app that will be doing hundreds of thousands of difficult computations, with dependencies between the data and working across huge datasets.

I'd say ignore every result we get out of this app. It's not at all indicative of real-world performance in the least.
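Taking the description above at face value (two 5000-element arrays added element by element, with the work divided among the cores), the workload would look something like the sketch below. This is only an illustration of that description in plain Python, not the benchmark's actual source; `chunk_bounds` and `chunked_add` are hypothetical names, and the "workers" run one after another here rather than in parallel.

```python
def chunk_bounds(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal slices."""
    base, extra = divmod(n_items, n_workers)
    bounds, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds

def chunked_add(a, b, n_workers):
    """Element-wise a + b, computed one slice per (simulated) worker."""
    out = [0.0] * len(a)
    for lo, hi in chunk_bounds(len(a), n_workers):
        for i in range(lo, hi):      # on a real device these would run in parallel
            out[i] = a[i] + b[i]
    return out

a = [float(i) for i in range(5000)]
b = [2.0 * i for i in range(5000)]
result = chunked_add(a, b, 32)
sizes = [hi - lo for lo, hi in chunk_bounds(5000, 32)]
print(min(sizes), max(sizes))
```

Since 5000 does not divide evenly by 32, the "156.25 items per core" works out in practice to some workers taking 156 elements and others 157. Each element is touched once and no element depends on any other, which is why the workload says so little about code with real data dependencies.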

I was waiting for someone to say this. This 'test' is rather worthless. It does no 'real' testing. It 'stresses' nothing. It 'tests' nothing but floating point performance. That is perhaps an 'indication' of future performance but not direct and meaningful 'proof' that one board would be faster than another.

It's a simplistic 'test' that misses the whole performance potential of the card and the ancillary services that make up the card.

IE: You could have a card with a screaming GPU and a very slow memory bus that couldn't be tested by this program resulting in extreme tests and yet crappy real world performance.

A truer test, if possible, would be for the program to have the card display a complex graphic, using many colours and textures/polygons, etc and then time the production of the result. But then you still can't get away from the principle that just by measuring the production of the test, you are also influencing the result to some degree.

A test only tests what it was designed to test. Nothing more. These numbers are meaningful on their own but really don't prove anything except for potential. Nothing tests better than the real world...

commander.data
Aug 31, 2009, 07:22 AM
I was waiting for someone to say this. This 'test' is rather worthless. It does no 'real' testing. It 'stresses' nothing. It 'tests' nothing but floating point performance. That is perhaps an 'indication' of future performance but not direct and meaningful 'proof' that one board would be faster than another.

It's a simplistic 'test' that misses the whole performance potential of the card and the ancillary services that make up the card.

IE: You could have a card with a screaming GPU and a very slow memory bus that couldn't be tested by this program resulting in extreme tests and yet crappy real world performance.

A truer test, if possible, would be for the program to have the card display a complex graphic, using many colours and textures/polygons, etc and then time the production of the result. But then you still can't get away from the principle that just by measuring the production of the test, you are also influencing the result to some degree.

A test only tests what it was designed to test. Nothing more. These numbers are meaningful on their own but really don't prove anything except for potential. Nothing tests better than the real world...
I'm assuming a real world application would also not use the GPU in isolation so it isn't really a GPU vs CPU competition. The GPU and CPU would probably have to work together, passing data back and forth. As you mentioned, it is also important to see how well the OpenGL and OpenCL pipelines work together for visualization applications. I think it'd also be interesting to see how smart the OpenCL scheduler is if there are multiple OpenCL applications and say Core Image/Animation applications requesting GPU time.

AtlasBoy
Aug 31, 2009, 07:24 AM
Did you not even bother to read the article? It stated clearly that only 1 GPU is active at a time. With the unibody MBP's only 1 GPU can be accessed at a time, and you have to log out to switch between them. I don't get why people think that wouldn't be the case with OpenCL. The OS only sees 1 of the GPUs at a time.

Ummm - maybe you need to read the article again. It says:

"Most interesting is that for owners of high end MacBook Pros which contain both 9400M and 9600M GT graphics cards, both GPUs can be used at any time by OpenCL. In contrast, both of these GPUs can not be used for general graphics processing and requires a Mac OS X logout to switch from one to another."

Shawn Parr
Aug 31, 2009, 07:28 AM
we are not switching, we use BOTH graphics chips. :rolleyes:

I guess I don't see what you are implying here. The OpenCL test can use both graphics chips when the machine is set to 'Higher Performance.'

However if you are set to 'Better Battery Life' then only the 9400 is available, and neither the graphics nor OpenCL can use the 9600, thus only using one GPU, not both.

In order to switch between 'Higher Performance' and 'Better Battery Life' you must log out and back in.

Personally I always run my machine in 'Better Battery Life' since I don't need the performance on the 9600 for graphics. And I think the 9600 is unavailable in this mode, even for OpenCL, for the obvious reason that it sucks more power. However I'd love to see an option to allow OpenCL to see/use the 9600 when in 'Better Battery Life' when the machine is plugged in, so that in the rare tasks that use OpenCL you could get that extra boost.

PinkyMacGodess
Aug 31, 2009, 07:32 AM
you know the event on the 9th is just an iPod event.

Oh yeah. I forgot. They introduced the iPhone 3GS with a bunch of other stuff. Call the Keystone Cops (the FTC)...

What could be more complementary for a screaming fast iPod that will make pancakes and french toast than a ripping fast iMac with either a Core i7 or a Quad Core processor! (Or Xeon?)

PinkyMacGodess
Aug 31, 2009, 07:34 AM
I'm assuming a real world application would also not use the GPU in isolation so it isn't really a GPU vs CPU competition. The GPU and CPU would probably have to work together, passing data back and forth. As you mentioned, it is also important to see how well the OpenGL and OpenCL pipelines work together for visualization applications. I think it'd also be interesting to see how smart the OpenCL scheduler is if there are multiple OpenCL applications and say Core Image/Animation applications requesting GPU time.

Like a runner on a treadmill. Yeah, they are fast for a 5 minute 'run'. Put them on an outdoor 5k? In Denver? Well, that's different, ain't it...

newfoundglory
Aug 31, 2009, 07:36 AM
I get really odd results running this OpenCL thing on my Early 2008 MBP. The first time I ran it... it took more than 14 seconds on the GPU... which is an 8600M GT

The second time I ran it I got under 3 secs:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.931 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.104 seconds


... but sometimes when I run it, it takes more than 4 seconds on the GPU...

ibosie
Aug 31, 2009, 07:49 AM
Mac Pro 3.2Ghz Early 2008

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.686 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU X5482 @ 3.20GHz
Device 1 is an: CPU with max. 3200 MHz and 8 units/cores
Now computing - please be patient....
time used: 2.848 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

filmweaver
Aug 31, 2009, 07:51 AM
Here is mine:
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 1
OpenCL Device # 0 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 0 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.248 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

dwdrums
Aug 31, 2009, 07:56 AM
aside from the fact that this benchmark is bogus...

Aren't all of you uMBP owners completely ********** ecstatic that both GPUs can run OpenCL at the same time? This is amazing.

:):):):)

lssmit02
Aug 31, 2009, 08:02 AM
The 8800 scored a 0.760. Nice!

wizard
Aug 31, 2009, 08:04 AM
Agreed. This is pretty much a worthless benchmark. There's nothing complex taking place, no hard memory thrashing, no difficult calculations.

It's just taking 2 arrays with 5000 numbers in them and adding them together into a new array of 5000 numbers. Simple atomic add operations of 2 numbers over and over and over. The more cores you have the more you can split the array up (4 cores = each core processes 1250 items, 32 cores = each core processes 156.25 items.) and the faster clock means that each item gets processed faster.

This is NOT AT ALL indicative of a real OpenCL app that will be doing hundreds of thousands of difficult computations, with dependencies between the data and working across huge datasets.

I'd say ignore every result we get out of this app. It's not at all indicative of real-world performance in the least.

Repeating: people should read the above closely. This test is of limited value.

I'm interested, though, in info related to what GPU improvements Apple has made in Quartz. The interface is so much better that I have to believe Apple has expanded GPU use. The question is this: is that via OpenCL or through Quartz's traditional acceleration?


Dave

MrENGLISH
Aug 31, 2009, 08:14 AM
my 3.06GHz iMac with 8800 GS results:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GS
Device 0 is an: GPU with max. 1250 MHz and 64 units/cores
Now computing - please be patient....
time used: 0.927 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU E8435 @ 3.06GHz
Device 1 is an: CPU with max. 3060 MHz and 2 units/cores
Now computing - please be patient....
time used: 12.255 seconds

*LTD*
Aug 31, 2009, 08:19 AM
I say we just turn this into a massive epeen-fest.

Skycutter
Aug 31, 2009, 08:24 AM
When will Apple release an update for Logic that allows Logic to offload some of the effects and synth plugins to the GPU?

No need for dedicated DSP solutions anymore (except for being used as a hardware dongle). UAudio on a Macbook Pro without the Expresscard.

I'm wondering this too...

But will all our AU plug-ins need to be updated too?
Or just Logic?

belltree
Aug 31, 2009, 08:39 AM
Interesting stuff here. Some points to note:

1. When will Apple release OpenCL-supporting versions of the iLife suite, and how dramatic will the improvements be (e.g. iMovie encode times)?
2. What effect will taxing the GPU + CPU have in terms of internal temperatures for all the various models (iMac, Mini, MacBook Pros). I wouldn't worry much about the Mac Pros. Also how much more of a drain will it be on the notebook batteries when not plugged in?
3. How difficult/easy will it be for developers to release new versions of their apps that can take advantage of this?
4. How about Grand Central dispatch, when will we see apps taking full advantage of multiple cores?
5. Will we start seeing some more movement in the games area as a result of this?

The next year or so should be interesting! :)

aardwolf
Aug 31, 2009, 08:54 AM
This is a single cpu, quad-core Nehalem Xeon (2.66Ghz), with the standard GeForce GT 120 (512MB) video card.

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GT 120
Device 0 is an: GPU with max. 1400 MHz and 32 units/cores
Now computing - please be patient....
time used: 1.701 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3520 @ 2.67GHz
Device 1 is an: CPU with max. 2659 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.921 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

*LTD*
Aug 31, 2009, 09:13 AM
This is a single cpu, quad-core Nehalem Xeon (2.66Ghz), with the standard GeForce GT 120 (512MB) video card.

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce GT 120
Device 0 is an: GPU with max. 1400 MHz and 32 units/cores
Now computing - please be patient....
time used: 1.701 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU W3520 @ 2.67GHz
Device 1 is an: CPU with max. 2659 MHz and 8 units/cores
Now computing - please be patient....
time used: 1.921 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout


You have no worries.

Safe to say it blew away my early '08 15-inch 2.4ghz MBP with 4gb RAM. ;)

aardwolf
Aug 31, 2009, 09:20 AM
You have no worries.

Safe to say it blew away my early '08 15-inch 2.4ghz MBP with 4gb RAM. ;)

I only wish I had one at home... I'm lucky I could convince them to buy this for me at work, considering most of my job involves using Visual Studio (which I have to run in Windows via VMWare Fusion.) At least Fusion runs fast. :-)

Michael73
Aug 31, 2009, 09:39 AM
Mac Pro 2.8Ghz Early 2008

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.706 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU X5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.246 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)

Now, based on the crazy/insane times people are reporting for the GTX 285, does it make sense to upgrade?

Also, I know everyone is going gaga over the fact that OpenCL on the uMBP can make use of BOTH GPUs but if I get a GTX 285 and put it in my MP can it make use of BOTH GPUs in that machine too?

eMagine
Aug 31, 2009, 09:46 AM
results from my 2nd Mac Pro - Mac Pro 3,1

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8800 GT
Device 0 is an: GPU with max. 1500 MHz and 112 units/cores
Now computing - please be patient....
time used: 0.697 seconds

OpenCL Device # 1 = Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
Device 1 is an: CPU with max. 2800 MHz and 8 units/cores
Now computing - please be patient....
time used: 3.208 seconds

Stngray
Aug 31, 2009, 09:46 AM
Looks like my Hackintosh fares pretty well.
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce GTX 280
Device 0 is an: GPU with max. 1350 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.315 seconds

OpenCL Device # 1 = GeForce GTX 280
Device 1 is an: GPU with max. 1296 MHz and 240 units/cores
Now computing - please be patient....
time used: 0.309 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 4 units/cores
Now computing - please be patient....
time used: 8.727 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]

davelo
Aug 31, 2009, 09:52 AM
A couple of issues:

(1) several posts have noted the synthetic nature of the benchmark, but unless I'm missing something, even on those terms it seems to be comparing the wrong things.

Instead of comparing just OpenCL performance on GPU versus CPU, shouldn't the comparison also include NON-OpenCL on CPU, i.e., static compiled code, timings?

Otherwise the "leave on CPU" option is being slowed by OpenCL overheads it doesn't need to have?
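A comparison along those lines might be organised like this. It's a rough Python sketch, not the benchmark's code: `time_call` and `relative_to_baseline` are hypothetical helpers, and the two workloads are CPU-only stand-ins. A real run would plug in the native, OpenCL-on-CPU and OpenCL-on-GPU versions of the same computation.

```python
import time


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls to fn."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best


def relative_to_baseline(timings, baseline):
    """Express each timing as a multiple of the baseline timing (baseline == 1.0)."""
    ref = timings[baseline]
    return {name: t / ref for name, t in timings.items()}


# Stand-in workloads: two ways of adding 5000-element lists on the CPU.
a = list(range(5000))
b = list(range(5000))


def plain_loop():
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def zipped():
    return [x + y for x, y in zip(a, b)]


timings = {"plain_loop": time_call(plain_loop), "zipped": time_call(zipped)}
print(relative_to_baseline(timings, "plain_loop"))
```

The point of the baseline column is that every other number is read against plain compiled code on the CPU, so any OpenCL overhead shows up as a ratio above 1.0 instead of being hidden.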

(2) several posts have noted that Macs with dual GPUs have up till now had to choose one or the other for OpenGL, but can use either for OpenCL, and have speculated about the benefits of using both, say one for OpenGL graphics and another for physics.

Even if the setup/take down overheads of OpenCL don't preclude real-time interactions of this kind, these days GPUs often run hotter than CPUs, so running both in a machine which was designed only to run one or the other may not be a thermally sound option — that is to say, you might fry the motherboard.

I'd like to see a statement from Apple on this for the affected machines, and just point out gently that if there is a problem here, the fabled "MacMidiTower" would fix it...

(3) several posts imply that their makers are seriously considering changing their hardware on the basis of this test. If so, you have a "John McEnroe" issue: you *cannot* be serious ;)

longofest
Aug 31, 2009, 09:59 AM
I'd love to see a benchmark of someone who has a Quadro graphics card...

Digitalclips
Aug 31, 2009, 10:01 AM
OMFG... WHY do people get so pissed about this?

Seriously - What does your MB NOT do today that you could do with it yesterday? If OpenCL is your ONLY issue, then I really gotta ask, what programs are you running that you think OpenCL is going to buy you some huge performance improvement with? OpenCL is not a "flip a switch and everything is faster" technology! It's not like a turbocharger that you bolt onto an engine and suddenly the whole car is faster.

Heck, I'm betting there won't even be any decent OpenCL apps out for a good 6 months at least. By that time, your "slightly more than a year old MacBook" will be around 2 years old... Most people I know don't even keep a laptop more than 3 years because they become so outdated in that time.

By the time you'd see any sort of improvements in the apps you run from OpenCL you're going to be due for a new machine anyhow!

CHILL. BREATHE. You didn't get screwed. You have a perfectly good, useful, viable Macbook that you've already gotten a year's worth of use out of, and it will continue to give you another couple years of use...

Well said! Look at me, am I upset my 10 year old iBook G4 is left out? Na! LOL

As a FCPro user for work, I am drooling about my MacPro though! I used to use two graphics cards for two ACD 30s but found FCPro didn't like working across two cards so took one out and shared one GT8800, however I wonder what happens to OpenCL if there were two? Do they even have to have a monitor attached to be used by OpenCL?

Michael73
Aug 31, 2009, 10:07 AM
Well said! Look at me, am I upset my 10 year old iBook G4 is left out? Na! LOL

As a FCPro user for work, I am drooling about my MacPro though! I used to use two graphics cards for two ACD 30s but found FCPro didn't like working across two cards so took one out and shared one GT8800, however I wonder what happens to OpenCL if there were two? Do they even have to have a monitor attached to be used by OpenCL?


That's exactly what I was asking above. If I don't need a second monitor, I see a compelling case to get a GTX 285 and throw it in the machine for the added performance. That said, there's not too many applications on the market that can take advantage of OpenCL yet...

MacFly123
Aug 31, 2009, 10:10 AM
This is AWESOME! :D It's like I just tripled the number of processors in my MacBook Pro! :D

P.S. Does anyone know if the new Final Cut Studio is all optimised for Open CL and Grand Central? I don't remember hearing anything about that when it just came out. Also, was it really all rewritten in Cocoa now? :confused:

Rorikynn
Aug 31, 2009, 10:34 AM
Give it six months or so for software developers to sink their teeth into OpenCL and see how it fits into their applications. Since OpenCL can dynamically run things on either a CPU or GPU (meaning one code base to rule them all) I can't wait for projects like x264, ffmpeg, handbrake, etc, to incorporate OpenCL into their projects. I don't know about all the aspects of encoding but they seem like a perfect fit for each other (data parallelism, SIMD, vector operations on matrices). Suddenly MBPs with 9600M GTs will be performing like Mac Pros before 10.6 & OpenCL came along (with regards to encoding video and audio).

Digitalclips
Aug 31, 2009, 10:35 AM
That's exactly what I was asking above. If I don't need a second monitor, I see a compelling case to get a GTX 285 and throw it in the machine for the added performance. That said, there's not too many applications on the market that can take advantage of OpenCL yet...

Sorry didn't see your post. I have to think Apple will optimize the Pro apps ASAP! I hope someone can answer the question of utilizing additional cards on a Mac Pro. I suspect it may be a different thing from the either or on a MacBook Pro.

doctoree
Aug 31, 2009, 10:44 AM
2. What effect will taxing the GPU + CPU have in terms of internal temperatures for all the various models (iMac, Mini, MacBook Pros). I wouldn't worry much about the Mac Pros. Also how much more of a drain will it be on the notebook batteries when not plugged in?
3. How difficult/easy will it be for developers to release new versions of their apps that can take advantage of this?
4. How about Grand Central dispatch, when will we see apps taking full advantage of multiple cores?
5. Will we start seeing some more movement in the games area as a result of this?


2. No real negative effect. The temps and fan speeds will be comparable to high end gaming. The hardware is built to deal with this.
3. It was stated to be quite easy because it is hardware independent and easy to code, but only apps that work in hugely parallel threads can take full advantage of the GPU. Apps with a lot of syncing and dependencies in their threads will remain CPU apps for the foreseeable future.
4. Probably sooner than OpenCL but the advantages will be MUCH tinier.
5. ID Software is using Open CL in Rage so YES but I doubt the benefits will be HUGE because the GPU is already working hard in Games nowadays.

So Please A Companies (Adobe, Apple, Autodesk) give us full Open CL apps!

JFreak
Aug 31, 2009, 10:52 AM
This is AWESOME! :D It's like I just tripled the number of processors in my MacBook Pro! :D

No... more like :apple: tripled the [count of actively used] processors in your MBP ;)

Does anyone know if the new Final Cut Studio is all optimised for Open CL and Grand Central?

Probably 3ish years, it needs to be written in Cocoa first, and while doing so taking advantage of the new API's...

JFreak
Aug 31, 2009, 10:53 AM
So Please A Companies (Adobe, Apple, Autodesk) give us full Open CL apps!

+1

randyhudson
Aug 31, 2009, 11:25 AM
Am I the only one wondering why an ATI 4870 reports only 4 "units" instead of 800?

Digitalclips
Aug 31, 2009, 12:40 PM
2. No real negative effect. The temps and fan speeds will be comparable to high end gaming. The hardware is built to deal with this.
3. It was stated to be quite easy because it is hardware independent and easy to code, but only apps that work in hugely parallel threads can take full advantage of the GPU. Apps with a lot of syncing and dependencies in their threads will remain CPU apps for the foreseeable future.
4. Probably sooner than OpenCL but the advantages will be MUCH tinier.
5. ID Software is using Open CL in Rage so YES but I doubt the benefits will be HUGE because the GPU is already working hard in Games nowadays.

So Please A Companies (Adobe, Apple, Autodesk) give us full Open CL apps!

Are any current apps in the pro area gaining speed from Snow Leopard at the moment, or are we waiting for updates?

JFreak
Aug 31, 2009, 01:01 PM
Are any current apps in the pro area gaining speed from Snow Leopard at the moment, or are we waiting for updates?

Overall snappiness, yes, taking advantage of new tech, no.

clancemasterj
Aug 31, 2009, 01:12 PM
MBP 4,1 2.4Ghz 15"

...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.369 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T8300 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.873 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)


Why did the 8600M GT beat the unibody's 9600M GT?

CQd44
Aug 31, 2009, 01:30 PM
Why did the 8600M GT beat the unibody's 9600M GT?

If it makes you feel better, a 9600 is little more than a rebranded 8600.

MacFly123
Aug 31, 2009, 01:57 PM
Probably 3ish years, it needs to be written in Cocoa first, and while doing so taking advantage of the new API's...

I thought that is what they did to FCS in this new version and that was why it took so long. So that didn't happen? :( If that is true, I'm even more underwhelmed at the latest update then! :(

fleshman03
Aug 31, 2009, 03:21 PM
If it makes you feel better, a 9600 is little more than a rebranded 8600.

Ah yes. But does the 9600 have the feature of bursting into flames? +1 for the 8600.

Here's my benchmarks.


...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 940 MHz and 32 units/cores
Now computing - please be patient....
time used: 3.056 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
Device 1 is an: CPU with max. 2500 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.710 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]

TripHop
Aug 31, 2009, 03:30 PM
Early 2008 17" MacBook Pro4,1 HD C2D T9500 Penryn @ 2.6GHz GeForce 8600M GT

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.372 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T9500 @ 2.60GHz
Device 1 is an: CPU with max. 2600 MHz and 2 units/cores
Now computing - please be patient....
time used: 13.999 seconds
:)

deconstruct60
Aug 31, 2009, 03:59 PM
Am I the only one wondering why an ATI 4870 reports only 4 "units" instead of 800?

Because "stream processors" (and in some cases, when limited to fixed functionality, "pixel shaders") are not cores. Think of them more as AltiVec/VMX/SSE functional units. Great at math and bit-twiddling tasks on vectors, but they only take those specific kinds of commands.

http://www.tomshardware.com/forum/252674-33-stream-processors
http://en.wikipedia.org/wiki/Shader_(computer_science)

The newer shaders are "programmable" and are tweaked slightly to be more like "general" CPU cores. The stream processors just digest pixels in much more limited ways (here are some bits; do the things you "already know how to do" to them). Besides being very inflexible, they are smaller, so you can drop 800 of them on the die (they can forgo general registers, etc.).




The benchmark does more than just add three vectors. It loops over the three vectors adding them. If there is no "if then branch" functionality in the "stream processor" then it is not what is driving the work. Cores have a full complement of commands (logical branch, tests , exceptions, load stores ... in addition to math and twiddling. )


It would be interesting to see what OpenCL reports for a Cell processor package, with its one Power-based core and several SPEs. The SPEs do their own memory load/store and so on, though. I guess in the OpenCL context it would "ignore" the PPC core and just report the SPEs (since they get different binaries anyway).

Digitalclips
Aug 31, 2009, 04:11 PM
Overall snappiness, yes, taking advantage of new tech, no.

Thanks for reply. So quite a few updates from Apple must be coming soon, they of all people should have their apps currently being updated to take advantage of both Parallel processing and OpenCL ... I wonder when?

JFreak
Aug 31, 2009, 04:27 PM
So quite a few updates from Apple must be coming soon, they of all people should have their apps currently being updated to take advantage of both Parallel processing and OpenCL ... I wonder when?

We have been waiting for properly written (cocoa) version of FCP for longer than I can remember... don't hold your breath.

Bregalad
Aug 31, 2009, 05:06 PM
Well, not small, inexpensive laptops... (do those even exist?), but my Macbook Pro 2.53 with Snow Leopard renders Final Cut Pro footage faster than my 2006 Mac Pro 2.66 that is running on 10.5.8. Of course, if I add Snow Leopard to my Mac Pro, it should even out or be faster. But yeah, this is a good time to have a laptop- it's going to be even better when the apps are coded to optimize Snow Leopard. I never thought I'd see the day a laptop would run faster than my desktop, but it looks like it's here. I might be able to dump the ol' ball and chain pretty soon. :cool:

Take new technology and then compare new hardware with 3 year old hardware. Express surprise when new hardware wins. woohoo.

Now try the same thing with new hardware and you'll see the desktop wins by a healthy margin. If Apple actually made desktops (instead of iMacs and minis) the margin of victory would be huge.

What this is really illustrating is what a small but vocal minority has been saying about Apple for years: they ship the crappiest GPUs they can get away with. When new technology comes along that can make use of the GPU, almost none of their older machines can take advantage of it. Even the Mac Pro from last year shipped with junk that's not compatible with OpenCL.

merlintl
Aug 31, 2009, 06:14 PM
Because "stream processors" (and in some cases, when limited to fixed functionality, "pixel shaders") are not cores. Think of them more as AltiVec/VMX/SSE functional units. Great at math and bit-twiddling tasks on vectors, but they only take those specific kinds of commands.



So, does this mean that for complex OpenCL operations, the current set of nVidia cards (even low end) will blow away an ATI 4870 since it uses stream processors? That would be disappointing for OpenCL uses. The ATI 4870 is a good card in general, but maybe not for OpenCL?

gauchogolfer
Aug 31, 2009, 06:26 PM
If this can be demonstrated reasonably soon in Matlab and Mathematica I might be able to move away from the HP workstation I've got at my desk and get a MP for the heavy lifting. I've been looking at a plugin package for Matlab from AccelerEyes that does CUDA code porting for nVidia GPU acceleration, but it's about a $1000 plugin and I'm not sure about its long-term viability/support. This would be a nice way to go if possible.

randyhudson
Aug 31, 2009, 07:52 PM
Because "stream processors" (and in some cases, when limited to fixed functionality, "pixel shaders") are not cores. Think of them more as AltiVec/VMX/SSE functional units. Great at math and bit-twiddling tasks on vectors, but they only take those specific kinds of commands.

Something still doesn't add up. I don't see how an integrated NVidia GPU could have 16 of anything that a high-end discrete GPU from ATI doesn't have more of. ATI has been offloading video encoding to their GPUs since 2004, so this isn't anything new.

It sounds more like a poorly-written driver (or benchmark) that isn't taking advantage of the ATI GPU.

2002cbr600f4i
Aug 31, 2009, 09:46 PM
Something still doesn't add up. I don't see how an integrated NVidia GPU could have 16 of anything that a high-end discrete GPU from ATI doesn't have more of. ATI has been offloading video encoding to their GPUs since 2004, so this isn't anything new.

It sounds more like a poorly-written driver (or benchmark) that isn't taking advantage of the ATI GPU.

The problem is you're comparing Apples to Oranges (no pun intended). NVidia and ATI, while supporting the same DirectX/OpenGL/OpenCL API's, do things VERY differently in hardware. It's not that the NVidia GPU has 16 of the same thing that a high end ATI card has. They are, in fact, very very different animals.

What we can't tell yet, at least from this "benchmark" (and I cringe to even call it that) is just how much better one manufacturer's design is over the other for OpenCL. Certainly it appears that the NVidia chips are better for this sort of task, but this code isn't really realistic enough to prove it.

I think we'll see more OpenCL support and better examples of code that really stresses the cards more.

Also, keep in mind that MS is introducing the concept of "Compute-Shaders" in DirectX 11, which I believe is coming with Windows 7 (or due within the next 12 months). That concept is VERY similar to OpenCL, except, like everything Microsoft, proprietary. So, if ATI really is behind on performance, you can bet that the loads of Windows users complaining about it will force ATI to improve their cards rapidly, and that SHOULD translate into better OpenCL performance as well.

Time will tell folks!

dergaderg
Sep 1, 2009, 12:46 AM
i just tested this with my 3 year old santa rosa mac book pro and i got
...........................................................
.................. OpenCL Bench V 0.25 by mitch ...........
...... C2D 3GHz = 12 sec vs Nvidia 9600GT = 0,93 sec ......
... time results are not comparable to older version! .....
...........................................................

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.370 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
Device 1 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.545 seconds

Now checking if results are valid - please be patient....
:) Validate test passed - GPU results=CPU results :)
logout

[Process completed]

jkleemann
Sep 1, 2009, 01:48 AM
I wondered why ati cards seem to suck so badly so i took the ao-test (basic raytracer) and tried it on my 4870 (on macpro w. 8 cores). The gpu-test failed ("cannot find function") on the ATI card. So i guess the ATI-drivers are still beta or just flawed.

The nice thing about the ao benchmark is that it reads a cl file instead of having the ocl code inside the binary so i could easily change it.

I changed the code to a simple add loop (each kernel loops over 10000 adds). My Mac Pro outperformed the GPU.

I changed the code to a more common multiply & add, unrolled 10 times, which made the GPU 3 times faster than the 8 cores (@3.0GHz). So I guess the benchmarks show us 3 things right now:

1.) OpenCL works
2.) ATI cards need better support (incompatibility between Nvidia and ATI cards). This is also true for the examples (Perlin noise, ...) which can be downloaded from Apple - they just do not work on my 4870 card.
3.) for simple add loops the higher number of cores on Nvidia cards rules

Sander
Sep 1, 2009, 03:17 AM
I want that GTX 285. Now how do I convince the wife? :D

"Honey, I'm thinking I should spend less time at the computer and more time with you and the kids. Now I think I found a way to make the computer finish the work faster..."

MythicFrost
Sep 1, 2009, 04:28 AM
Could this mean Mac games could use multiple graphics cards? similar to SLI or CrossFire?

Kind Regards

MorphingDragon
Sep 1, 2009, 04:31 AM
Could this mean Mac games could use multiple graphics cards? similar to SLI or CrossFire?

Kind Regards

XFire and SLI are hardware based.

MorphingDragon
Sep 1, 2009, 04:35 AM
It's ridiculous that my one year old iMac with Radeon HD 2600 Pro is not supported.

The Radeon 2XXX aren't capable of Double Precision calculations!!! :rolleyes:

DO WE NEED TO SET UP A BLOOMIN RECORDING!? :confused:

Can't wait to learn OpenCL though, but I need an Intel Mac first. :apple:

AussieDSW
Sep 1, 2009, 06:15 AM
This is very cool and interesting. On battery power my late 2008 15" MBP's 9400M beats up on the 9600M. But once plugged in the 9600M trounces the 94, without regard or regret.

Check it:
Battery
OpenCL Device # 0 = GeForce 9600M GT
time used: 13.622 seconds

OpenCL Device # 1 = GeForce 9400M
time used: 9.022 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
Device 2 is an: CPU with max. 2800 MHz and 2 units/cores
time used: 13.102 seconds

Plugged in

OpenCL Device # 0 = GeForce 9600M GT
time used: 2.788 seconds

OpenCL Device # 1 = GeForce 9400M
time used: 9.028 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz
time used: 13.183 seconds

I got very similar results with my Late 2008 Macbook Pro!

Why does the 9600M GT run at only about 20% of its performance when running on battery?

This seems to mean that it may actually give better performance to use the 9400M over the 9600M GT when running on battery.

Here's my results:

Battery
Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 13.466 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 8.986 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.484 seconds


Plugged in
Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.804 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.028 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.501 seconds
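
For anyone who wants to check the battery slowdown, the ratios can be worked out directly from the times in these results. A small Python sketch; the helper name `fraction_of_plugged_in_speed` is just made up for the example, and it assumes speed is simply proportional to 1/time for this benchmark.

```python
# Times in seconds, copied from the results posted above.
battery = {"9600M GT": 13.466, "9400M": 8.986}
plugged_in = {"9600M GT": 2.804, "9400M": 9.028}


def fraction_of_plugged_in_speed(name):
    """Speed on battery as a fraction of speed when plugged in (speed ~ 1 / time)."""
    return plugged_in[name] / battery[name]


for name in battery:
    print(f"{name}: {fraction_of_plugged_in_speed(name):.0%} of plugged-in speed on battery")

# How much faster the 9400M is than the 9600M GT while on battery.
print(f"9400M advantage on battery: {battery['9600M GT'] / battery['9400M']:.2f}x")
```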

friede
Sep 1, 2009, 08:51 AM
I got very similar results with my Late 2008 Macbook Pro!

Why does the 9600M GT run at only about 20% of its performance when running on battery?

This seems to mean that it may actually give better performance to use the 9400M over the 9600M GT when running on battery.

Here's my results:

Battery
Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 13.466 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 8.986 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.484 seconds


Plugged in
Number of OpenCL devices found: 3
OpenCL Device # 0 = GeForce 9600M GT
Device 0 is an: GPU with max. 1250 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.804 seconds

OpenCL Device # 1 = GeForce 9400M
Device 1 is an: GPU with max. 1100 MHz and 16 units/cores
Now computing - please be patient....
time used: 9.028 seconds

OpenCL Device # 2 = Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
Device 2 is an: CPU with max. 2400 MHz and 2 units/cores
Now computing - please be patient....
time used: 15.501 seconds


When my MBP 13" is plugged in, my 9400M only needs 3.5 seconds...

VoR
Sep 1, 2009, 09:17 AM
Agreed. This is pretty much a worthless benchmark. There's nothing complex taking place, no hard memory thrashing, no difficult calculations.

It's just taking 2 arrays with 5000 numbers in them and adding them together into a new array of 5000 numbers. Simple atomic add operations of 2 numbers over and over and over. The more cores you have the more you can split the array up (4 cores = each core processes 1250 items, 32 cores = each core processes 156.25 items.) and the faster clock means that each item gets processed faster.

This is NOT AT ALL indicative of a real OpenCL app that will be doing hundreds of thousands of difficult computations, with dependencies between the data and working across huge datasets.

I'd say ignore every result we get out of this app. It's not at all indicative of real-world performance in the least.


I found the phoronix.com benchmark (and/or reviews) a lot more interesting than this thread :)

jdiamond
Oct 20, 2009, 01:37 PM
Apple's own website at one point showed the 8600GT as having double the performance of the 9600M. The difference is the 8600GT is more power hungry and less flexible in its programming. Also, your 8600 is getting such good performance because it is somehow over clocked - most ran at just 900MHz while yours is at 1040. So yes - the 8600GT can smoke most modern laptop GPUs - but it caused an over heating nightmare for Apple.

This is interesting - my 2007 MacBook Pro 17" 2.6GHz (MacBookPro3,1) produces the following results:

Number of OpenCL devices found: 2
OpenCL Device # 0 = GeForce 8600M GT
Device 0 is an: GPU with max. 1040 MHz and 32 units/cores
Now computing - please be patient....
time used: 2.368 seconds

OpenCL Device # 1 = Intel(R) Core(TM)2 Duo CPU T7800 @ 2.60GHz
Device 1 is an: CPU with max. 2600 MHz and 2 units/cores
Now computing - please be patient....
time used: 14.080 seconds

Which seems to be better performance than some of the newer similar-spec models. Perhaps caused by the 32 unit/cores on the 8600 vs the 9400?

[)amien