This is essentially an acknowledgement by nVidia that there's no room for growth in their core business, so they're hoping their name will give them an opportunity to sidestep into a new one. The problem is they're going it alone. As long as CUDA is a proprietary API, nobody is going to adopt it. Neither Dell nor Apple wants to tie themselves and their high-end applications to another single source.
If a standard emerged, I could see this going somewhere in the short term-- until Intel woke up and built their own Cell. In the GPU space, the video cards are largely abstracted through either OpenGL or DirectX. If nVidia or ATI disappeared, as so many have before them, the OS and application vendors wouldn't blink. They'd keep writing to the same interface and new hardware would pick up the load.
Here, Apple (or whoever) would need to mark in their system requirements: "Requires an nVidia CUDA processor". So now you need a specific Intel CPU and a specific nVidia GPU. They won't do it.
Frankly, it's silly to have all that silicon sitting out on a separate card doing nothing unless you're running the absolute latest games with updated drivers. nVidia is realizing that. AMD figured it out a while ago and bought ATI when they had the chance.
The only issue I have is that I think the current GF8 series of cards only do FP32 - 32-bit floating point, aka single precision. That may not be enough for most apps. It would be great for test runs, but I know for scientific calcs single precision isn't sufficient; they need FP64, i.e. double precision.
Are you familiar with char, short, long, long long, float, double, long double? All of those types exist on all processors-- but may be decomposed into smaller sub-operations if the processor doesn't support the width natively. The same would go here, I presume. From nVidia's standpoint, there's no need for anything wider than 32-bit calculations. They could put 64-bit data paths in, but then all those extra bits would be wasting silicon and power for any calculation with less precision. Make the little pieces run fast and the people who need more precision will still see the benefit.
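To make the decomposition idea concrete, here's a minimal sketch in plain C of the classic "double-single" trick: carry a value as a hi/lo pair of floats and do the carries by hand, so you recover most of the extra precision out of nothing but single-precision adds. The dsfloat type and the two_sum/ds_add names are mine, not anything from CUDA, and the error-free trick assumes straight IEEE single-precision arithmetic (SSE math, not x87 extended precision).

```c
#include <stdio.h>

/* A value is carried as an unevaluated sum hi + lo of two floats. */
typedef struct { float hi, lo; } dsfloat;

/* Knuth's two-sum: puts the rounded sum in *s and the rounding error in *e,
   so that a + b == *s + *e exactly (given plain IEEE float arithmetic). */
static void two_sum(float a, float b, float *s, float *e) {
    float sum = a + b;
    float bb  = sum - a;
    *e = (a - (sum - bb)) + (b - bb);
    *s = sum;
}

/* Double-single addition: roughly twice the precision of a single float,
   built only out of single-precision adds and subtracts. */
static dsfloat ds_add(dsfloat x, dsfloat y) {
    float s, e;
    two_sum(x.hi, y.hi, &s, &e);
    e += x.lo + y.lo;

    dsfloat r;
    two_sum(s, e, &r.hi, &r.lo);   /* renormalize so |lo| stays small vs |hi| */
    return r;
}

int main(void) {
    dsfloat one  = { 1.0f, 0.0f };
    dsfloat tiny = { 1e-8f, 0.0f };

    float plain = 1.0f + 1e-8f;        /* the 1e-8 is lost in single precision */
    dsfloat ds  = ds_add(one, tiny);   /* the 1e-8 survives in the lo word */

    printf("plain float  : %.10f\n", (double)plain);
    printf("double-single: %.10f\n", (double)ds.hi + (double)ds.lo);
    return 0;
}
```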
This is nothing like AltiVec.
This is exactly like AltiVec. It's bigger and it's off chip, but it's the same idea: fast, dedicated, vector processing.
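For anyone who never touched AltiVec, "vector processing" just means one operation applied to a whole register of lanes in lock-step. Here's a small sketch using the GCC/Clang vector extension (the v4f type and the numbers are only for illustration): an AltiVec register is four float lanes wide, and a GPU runs the same data-parallel pattern across hundreds of lanes.

```c
#include <stdio.h>

/* GCC/Clang vector extension: a 4-lane float vector, roughly what one
   AltiVec register holds.  The expressions below compile down to SIMD
   instructions that touch all four lanes at once. */
typedef float v4f __attribute__((vector_size(16)));

int main(void) {
    v4f x = { 1.0f, 2.0f, 3.0f, 4.0f };
    v4f y = { 10.0f, 20.0f, 30.0f, 40.0f };
    float a = 0.5f;
    v4f av = { a, a, a, a };           /* broadcast the scalar to every lane */

    /* y = a*x + y across all four lanes in lock-step. */
    v4f r = av * x + y;

    for (int i = 0; i < 4; i++)
        printf("lane %d: %g\n", i, (double)r[i]);
    return 0;
}
```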
It is probably closer to Cell than to a coprocessor. Think of many coprocessors on one die running in parallel, real fast.
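If it helps to picture it, here's a hypothetical little sketch in C using plain pthreads: one big array split across a handful of workers that all run the same per-element kernel on their own slice. Swap the threads for Cell SPEs or GPU multiprocessors and the shape of the program stays the same; the worker count, array size, and toy kernel are all made up for illustration. Build with -pthread.

```c
#include <pthread.h>
#include <stdio.h>

#define N_WORKERS 4
#define N_ELEMS   (1 << 20)

static float  data[N_ELEMS];
static double partial[N_WORKERS];

struct slice { int id, begin, end; };

/* Every worker runs the same kernel on its own slice of the array. */
static void *worker(void *arg) {
    struct slice *s = arg;
    double sum = 0.0;
    for (int i = s->begin; i < s->end; i++) {
        data[i] = data[i] * 2.0f + 1.0f;   /* the per-element "kernel" */
        sum += data[i];
    }
    partial[s->id] = sum;                  /* each worker owns its own slot */
    return NULL;
}

int main(void) {
    pthread_t    tid[N_WORKERS];
    struct slice sl[N_WORKERS];

    for (int i = 0; i < N_ELEMS; i++)
        data[i] = (float)(i % 100);

    int chunk = N_ELEMS / N_WORKERS;
    for (int w = 0; w < N_WORKERS; w++) {
        sl[w].id    = w;
        sl[w].begin = w * chunk;
        sl[w].end   = (w == N_WORKERS - 1) ? N_ELEMS : (w + 1) * chunk;
        pthread_create(&tid[w], NULL, worker, &sl[w]);
    }

    double total = 0.0;
    for (int w = 0; w < N_WORKERS; w++) {
        pthread_join(tid[w], NULL);
        total += partial[w];
    }
    printf("checksum: %.0f\n", total);
    return 0;
}
```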
Careful, I've brought this up in other threads, and people get their panties in a bunch about it... Because Apple went Intel rather than Cell, they think Cell is a dead concept. Cell is exactly where we're all heading. Because of inertia, we may see this wasteful division of labor for a while before the cost of doing it this way becomes too prohibitive, but eventually it'll all get brought onto the motherboard, then into the chipset, then into the main processor.
Mac Pros have 8 cores on them now-- and each core is wasting a huge amount of logic. Each core is handling a single-purpose thread, but carrying all the logic necessary for any type of thread that might be thrown at it. I've got a whole core handling a stream of integers coming off the network, but all the floating point and SSE logic is sitting there idle. Meanwhile I'm compressing a video stream and only have one SSE unit available to that thread.
And through all this my high powered GPU is being taxed with nothing more than a progress bar...