Apple to Offer Cluster Rendering?




MacRumors
May 12, 2004, 07:07 AM
Thinksecret reports (http://www.thinksecret.com/news/prorendering.html) that Apple will soon bring distributed, cluster-based rendering to its line of pro video applications.

This distributed rendering would allow users to take advantage of a cluster of networked computers.

Such capabilities appear as if they would be advantageous to both large cluster owners and home users. One example offered is the use of two PowerBooks in the field.



BurntCalc
May 12, 2004, 07:16 AM
About time! This is gonna be huge for editors. Final Cut and all the apps will have a significant edge over the competition. Since this is my business, I'm really excited.

oliverlubin
May 12, 2004, 07:16 AM
how does xgrid come into play with this idea? is it just an off-shoot of that?

173080
May 12, 2004, 07:28 AM
Nice, more reasons to get more Macs. :D

BornAgainMac
May 12, 2004, 07:33 AM
This is an excellent way to sell more Macs and you can have tomorrow's Mac today as far as rendering speeds.

T'hain Esh Kelch
May 12, 2004, 07:35 AM
how does xgrid come into play with this idea? is it just an off-shoot of that?
I'm pretty sure it's based on that technology.

This is REALLY sweet guys!

aussiemac86
May 12, 2004, 07:39 AM
This sounds pretty cool. Does anyone have an idea of what percentage of processing power you can utilise over an Ethernet connection, though?

ClimbingTheLog
May 12, 2004, 07:40 AM
how does xgrid come into play with this idea? is it just an off-shoot of that?

Sure - it's inevitable that all of Apple's CPU-intensive apps will get gridded. Probably 10.4 will have all the hooks necessary for iMovie, et al. to be gridded too.

Of course if you're not on gigabit ethernet some things might not make sense to grid.

Mr. Anderson
May 12, 2004, 07:41 AM
This is fantastic! I wonder if it will just be for Apple's apps or we'll be able to use it for 3D animation as well...:D

D

iGav
May 12, 2004, 07:50 AM
It's about time... Apple really needs this for FCP, as it's totally nailed by Media 100's 844/X; the lack of Qmaster is holding FCP back purely because of the lack of rendering performance.

What I'd like to see is Apple to offer some kind of inbuilt hardware acceleration or an Apple designed Magma style expansion chassis, capable of running multiple processor acceleration cards without the need to buy several PowerMac G5's or Xserves for dedicated editors and motion graphics people.

Then we'd see FCP becoming more popular, and certainly challenging the high end Avid and Media 100 systems.

iGav
May 12, 2004, 07:51 AM
This is fantastic! I wonder if it will just be for Apple's apps or we'll be able to use it for 3D animation as well...:D

D

Qmaster already handles Maya ;)

KC9AIC
May 12, 2004, 07:52 AM
Distributed computing is definitely a good thing for Apple. Can you imagine a couple of people in a school computer lab editing video at blazing rates because they are borrowing processing power from the kid checking his email? It will eliminate wasted power, and mean that you can no longer annoy everyone else in the lab by playing Tetris for hours on the fastest computer they have. :rolleyes:

thatwendigo
May 12, 2004, 08:09 AM
Yet, when xSan debuted and I made a nice, long post about how Apple was going to go big on distributed computing, nobody batted an eye... Lovely. :rolleyes:

This really fits in with the direction that their products have been taking, especially since the release of the xServe and xServe RAID, along with keeping gigabit ethernet standard on pro machines. What you're going to see in design and art houses now is a front end with G5 towers that do the basic modeling and setup, storing the files off on a behind-the-scenes TB drive that's shared to them all. When done, you send your task to the render farm and let it churn it and spit it back onto the company's RAID array. From there, you access it using your FibreChannel card and make any necessary corrections, feeding it back through the render queue if necessary.

My next prediction:
In the next two years, Apple will offer a home-office type device with similar functionality but less speed. It will be a chassis running IPoverFireWire and 10/100/1000 Ethernet (possibly even 802.x wireless), with room for, say, four hard drives on an SATA bus. There will be an embedded controller running a derivative of xSan, and it will cost around $300-500 for the device. The size will be just a bit bigger than LaCie's Bigger Disk, and it will function as a simple network hub so that you don't really need anything other than a cable/DSL/satellite modem.

ccuilla
May 12, 2004, 08:25 AM
This would be pretty nifty for times when my wife is doing some iMovie stuff...and my laptop is just sitting there doing nothing. Off-load some rendering across the network. Too sweet. Let's hope it applies to consumer level apps too.

Mr. Anderson
May 12, 2004, 08:33 AM
Qmaster already handles Maya ;)

I'm on Lightwave and there is a distributed rendering system that comes with it - but it's a pain to set up.

How does Qmaster work?

D

Windowlicker
May 12, 2004, 08:49 AM
what can I say.. this is just great! wonder how it works over an ethernet network. I mean, how much does it affect it compared to say fibre channel?

toontra
May 12, 2004, 08:49 AM
Would it be possible to include pro-audio apps (eg Logic) in this? The idea of having 2+ comps clustered whilst mixing would be great - some of the new 3rd-party virtual instruments and convoluted reverbs are mighty CPU-hungry!

gileschin
May 12, 2004, 08:59 AM
This is amazing! All this technology! Fantastic capabilities! Way ahead! Only Apple can do it!

whooleytoo
May 12, 2004, 09:00 AM
This must be great news for budget-sensitive companies/institutions, especially when it comes to upgrading.

If a lab of media-usage Macs is too slow, instead of having to upgrade the lot, just add a couple more G5s to the grid. Cheaper, and more flexible.

trebblekicked
May 12, 2004, 09:00 AM
good news, provided it's true, and i don't see why it wouldn't be. apple's been throwing a lot at pro vid users this year, and they seem to be thinking a lot about productivity and efficiency. it's nice to see them treating FCP as part of an overall system, not just a standalone piece of software.

GrannySmith_G5
May 12, 2004, 09:06 AM
Yet, when xSan debuted and I made a nice, long post about how Apple was going to go big on distributed computing, nobody batted an eye... Lovely. :rolleyes:

Stop being so modest. Don't worry, you can post your opinion with as much confidence as you wish. No one will think you are an ego maniac.

FlamDrag
May 12, 2004, 09:24 AM
Wow, this would be great if I can get some extended use out of my old Pismo and Sawtooth with my new(ish) powerbook.

This is the first thing in a long time that I want NOW. My imagination runs wild...

thatwendigo
May 12, 2004, 09:28 AM
Would it be possible to include pro-audio apps (eg Logic) in this? The idea of having 2+ comps clustered whilst mixing would be great - some of the new 3rd-party virtual instruments and convoluted reverbs are mighty CPU-hungry!

I'd say it's likely to come to fruition soon, when xGrid is nailed down and the plugin architecture has been finished and is ready. Then, Apple can release plugs for their major pro apps, and you can have the client machines churn away at secondary tasks while you work.

Stop being so modest. Don't worry, you can post your opinion with as much confidence as you wish. No one will think you are an ego maniac.

Hey, I can't help it if I was right about both the Centrino on desktop (which people dismissed) and xSan meaning we'd see even more clustering apps (which people didn't respond to). Now, to see if I can go on with my iMac predictions...

MongoTheGeek
May 12, 2004, 10:34 AM
It's already in XCode, so moving it to other processor-intensive programs shouldn't be too rough.

Earendil
May 12, 2004, 11:09 AM
This is going to be awesome! Already my 1.25PB has to wait through iMovie rendering. Yet, within this house my Dad has a 867PB, and my mom has an 867G4 Tower (which she uses for email! :eek:). My little siblings also have a little rev b iMac running at 233mhz; I could plug that into the rendering network just to cheer the other computers on :D

G4 1.25gh
G4 867
G4 867
-------------
G4 2.9ghz

No, it won't be the same, but let me dream ;)
Oh to have the PC friends over and show them how "blazing fast" my PB is at rendering. I can mention a bit later, on a completely different note, that I can network the rendering tasks :D

Now can we get this sweet technology working on the current line of games? :rolleyes:

Earendil

dizastor
May 12, 2004, 12:01 PM
SWEET!

Must... save... for... multiple... g5s....

seriously though, this ROCKS! I really hope that this gives a significant speed increase... I would love to not spend half of my time waiting for renders while editing my movies.

spankalee
May 12, 2004, 01:11 PM
Would it be possible to include pro-audio apps (eg Logic) in this? The idea of having 2+ comps clustered whilst mixing would be great - some of the new 3rd-party virtual instruments and convoluted reverbs are mighty CPU-hungry!

Audio is a different story because there it's mostly real-time, there isn't much rendering, so latency becomes a huge issue.

There are already dedicated hardware products that offload audio processing like the TC PowerCore and UAD-1. Again, latency is the issue here because effects outside the Mac introduce latency. Correcting this becomes a pain.

As for rendering, Logic introduced a feature called "Freeze Tracks" where the effects are rendered to a temporary file so that you don't have to use processing power to play the track with effects. Freeze happens pretty quickly, so I don't think people are clamoring for distributed freezing. What audio types need is a 3.0GHz G5+ with a 1.5GHz bus and 8GB of RAM.

appleface
May 12, 2004, 01:52 PM
why spend the money on two pb's when you're only going to use the brains and not the keyboard, optical drive, screen, etc. in the second one? that would be a waste--a waste of space, money, hardware. when is the portable cluster node composed of two dual G4s coming out? that's 4 x 1.5 ghz. msrp $4500.

sethypoo
May 12, 2004, 04:08 PM
Neat! Uber-fast rendering.....maybe someday it'll come to iMovie? :p ;) :rolleyes: :)

oingoboingo
May 12, 2004, 05:24 PM
Hey, I can't help it if I was right about both the Centrino on desktop (which people dismissed) and xSan meaning we'd see even more clustering apps (which people didn't respond to). Now, to see if I can go on with my iMac predictions...

The appearance of the XServes (especially the cluster-node version), Xgrid, and things like distributed builds in XCode might have also been obvious giveaways that Apple was heading in this direction. Centrino on the desktop has been discussed on PC forums almost since the day it was announced for notebooks. Congratulations on making some good picks, but these events weren't really 'bolts from the blue'. Also (and I'm not being sarcastic here), if you want people to read and remember the predictions and opinions you're writing, maybe you should set up your own web site to host them. Posting in discussion forums is a bit like sending in a doctoral thesis to be published in the classifieds section of the local newspaper. No-one is going to notice it amongst the 10,000 other posts, no matter how brilliant (or full of crap) it may be.

Now for a *real* challenge, try and predict what Steve will be wearing for the keynote at the WWDC.

PaisanoMan
May 12, 2004, 06:07 PM
When the article says this is coming to pro video applications, I hope they really mean Final Cut Pro. Shake and Compressor both already utilize distributed rendering, and I'm assuming that Motion does as well (but haven't used it).

If I could edit on my PowerBook and then render hi-def output with a small Linux cluster, I'd be in wishful-thinking heaven. :)

qubex
May 13, 2004, 12:39 AM
XGrid as it currently stands is a truly marvelous technology preview product. One can reasonably assume that any distributed computing technology made by Apple will somehow find its roots in the XGrid paradigm. However, XGrid suffers from a serious limitation - as do all distributed computing systems I have seen to date. It is not a flaw easily dismissed, or acknowledged en passant only to brush it off later being somehow solved by Apple's brilliant engineering team. The problem is data, and the bandwidth of the interconnecting network infrastructure.

Those of you who have tried the XGrid technology preview, or more generally are involved in distributed computing projects (such as the erstwhile RSA keysearch, SETI@Home, Folding@Home, etc.) have probably noticed that all of the tasks approached are situations where a relatively small amount of data requires massive amounts of calculations to be performed on it. Thus for a comparatively small "wait" while transferring data, one can distribute small packets of data to independent computers for processing in parallel. Furthermore, parallelising tasks over "slow networks" is only feasible if each packet of data can be processed independently from all others, because otherwise one will incur very serious delays when transacting over the network. What do I mean by "slow networks"? Sadly, anything slower than Infiniband is "slow" for the purposes of high-performance parallel processing. Thus even Gigabit Ethernet and FibreChannel are "slow" for the purposes of massively parallel processing.

What is the point of "farming out" complex video transitions if each computer must wait for the previous one to finish and "hand off" the data? What is the point of bothering to transmit a couple hundred megabytes of HD video begin and end frames for a transition when you can probably compute the transition more quickly on your own box? Both entail waiting, but the latter could potentially entail less waiting than the former, and certainly entails less infrastructure. Anybody who doubts the difficulties of parallelisation need look no further than that supposedly optimised resource hog and bulwark of the Apple design bureau: Photoshop. How many filters are dual-processor aware? More to the point: how many are not dual-processor aware? And that is within the confines of a single machine, with basically zero latency issues. I rest my case.
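As a rough illustration of the trade-off described above, the sketch below compares rendering locally with shipping part of the job over the network first. The function name, the cost model, and all the figures are hypothetical (they are not from the thread): it assumes the work splits evenly, the transfer finishes before any remote work starts, and the returned results are small enough to ignore.

```python
def distributed_seconds(work_seconds, nodes, payload_megabytes, link_megabits_per_s):
    """Estimated wall-clock time when a render is split evenly across `nodes`.

    Assumptions (illustrative only): the originating machine keeps one share,
    ships the remaining shares over a single link before remote work begins,
    and the size of the returned results is ignored.
    """
    remote_megabytes = payload_megabytes * (nodes - 1) / nodes
    transfer_seconds = remote_megabytes * 8 / link_megabits_per_s  # MB -> Mbit
    compute_seconds = work_seconds / nodes
    return transfer_seconds + compute_seconds


if __name__ == "__main__":
    local = 10.0  # hypothetical: the transition takes 10 s on one machine
    for label, link in (("100 Mbit/s", 100), ("1 Gbit/s", 1000)):
        total = distributed_seconds(local, nodes=2, payload_megabytes=200,
                                    link_megabits_per_s=link)
        verdict = "worth it" if total < local else "slower than local"
        print(f"{label}: {total:.1f} s vs {local:.1f} s local -> {verdict}")
```

With these made-up numbers, two machines on a 100 Mbit/s link come out slower than one machine working alone, while the same split over gigabit comes out ahead. A real job would also pay for latency and for returning the results.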

I do not wish to rain on anybody's parade, but I do not see this particular rumour bringing much import to most professional Mac users' lives. Certainly not for audio, where latency is a serious issue. Certainly not for professional video editors using G5s on run-of-the-mill half-duplex 100baseT networks. Neither do I expect this technology to percolate down to consumer-level applications: though you may perceive Apple to be benevolent, remember that at heart Apple - nay, AAPL - is a corporation seeking a profit. They have clearly identified cluster computing as being a significant target - witness the sale of the G5 XServe "processor blades" designed explicitly for distributed computing, and the recent introduction of the XSan network storage solution.

XSan and XGrid, running on RAID XServes and G5 processor-blade XServes respectively, clearly complement each other and form the two prongs of a concerted attack. This much is certain and evident. By comparison, the Big Mac massively parallel cluster was only the beginning: Apple now truly has all the ingredients necessary to become a serious player in the high-performance/high-reliability computing market. Enticing professional video editors is only one aspect of this policy - and, I expect, not a cornerstone. At the rate technology increases performance (yes, even at Apple's slow release rates), the time taken for a given render falls by half every year (if the increase from 2GHz to 3GHz "by summer" is to be believed), more or less in keeping with Moore's Law. Somebody remarked that now it will be possible to have tomorrow's Mac's rendering speed today. It also means that come tomorrow, you won't need that grid anymore. Apple wants you to buy PowerMacs, and it wants you to keep on buying them. Longevity of your investment means lost revenue to them. Simple as that.

I noticed somebody got excited about the prospects of this technology somehow making its way into iMovie. This is almost certainly not the case: iMovie will not support distributed rendering. Even in its fourth incarnation, iMovie is still an inefficient Carbon app that does not even include support for multiple threads running concurrently on dual processor machines. It could not possibly be "upgraded" to run in XGrid-distributed style without a major rework. This is not corporate oversight on Apple's part: it is a sound business strategy, since their analysts are very careful to avoid undercutting their own offerings. Final Cut Pro for the professional market, Final Cut Express for the prosumer market, and iMovie for the lowly consumer. Same goes for all the other Apple software offerings. They may share technology in some select places, but there is clearly an active effort to maintain the highly profitable market segmentation currently extant.

So, overall, an interesting development, but hardly one that "Changes Everything".

xStep
May 13, 2004, 02:07 AM
This is going to be awesome! Already my 1.25PB has to wait through iMovie rendering. Yet, within this house my Dad has a 867PB, and my mom has an 867G4 Tower (which she uses for email! :eek:). My little siblings also have a little rev b iMac running at 233mhz; I could plug that into the rendering network just to cheer the other computers on :D

G4 1.25gh
G4 867
G4 867
-------------
G4 2.9ghz


You forgot the rev B iMac.

That will give you 3.2Ghz of horsepower. You can tell your friends your little brother has an 'old' 3.2GHz iMac. :D

This technology depends on being able to break up the coding. Currently I'd think gaming is doubtful due to the realtime response demanded by the user. Sure you can break up the rendering of the frames, but they'd have to be returned extremely quickly for you not to feel the lag.

qubex
May 13, 2004, 02:46 AM
Videogame frames are typically rendered by the GPU, so videogame performance would certainly not be affected by this (potential) technology.

The inherent slowness of CPUs when compared to GPUs for graphical calculations, coupled with the slow networks, make the proposition simply laughable.

Running physics and AI engines in some kind of clustering environment may be possible, but would you really want to sacrifice ping times for free cycles?

whooleytoo
May 13, 2004, 07:35 AM
What is the point of "farming out" complex video transitions if each computer must wait for the previous one to finish and "hand off" the data?

Surely, that depends on the effect in question. With some transitions, it might equally be possible to use a (say) four node cluster by dividing the screen into quarters, each node rendering its portion independently of the others. Especially for lengthy transitions, this would greatly benefit from clustering.

jeffbistrong
May 13, 2004, 08:34 AM
It's about time... Apple really needs this for FCP, as it's totally nailed by Media 100's 844/X; the lack of Qmaster is holding FCP back purely because of the lack of rendering performance.



What is Qmaster?

X_Entity
May 13, 2004, 08:39 AM
A great thing if they support rendering on other architectures. In any given building there are likely to be more x86 boxen than Macs. Linux-based render farms on x86 boxes are far cheaper to implement and have a proven track record on a number of CG animated films.

qubex
May 13, 2004, 08:57 AM
Surely, that depends on the effect in question. With some transitions, it might equally be possible to use a (say) four node cluster by dividing the screen into quarters, each node rendering its portion independently of the others. Especially for lengthy transitions, this would greatly benefit from clustering.
You are correct - subdividing the frame into sections is an obvious way of approaching the problem. This will work, for example, in a fade-out. However, consider something like a "diffused gaussian blur-out": there is an "information leakage" at the interface between the sections as each pixel's values are averaged with those of its neighbours, resulting in the need to obtain information from other computers. This in turn requires network transactions, incurring latencies. You will find that this is by far the general case. The same goes for the oft-overlooked audio component of video edits: the rendering of echoes etc. will also require internode communication if the audio data is linearly subdivided.

In mathematical terms, internode communication (and resulting slowdowns) is the general case. It is only a highly specific subset of circumstances that do not require such communication.

A great thing if they support rendering on other architectures. In any given building there are likely to be more x86 boxen than Macs. Linux-based render farms on x86 boxes are far cheaper to implement and have a proven track record on a number of CG animated films.
You can't be serious. How would Apple profit from such a move? They are, after all, a hardware company that wishes to sell you hardware. Allowing you to increase the speed of their precious software without having to purchase their hardware makes no commercial sense.

raven13mb
May 13, 2004, 10:07 AM
How exactly would something like this work? I do wedding videos in FCPHD and I have two G5 1.8s in front of me. One is a refurb I just picked up (which has a loud fan--it revs up way more than my other 1.8...which really is whisper quiet all the time-- any suggestions on that?). Feedback on how the cluster would work would be great!

qubex
May 13, 2004, 10:30 AM
How exactly would something like this work? I do wedding videos in FCPHD and I have two G5 1.8s in front of me. One is a refurb I just picked up (which has a loud fan--it revs up way more than my other 1.8...which really is whisper quiet all the time-- any suggestions on that?). Feedback on how the cluster would work would be great!
Basically, it would work something like this:

(1) You initiate a render on your "main" G5 machine. Your FCP-HD session looks at what it has been told to render, say a cross-fade dissolve, and figures it is amenable to parallelisation. It sends out a network query (using Rendezvous) for machines "willing and capable" to help. Your "other" G5 responds.
(2) Since it received one reply, it divides the frames in two, keeps half, and sends half of the beginning and ending frames to your other machine. They then both work on the video concurrently.
(3) At some point, your computer finishes rendering, and stores away the result. Either before or after that, it receives the result from the other machine. (Asynchronicity problem.)
(4) When they have both finished, your main machine recomposes the two half-frame sequences and presents you with the result.

Of course this is a gross oversimplification. In particular, since we're talking of a cross-fade, it wouldn't be a single start- and end-frame, but rather a video clip. But you get the general idea. "Divide and Conquer."
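The steps above can be sketched in a few lines. This is only a toy model under stated assumptions: worker threads stand in for the second machine, frames are plain lists of rows of numbers, and the names `cross_fade`, `split_rows`, and `distributed_cross_fade` are made up for the illustration. Network discovery and data transfer are not modelled at all.

```python
from concurrent.futures import ThreadPoolExecutor


def cross_fade(frame_a, frame_b, t):
    """Blend two frames (lists of rows of pixel values) at mix position t in [0, 1]."""
    return [[(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def split_rows(frame, parts):
    """Cut a frame into `parts` horizontal bands of near-equal height."""
    n = len(frame)
    bounds = [n * i // parts for i in range(parts + 1)]
    return [frame[bounds[i]:bounds[i + 1]] for i in range(parts)]


def distributed_cross_fade(frame_a, frame_b, t, nodes=2):
    """Divide the frames, render each band on its own worker, then recompose."""
    bands_a = split_rows(frame_a, nodes)
    bands_b = split_rows(frame_b, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        # map() returns results in submission order, so recomposition is trivial
        # even if the workers finish at different times.
        rendered = list(pool.map(cross_fade, bands_a, bands_b, [t] * nodes))
    return [row for band in rendered for row in band]


if __name__ == "__main__":
    start = [[0.0, 0.0, 0.0] for _ in range(4)]
    end = [[10.0, 20.0, 30.0] for _ in range(4)]
    whole = cross_fade(start, end, 0.25)
    halves = distributed_cross_fade(start, end, 0.25, nodes=2)
    print(halves == whole)
```

A cross-fade is the easy case because every output pixel depends only on the two input pixels at the same position, so the bands never need to talk to each other.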

"A hundred ants can steal your picnic but if you tie a hundred bees to a brick it won't fly."

whooleytoo
May 13, 2004, 11:55 AM
You are correct - subdividing the frame into sections is an obvious way of approaching the problem. This will work, for example, in a fade-out. However, consider something like a "diffused gaussian blur-out": there is an "information leakage" at the interface between the sections as the pixel's values are averaged with those of its neighbours, resulting in the need to obtain information from other computers. This in turn requires network transactions, incurring latencies.

Again, I think a clever algorithm could still work around most (but obviously not all) of this latency.

Each node could split its dataset into two. First, process the pixels dependent on other screen segments/nodes, and send them to those other nodes. While awaiting a response, process the pixels which are independent of other nodes. When data is received back from the other nodes, merge the two parts by performing the blur along the borders of the two parts of the dataset.

I believe for the example mentioned, each node would only need a limited amount of data (a thin border of pixels) from the other nodes, so bandwidth wouldn't be a problem - though latency still would. I guess the more complex the transition, the less of an issue the latency overhead would be.
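One way to picture the split described above is to classify a node's pixels by whether their blur window reaches across a seam into a neighbouring node's segment. The sketch below does this for columns of a frame divided into vertical strips. The function name `classify_columns` and the idea of a fixed blur `radius` are assumptions made for illustration, not details from the thread.

```python
def classify_columns(start, end, width, radius):
    """Split the columns of the strip [start, end) into (border, interior).

    A column is "border" if a blur window of the given radius centred on it
    reaches into a neighbouring strip. The outer edges of the frame are not
    seams, so columns next to them do not count as border.
    """
    border, interior = [], []
    for x in range(start, end):
        crosses_left_seam = start > 0 and x - radius < start
        crosses_right_seam = end < width and x + radius >= end
        if crosses_left_seam or crosses_right_seam:
            border.append(x)
        else:
            interior.append(x)
    return border, interior


if __name__ == "__main__":
    # Hypothetical 16-column frame shared by two nodes, blur radius 2.
    for strip in ((0, 8), (8, 16)):
        border, interior = classify_columns(*strip, width=16, radius=2)
        print(strip, "border:", border, "interior:", interior)
```

A node could exchange data for its border columns first and work through the interior list while waiting, which hides some of the latency. Only `radius` columns per seam are involved, so the extra traffic is small compared with the strip itself.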

qubex
May 13, 2004, 12:39 PM
Mine was only a rapid example. Of course there do exist algorithms to correct the problem, in the specific instance. Say information "bleeds" by 2 pixels/frame, and the transition is 25 frames (1 sec.): each node would get an extra "frame" of 50 pixels around its allocated work unit so as to provide the necessary data, removing the need for network queries.
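The padding arithmetic above (2 pixels per frame over 25 frames gives a 50-pixel margin) can be checked on a one-dimensional toy signal. This is a minimal sketch with assumed details: a simple box blur with clamped edges stands in for the real effect, and all function names are hypothetical.

```python
def halo_width(bleed_per_step, steps):
    """Margin needed so that no outside data can reach the core region."""
    return bleed_per_step * steps


def blur_once(values, radius):
    """Box blur with edge clamping (a stand-in for the real effect)."""
    n = len(values)
    out = []
    for i in range(n):
        window = [values[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def blur_steps(values, radius, steps):
    """Apply the blur repeatedly, once per frame of the transition."""
    for _ in range(steps):
        values = blur_once(values, radius)
    return values


def blur_tile_with_halo(values, start, end, radius, steps):
    """Process only [start, end), padded with a halo, with no communication."""
    halo = halo_width(radius, steps)
    lo = max(start - halo, 0)
    hi = min(end + halo, len(values))
    padded = blur_steps(values[lo:hi], radius, steps)
    return padded[start - lo:end - lo]  # discard the contaminated halo


if __name__ == "__main__":
    print(halo_width(2, 25))  # the 50-pixel margin from the post
    signal = [float((i * 7) % 11) for i in range(40)]
    full = blur_steps(signal, radius=2, steps=3)
    left = blur_tile_with_halo(signal, 0, 20, radius=2, steps=3)
    right = blur_tile_with_halo(signal, 20, 40, radius=2, steps=3)
    print(left + right == full)
```

Errors introduced at the artificial edge of a padded tile creep inward by `radius` samples per step, so after `steps` passes they have consumed exactly the halo and nothing more. The price is that each node redoes some of its neighbours' work and receives a larger chunk of data up front.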

I was only trying to illustrate the broader problems. But you are perfectly right that in this particular instance, a shortcut exists.

xStep
May 14, 2004, 02:08 AM
Videogame frames are typically rendered by the GPU, so videogame performance would certainly not be affected by this (potential) technology.

The inherent slowness of CPUs when compared to GPUs for graphical calculations, coupled with the slow networks, make the proposition simply laughable.

Running physics and AI engines in some kind of clustering environment may be possible, but would you really want to sacrifice ping times for free cycles?

Check this story out. It was a matter of time... http://arstechnica.com/news/posts/1084398037.html

qubex
May 15, 2004, 11:23 AM
Impressive and scary, I know. But while linking two graphics accelerators on a single high-speed symmetric bus is feasible (notice it uses PCI-Express and not AGP8X, which is asymmetric), doing the same over a relatively slow network, even when running at 1Gbit/sec, is not.

h'biki
May 16, 2004, 08:37 PM
It's about time... Apple really needs this for FCP, as it's totally nailed by Media 100's 844/X; the lack of Qmaster is holding FCP back purely because of the lack of rendering performance.

What I'd like to see is Apple to offer some kind of inbuilt hardware acceleration or an Apple designed Magma style expansion chassis, capable of running multiple processor acceleration cards without the need to buy several PowerMac G5's or Xserves for dedicated editors and motion graphics people.

Then we'd see FCP becoming more popular, and certainly challenging the high end Avid and Media 100 systems.

It's pretty damn popular already and helped kill Media 100.

I don't think a hardware accelerator is going to be cost effective for Apple. Expensive R&D, very limited market, catered to by a number of third parties. A scalable render architecture based purely on your mac boxes will appeal a lot more to post houses I think -- because the benefit of a clustered render farm extends to multiple editors and not just whoever is 'senior' enough to get the accelerated machine.

Lotring
May 28, 2004, 08:19 AM
I'm on Lightwave and there is a distributed rendering system that comes with it - but it's a pain to set up.

How does Qmaster work?

D


Lightwave has a program called "Screamernet" built into it, and I set it up to work with 20 computers in a lab. There are tutorials everywhere online; give it a Google search sometime. The render times are drastically cut, especially when you render scenes with 1000 frames or more in which there are particle emitters and lots of surfaces. Good luck with getting it to work.