
FBDIMM Support On Intel PowerMac/XServe?

MacRumors

macrumors bot
Original poster
Apr 12, 2001
51,584
13,214

According to this Intel workstation marketing pamphlet (pdf), Intel's new workstation platform will support Fully Buffered DIMM system memory technology (FBDIMM).

FBDIMM technology allows for greater memory capacity and higher bandwidth due to the addition of an Advanced Memory Buffer (AMB) on each DIMM board. Throughput is theoretically increased 300% from previous-generation DDR2-400 memory, and memory capacities up to 64 GB are supported.
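A quick back-of-envelope check of those figures (DDR2-400's 3.2 GB/s per channel is the standard number; the 4x multiplier simply restates the pamphlet's 300% claim):

```python
# Back-of-envelope check of the pamphlet's bandwidth claim.
# A DDR2-400 channel moves 400 MT/s across a 64-bit (8-byte) bus.
ddr2_400_gbps = 400e6 * 8 / 1e9   # 3.2 GB/s per channel
# A "300% increase" means roughly 4x the throughput.
fbdimm_gbps = ddr2_400_gbps * 4   # ~12.8 GB/s
print(f"DDR2-400: {ddr2_400_gbps:.1f} GB/s, FBDIMM (claimed): {fbdimm_gbps:.1f} GB/s")
```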

Intel was previously rumored to have been contracted to design the next-generation PowerMac/Mac Pro motherboard.
 

Doctor Q

Administrator
Staff member
Sep 19, 2002
38,449
4,963
Los Angeles
Using serial communication the number of wires needed to connect the chipset to the memory module is lower and also allows the creating of more memory channels, what [sic] increases memory performance.
My intuition was completely backwards. I would have thought that parallel memory controller communication would be faster. No wonder I'm in the software field, not the hardware field!
 

Nermal

Moderator
Staff member
Dec 7, 2002
19,027
1,488
New Zealand
Doctor Q said:
My intuition was completely backwards. I would have thought that parallel memory controller communication would be faster. No wonder I'm in the software field, not the hardware field!

Hehe, remember that serial ATA is a faster interface than parallel ATA :)
 

macshark

macrumors member
Oct 8, 2003
96
0
Doctor Q said:
My intuition was completely backwards. I would have thought that parallel memory controller communication would be faster. No wonder I'm in the software field, not the hardware field!

Your intuition is correct. FBDIMMs are not faster. The current bus-based memory interfaces can only support 2 to 4 DIMMs at high speeds. FBDIMMs simply enable the same amount of data to be carried over a smaller number of wires and thus make it possible to attach more memory (i.e. more DIMMs) to the system.

Using FBDIMMs, the number of pins required on the memory controller per memory channel is lower, so it is also possible to build memory controllers that support more channels (4, 6 or 8) in a reasonably sized package.

So FBDIMMs enable larger memory configurations or more bandwidth. But they actually increase the memory latency since the data has to be serialized on the DIMM and deserialized when it reaches the memory controller.

FBDIMMs are a good fit for server configurations which require a lot of memory or a lot of bandwidth, but not a good fit for applications that require low memory access latency.
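macshark's latency point can be sketched with a toy model. All the nanosecond figures below are made up for illustration (they are not real AMB specs); the point is that each AMB in the daisy chain, plus the serialize/deserialize step, adds delay on top of the raw DRAM access:

```python
# Toy model (illustrative numbers only) of why FBDIMM access latency grows:
# each request passes through the AMB of every DIMM between the controller
# and the target, and data is serialized/deserialized at the ends.
BASE_DRAM_LATENCY_NS = 50   # hypothetical raw DRAM access time
AMB_HOP_NS = 5              # hypothetical per-AMB pass-through delay
SERDES_NS = 10              # hypothetical serialize + deserialize cost

def fbdimm_latency_ns(dimm_position):
    """Latency to reach the DIMM at the given 0-based position in the chain."""
    return BASE_DRAM_LATENCY_NS + SERDES_NS + AMB_HOP_NS * (dimm_position + 1)

for pos in range(4):
    print(f"DIMM {pos}: {fbdimm_latency_ns(pos)} ns")
```

The farther down the chain a DIMM sits, the worse its latency, which is exactly the server-vs-low-latency trade-off described above.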
 

shamino

macrumors 68040
Jan 7, 2004
3,412
223
Purcellville, VA
Doctor Q said:
My intuition was completely backwards. I would have thought that parallel memory controller communication would be faster. No wonder I'm in the software field, not the hardware field!
Yes and no.

While it is true that parallel interfaces can be higher bandwidth (because there are more wires), they also suffer from issues like crosstalk (where one wire's EMI interferes with another's.)

One way around this is to go serial. USB and FireWire did this. Twisted-pair wires with differential signaling (two wires are wound together, one carrying the inverse signal of the other) work well to kill the interference. One pair for transmission and another for receiving. And since modern serial transceivers are incredibly fast, you still get high bandwidth.
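The cancellation can be sketched in a few lines of Python (the noise amplitude is made up; the point is that interference coupling identically into both wires subtracts out at the receiver):

```python
# Sketch of why differential signaling kills common-mode interference:
# both wires pick up (roughly) the same induced noise, and the receiver
# subtracts one wire from the other, cancelling it.
import random

signal = [1.0, -1.0, 1.0, 1.0, -1.0]   # ideal bit levels
noisy_plus, noisy_minus = [], []
for s in signal:
    noise = random.uniform(-0.4, 0.4)  # same EMI couples into both wires
    noisy_plus.append(s + noise)       # wire carrying the signal
    noisy_minus.append(-s + noise)     # wire carrying the inverse

# Receiver: half the difference of the pair recovers the signal, noise-free.
recovered = [(p - m) / 2 for p, m in zip(noisy_plus, noisy_minus)]
print(recovered)
```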

The other way is to use differential signaling and twisted pairs with a parallel bus. This is the approach taken by Ultra-2 and Ultra-3 SCSI busses. This approach ends up doubling the number of wires required, which can make cables more expensive. It also makes connectors very large (unless you use micro-sized connectors, which cost a lot more.) It is also hard to route such huge numbers of wires through a circuit board and keep the entire system reliable.

It's worth noting that DVI video is effectively three serial ports. The TMDS signaling used for each DVI channel is at its core an extremely high-speed twisted-pair serial port. At DVI's maximum pixel-frequency of 165MHz per channel, and 10 transmitted bits per pixel on each channel, this serial port is moving 1.65Gbps! Or in other words, that dual-link DVI port used on a Mac G5 has a maximum bandwidth of almost 10Gbps (spread across 6 channels.) Tom's Hardware Guide has a review of video cards that describes the guts of DVI in great detail.
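The DVI arithmetic in that paragraph checks out; as a quick sketch:

```python
# Checking the DVI numbers: each TMDS channel runs at the pixel clock
# (165 MHz max per link) and transmits 10 bits per pixel.
per_channel_gbps = 165e6 * 10 / 1e9    # 1.65 Gbps per channel
dual_link_gbps = per_channel_gbps * 6  # 6 data channels on dual-link DVI
print(f"{per_channel_gbps} Gbps/channel, {dual_link_gbps:.2f} Gbps dual-link")
```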

macshark said:
Your intuition is correct. FBDIMMs are not faster. The current bus-based memory interfaces can only support 2 to 4 DIMMs at high speeds. FBDIMMs simply enable the same amount of data to be carried over a smaller number of wires and thus make it possible to attach more memory (i.e. more DIMMs) to the system.
All things being equal, parallel is faster than serial, because there are more wires to carry data.

But all things are not equal. Serial interfaces are easier and cheaper to implement at high speeds. In many cases (like FireWire, DVI and PCI Express) this more than makes up for the fact that you don't have as many signal-carrying wires.

Do you actually know what the memory bandwidth of FBDIMMs is, compared to standard DDR2? I didn't see it in the above-linked article. The mere fact of it being serial does not automatically make it slower.

And, of course, if you've got a lot of channels in use at once, you can interleave your memory across them all and get all the same parallel behavior. It wouldn't surprise me if you start seeing systems with 8 sockets, where performance is maximized by installing 8 matched-size modules.
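The interleaving idea, sketched in Python (the channel count and cache-line size are assumptions, not anything Intel has announced):

```python
# Sketch of cache-line interleaving across memory channels: consecutive
# 64-byte lines land on different channels, so a sequential read stream
# keeps every channel busy at once -- the "parallel behavior" described above.
CACHE_LINE = 64
NUM_CHANNELS = 8  # hypothetical 8-socket, 8-channel system

def channel_for(addr):
    return (addr // CACHE_LINE) % NUM_CHANNELS

# Eight consecutive cache lines map to eight different channels:
lines = [channel_for(line * CACHE_LINE) for line in range(8)]
print(lines)
```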
macshark said:
So FBDIMMs enable larger memory configurations or more bandwidth. But they actually increase the memory latency since the data has to be serialized on the DIMM and deserialized when it reaches the memory controller.

FBDIMMs are a good fit for server configurations which require a lot of memory or a lot of bandwidth, but not a good fit for applications that require low memory access latency.
You assume that these DIMMs will be identical to existing memory, except for the addition of the buffer. Although early product might be built this way, I would be surprised if this doesn't change.

I don't think we're seeing just another rehashing of RDRAM here.

octoberdeath said:
so good for servers... not so good for desktops?
It's anybody's guess right now. I won't be so bold as to claim one way or the other until we start seeing motherboards with these chipsets, so we can run benchmarks and compare them against existing memory systems.

From the tech brief I read (linked above, thanks, kainjow), it has the potential to be faster than existing memory. It also has the potential to be slower. It will really depend on how this architecture is realized in actual chipsets and motherboards.

The one thing we can be certain of is that it will allow for more than the usual 2-4 sockets that systems have today, allowing for systems with tremendous amounts of RAM. Servers will definitely benefit from this, even if it ends up slower.

Desktop applications? Probably won't matter much, although cheaper motherboards will always appeal to users who have to buy their own hardware. Gamers will be upset if they end up slower, but I'd be surprised if anybody else notices.
 

ChrisA

macrumors G4
Jan 5, 2006
11,792
573
Redondo Beach, California
This is really a smart design

Doctor Q said:
My intuition was completely backwards. I would have thought that parallel memory controller communication would be faster. No wonder I'm in the software field, not the hardware field!

I could have written the above a couple of years ago too, but then I got involved with some HW design work on a parallel cable. The primary problem with the cable was "bit skew": not all the bits arrive out the other end of the cable at the same time. You have to think of a wire not as a perfect conductor but as a series of resistors, capacitors and inductors. In other words, a passive filter. The wires in a cable can never take an identical path, because two wires can't occupy the same physical space, so the bits never come out the other end at exactly the same time, and the device's speed is limited by the bit-skew time.

With serial you can run the data as fast as you need, into the gigahertz range if need be.

To think of it another way: on a parallel cable you can't put the next word on the input end until you _know_ all the bits have been read out the output end. On a serial cable there is zero chance of the bits getting out of order; one bit can't take a faster path.

But this new buffered RAM does take advantage of a parallel path: what they do is use multiple serial channels. The engineers at Intel earned their money on this one. It appears that each DIMM has its own serial channel, so the memory bandwidth scales as you add more RAM. I hope Apple uses this.
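The bit-skew ceiling can be illustrated with made-up trace delays (every number below is hypothetical): the bus clock can't outrun the spread between the fastest and slowest wire.

```python
# Toy numbers showing how bit skew caps a parallel bus clock: the next
# word can't be launched until the slowest wire's bit has settled.
import random

random.seed(1)
NUM_WIRES = 32
# Hypothetical per-wire propagation delays in nanoseconds -- they differ
# because no two traces can take an identical physical path.
delays_ns = [5.0 + random.uniform(0.0, 1.5) for _ in range(NUM_WIRES)]

skew_ns = max(delays_ns) - min(delays_ns)
max_clock_mhz = 1e3 / skew_ns  # crude bound: one word per skew window
print(f"worst-case skew {skew_ns:.2f} ns -> ~{max_clock_mhz:.0f} MHz ceiling")
```

A single serial wire has no such spread, which is why its clock can climb into the gigahertz range.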
 

Anonymous Freak

macrumors 603
Dec 12, 2002
5,406
821
Cascadia
shamino, that's one big, well-written technical explanation. I do have a couple of 'suggestions' and 'clarifications', though. (None of your information is technically inaccurate; I just have ways to phrase things so they may be more understandable to the layman. I mean, I was a motherboard 'engineering technician', and some of your phrasing is hard for ME to understand!)

shamino said:
But all things are not equal. Serial interfaces are easier and cheaper to implement at high speeds. In many cases (like FireWire, DVI and PCI Express) this more than makes up for the fact that you don't have as many signal-carrying wires.

The best example I can think of is PCI-Express vs. AGP. PCI Express is, at its core, a pair of 1-bit-wide data channels, each carrying 2.5 Gbps raw (about 250 MB/s of data after encoding) in each direction. AGP 8x is 32-bit, 66 MHz, with eight transfers per clock. That gives them bandwidths of 250 MB/s and 2.13 GB/s, respectively. Yes, AGP blows away the newer/fancier PCI Express. Why is this? Because PCI Express is serial and AGP is parallel. But we can add more 'lanes' to PCI Express; the standard defines up to a x32 connector! (I've never seen one even in prototype form, I think we're pretty much going to stick with x16 for now.) So we compare a x16 PCI Express slot to AGP 8x. Instead of a measly 250 MB/s, we bump up to 4 GB/s. Almost double AGP 8x's 2.13 GB/s.
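That lane arithmetic, as a quick sketch (the 250 MB/s per-lane figure is first-generation PCI Express's post-encoding rate):

```python
# One PCI Express lane vs. AGP 8x, and how adding lanes lets the serial
# interface scale past the parallel one.
PCIE_LANE_MB_S = 250                        # ~250 MB/s per lane per direction
pcie_x1_gb_s = PCIE_LANE_MB_S / 1000        # 0.25 GB/s
pcie_x16_gb_s = PCIE_LANE_MB_S * 16 / 1000  # 4.0 GB/s
agp_8x_gb_s = 4 * 66.67e6 * 8 / 1e9         # 4-byte bus, 66.67 MHz, 8 transfers/clock
print(f"x1 {pcie_x1_gb_s} GB/s, x16 {pcie_x16_gb_s} GB/s, AGP 8x {agp_8x_gb_s:.2f} GB/s")
```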

shamino said:
And, of course, if you've got a lot of channels in use at once, you can interleave your memory across them all and get all the same parallel behavior. It wouldn't suprise me if you start seeing systems with 8 sockets, where performance is maximized by installing 8 matched-size modules.

I think it's probably more likely that we'll see 'multi-channel DIMMs', like what RAMBUS did right before it died. (RIMMs needed to be installed in matched pairs, so RAMBUS just came up with a new socket that WAS two channels on the one module.) As a refresher, RAMBUS' big thing was that their memory channels were only 16 bits wide (the same bus width as an individual module,) which meant that if you really wanted, you could have a single chip on a channel. No 8-chip-minimum like SDRAM (or DDR, or DDR-2.) It really was a good idea. Sadly, RAMBUS got greedy. And then stupid, crossing their patron saint (Intel,) and trying to sue other memory makers.

I haven't seen hard facts on FBDIMM, but I imagine it's similar to RAMBUS in theory, so you could, say, have 16 memory channels per module. And if the support chipset was intelligently written, it would use them as separate channels, not just clump them together again. (i.e. 16 16-bit channels instead of 1 256-bit channel.) Again for comparison, Intel's chipsets use 'dual-channel' mode (when equipped with two DIMMs,) which means that it can access both DIMMs, in different locations, simultaneously. The other type is one that just 'bundles' the channels. This is what early AMD DDR systems did, it's what the old 'install in pairs' Macs did, and, yes, the Power Mac G5. This means that these systems do access both DIMMs at the same time, but they treat it as a single 128-bit memory access, so you may get 64-bits worth of data you didn't need, where a dual-channel system would get the two 64-bit chunks you want. It's not a HUGE difference, but it is a difference.

And now that I think further on it, it does seem like they are trying to get the benefit of RAMBUS without the negatives (not just the controlling company, but also doing without 'every slot must be filled' like RAMBUS needed with their "continuity RIMM".)

shamino said:
I don't think we're seeing just another rehashing of RDRAM here.

Almost assuredly not. RAMBUS doesn't own the standard! (I worked at Intel during the RAMBUS fiasco, in the server motherboard department. We scrapped an entire line of motherboards because of the RAMBUS crap.)
 

shamino

macrumors 68040
Jan 7, 2004
3,412
223
Purcellville, VA
ehurtley said:
shamino, that's one big, well-written technical explanation. I do have a couple of 'suggestions' and 'clarifications', though. (None of your information is technically inaccurate; I just have ways to phrase things so they may be more understandable to the layman. I mean, I was a motherboard 'engineering technician', and some of your phrasing is hard for ME to understand!)
Thanks for the clarifications. I am not a hardware engineer. I'm a software developer by trade, and know what I know about hardware from reading articles, friends who do develop hardware, and experience with high-end equipment from companies like Sun. So I'm not surprised that my terminology is not always industry standard.
ehurtley said:
Shamino said:
I don't think we're seeing just another rehashing of RDRAM here.
Almost assuredly not. RAMBUS doesn't own the standard! (I worked at Intel during the RAMBUS fiasco, in the server motherboard department. We scrapped an entire line of motherboards because of the RAMBUS crap.)
Actually, in this case, I was referring to the fact that RDRAM got its peak performance with access patterns that are mostly sequential. With access patterns that are more random, performance dropped off. At least this is what I remember about it.

This made it great for servers, but lousy for gaming/media applications.

The comments I was referring to are claiming that the same will be the case with FBDIMMs. This might be true, but I don't think it's possible to know based solely on the small descriptions we've seen here today.

I will be very interested to see benchmarks that compare it against traditional DDR2. If we're lucky, perhaps we'll be able to find a pair of motherboards that are almost identical, except for the RAM tech, so a fair comparison can be made.
 

Anonymous Freak

macrumors 603
Dec 12, 2002
5,406
821
Cascadia
hehe.. oops, didn't realize you were talking technology-wise...
:eek:

shamino said:
I will be very interested to see benchmarks that compare it against traditional DDR2. If we're lucky, perhaps we'll be able to find a pair of motherboards that are almost identical, except for the RAM tech, so a fair comparison can be made.

Based on Intel's leanings, I'd guess they will release two workstation chipsets that are almost identical, except for the RAM tech. (Server will probably only get FB-DIMM, other than servers running on workstation chipsets, and desktop will likely take longer to switch to FB-DIMM, if at all. Who knows, it might be too ridiculously expensive for 'normal desktop' use, in which case the enthusiasts will just go to workstation boards.)
 

bigandy

macrumors G3
Apr 30, 2004
8,852
0
Murka
I'd like to see more RAM available in the PowerMac.. not because I can afford it, but just so I can lust after it. :cool:
 

ChrisA

macrumors G4
Jan 5, 2006
11,792
573
Redondo Beach, California
Nowadays RAM is really a back end to a cache. We never pull bytes out of RAM; we pull "cache lines" out. So if the RAM is multi-channel (either one channel per DIMM or multiple channels per DIMM) then you could fill multiple cache lines simultaneously. Intel's new CPU chips will all be going multi-core. We have two cores now, but I'm sure 4- and 8-core chips will come. These use a "shared cache", so with 8 cores running you would need _huge_ cache-to-RAM bandwidth with the ability to fill many cache lines simultaneously. With this new system it looks like bandwidth goes up as you add DIMMs, so I can envision configuration guidelines like "best to use one DIMM per CPU core".

For tasks like video editing I think RAM bandwidth is the bottleneck. A video frame can be about 1 MB in size, so even a 1-second clip can't fit in L2 cache. When you "scrub" the video you're hauling loads of data from RAM to cache and then into the video RAM. Simply playing video is easy, but editing means jumping all over the clip and playing it at up to 100X real time.
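A rough sketch of the numbers behind that (the ~1 MB frame size comes from the post; the frame rate is an assumption):

```python
# Rough arithmetic behind the scrubbing example: playing video far faster
# than real time turns into a RAM-bandwidth problem, not a CPU problem.
FRAME_MB = 1       # ~1 MB per frame, as above
FPS = 30           # assumed frame rate
SCRUB_SPEED = 100  # scrubbing at up to 100x real time

bandwidth_mb_s = FRAME_MB * FPS * SCRUB_SPEED
print(f"~{bandwidth_mb_s / 1000:.0f} GB/s of frame data at {SCRUB_SPEED}x")
```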
 

shamino

macrumors 68040
Jan 7, 2004
3,412
223
Purcellville, VA
simie said:
The downside will be the cost of them. They won't come cheap.
Not at first. But if they become popular, the price will quickly come down.

Remember when DDR2 was brand new and was very expensive?
 

WildPalms

macrumors 6502a
Jan 4, 2006
995
2
Honolulu, HI
ChrisA said:
I could have written the above a couple of years ago too, but then I got involved with some HW design work on a parallel cable. The primary problem with the cable was "bit skew": not all the bits arrive out the other end of the cable at the same time. You have to think of a wire not as a perfect conductor but as a series of resistors, capacitors and inductors. In other words, a passive filter. The wires in a cable can never take an identical path, because two wires can't occupy the same physical space, so the bits never come out the other end at exactly the same time, and the device's speed is limited by the bit-skew time.

With serial you can run the data as fast as you need, into the gigahertz range if need be.

To think of it another way: on a parallel cable you can't put the next word on the input end until you _know_ all the bits have been read out the output end. On a serial cable there is zero chance of the bits getting out of order; one bit can't take a faster path.

But this new buffered RAM does take advantage of a parallel path: what they do is use multiple serial channels. The engineers at Intel earned their money on this one. It appears that each DIMM has its own serial channel, so the memory bandwidth scales as you add more RAM. I hope Apple uses this.

Good work Chris, nice explanation. :)
 

hassiman

macrumors member
Aug 30, 2006
94
1
San Diego
FB-DIMMS gone already?

Picked this up on another forum... can anyone confirm? :confused:

"I've read on several online tech sites that Intel will not be using FB-DIMMS on their new server chipset. Since Intel's consumer chipsets were never planned to go to FB-DIMMS, this means Intel is effectively abandoning FB-DIMMS. AMD has apparently also made a decision not to use FB-DIMMS on their future chips. So, it would appear that the Mac Pro's memory architecture is a dead end. This reminds me of RAMBUS memory for the PC some years back (my current desktop PC, soon to be retired, has RAMBUS memory in it). "
 