32gb of RAM in 2009 Quads!

Discussion in 'Mac Pro' started by kxfrog, Dec 22, 2009.

  1. kxfrog macrumors regular

    Joined:
    Aug 9, 2009
    Location:
    UK
    #1
    OWC are now selling 32GB and 24GB kits for our lowly 2009 quad-core Mac Pros. We can now have as much RAM as an 8-core, albeit using 8GB memory modules. Link here:
    http://eshop.macsales.com/item/Other World Computing/85MP3S8M32GK/

    Also, on OWC's blog they are reporting that the current 8-core Mac Pros are able to use the 8GB modules for a total of 64GB of RAM!
     
  2. ncc1701d macrumors 6502

    Joined:
    Mar 30, 2008
    #2
    2G's for all that goodness :D must ... save.... money....
     
  3. VirtualRain macrumors 603

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
  4. AZREOSpecialist macrumors 68000

    Joined:
    Mar 15, 2009
  5. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #5
    Pricey, but to be expected with 8GB sticks, and RDIMM at that (can't be UDIMM, as it won't work with their own 4GB UDIMM's ;)).

    Yes, but not like the standard non-ECC DDR3 though. There's just not as much made, as it's typically only sold to the enterprise market.
     
  6. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #6
    You can save some money by using two 4 GB RDIMMs and two 8 GB RDIMMs. You will not even lose bandwidth with it.
     
  7. VirtualRain macrumors 603

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #7
    Did you run some tests that show the tri-channel interleaving works in this configuration? Did I miss that?
     
  8. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #8
    I ran tests with 1 GB and 2 GB UDIMMs in my octad. Slots 1, 2, 5, 6 with 2 GB and 3, 4, 7, 8 with 1 GB versus 2 GB in slots 1, 2, 3, 5, 6, 7. I only did Geekbench but it is supposed to be bandwidth sensitive. As the Xeon 3500/5500 are specified for UDIMM and RDIMM I have no doubt that the principle will work for 4/8 RDIMMs as well.

    I'm sure there would be significant bandwidth loss if I could run my memory at 1333 MHz. My W5590s are well capable of it but EFI will not let me. If I had the 10x multiplier the bandwidth would drop when using two slots per channel instead of one. Since Apple has screwed it up anyway I can just as well take advantage of the lower price of the 1 GB UDIMMs or the 4 GB RDIMMs. I'm now running this mixed mode.
     
  9. VirtualRain macrumors 603

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #9
    Did you post those results? I'd be interested to compare.

    I must admit that I think Geekbench is terrible at determining memory bandwidth. Here's a thread where I tried to benchmark my tri-channel setup with it and it reported a stream copy of 5GB/s :eek: compared to Sisoft (18GB/s) and Everest (14GB/s) under Windows...

    http://forums.macrumors.com/showthread.php?t=729368

    Theoretical memory bandwidth with tri-channel 1066 is 25GB/s so Geekbench doesn't come remotely close to saturating our memory architecture.
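
    The 25GB/s figure follows from the usual peak-bandwidth arithmetic: transfers per second, times 8 bytes per transfer (a 64-bit channel), times the number of channels. A minimal sketch in Python; the function name is made up for illustration, and the result is a theoretical ceiling rather than something any benchmark will reach:

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    Assumes a 64-bit (8-byte) data path per DDR3 channel.
    """
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9

# Single, dual and tri-channel at DDR3-1066.
for channels in (1, 2, 3):
    gb_s = peak_bandwidth_gb_s(1066, channels)
    print(f"{channels} channel(s) at DDR3-1066: {gb_s:.1f} GB/s")
```

    Passing 1333 instead of 1066 gives the ceiling for the faster memory speed mentioned earlier in the thread.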
     
  10. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #10
    If I find the time I can run the test again with other benchmarks. It just seems pretty obvious that in the MacPro4,1 there will be little difference if you fit half the memory into each of the two slots 3 and 4 that connect to the memory channel 3. Why should it be slower than having the whole capacity in the slot 3?
     
  11. VirtualRain macrumors 603

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #11
    It would be cool to get to the bottom of this. I don't think dual vs. triple channel makes much difference at all in real-world performance... it would only be measurable in benchmarks, and then only a few which can saturate this kind of architecture. However, it would be nice to determine how Intel's memory controller handles this situation. While your assumption makes a lot of sense, it's also possible that it simply defaults to dual channel mode no matter what the actual memory configuration is when all four DIMM slots are occupied.
     
  12. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #12
    For most current software, it won't matter, as there's very little that can actually use enough bandwidth to need triple channel. But there is some. Most is server based (and there's not a massive amount here either), but for workstation use, it's in areas such as large scale simulations (medical, weather,...).

    As far as filling the 4th DIMM, it's my understanding the IMC does default to dual channel mode. It would be interesting to find out if that's different though. :) I just don't have a 4th DIMM to test it myself right now. :eek:
     
  13. AZREOSpecialist macrumors 68000

    Joined:
    Mar 15, 2009
    #13
    4 GB DIMMs were quite pricey when the 2009 Mac Pro was announced. Weren't 4 of them over $1,000 initially? By July and August, that price had fallen to around $600 for four modules. I think the same will happen here.
     
  14. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #14
    To some extent, yes. But there's less demand for the largest sticks. As the 8GB versions of RDIMM are currently the largest capacity, they aren't likely to fall quite as much until the 16GB sticks arrive (announced some time ago, but not shipping yet AFAIK). Those may not show until the Xeon 56xx based servers are available.
     
  15. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #15
    No, this is neither logical nor sensible. If you fill the slots 3 and 4 with the same DIMMs as the slots 1 and 2 you obviously unbalance the third channel and may cause the IMC to default to dual channel memory. I would agree with that.

    If you fit only half the capacity into slots 3 and 4 you end up with exactly the same memory capacity for each channel. This is the same as leaving the slot 4 empty and fitting the same DIMMs to the slots 1-3. If there is any penalty at all it should be absolutely minimal compared to an unbalanced mode.
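
    The balance argument can be checked with a few lines of Python. This is only a sketch: the slot-to-channel map follows the layout described in this thread for one processor (slots 1 and 2 on their own channels, slots 3 and 4 sharing the third), and the function names are invented for illustration. It only adds up capacity and says nothing about what the IMC actually does with a balanced layout.

```python
# Slot -> channel map for one processor, as described in this thread:
# slots 1 and 2 each have their own channel, slots 3 and 4 share channel 3.
SLOT_TO_CHANNEL = {1: 1, 2: 2, 3: 3, 4: 3}

def capacity_per_channel(dimms_gb):
    """Sum the installed capacity (GB) on each channel.

    dimms_gb maps slot number -> DIMM size in GB; empty slots are omitted.
    """
    totals = {1: 0, 2: 0, 3: 0}
    for slot, size in dimms_gb.items():
        totals[SLOT_TO_CHANNEL[slot]] += size
    return totals

def is_balanced(dimms_gb):
    """True when every channel carries the same capacity."""
    return len(set(capacity_per_channel(dimms_gb).values())) == 1

three_matched = {1: 8, 2: 8, 3: 8}        # slot 4 left empty
mixed = {1: 8, 2: 8, 3: 4, 4: 4}          # two half-size DIMMs share channel 3
four_matched = {1: 8, 2: 8, 3: 8, 4: 8}   # channel 3 carries double the capacity

for name, cfg in (("three matched", three_matched),
                  ("mixed", mixed),
                  ("four matched", four_matched)):
    state = "balanced" if is_balanced(cfg) else "unbalanced"
    print(name, capacity_per_channel(cfg), state)
```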

    If you want me to run a test with the UDIMMs you need to tell me which free software you consider to be conclusive. I can run Win7 or OS X apps.
     
  16. Spanky Deluxe macrumors 601

    Joined:
    Mar 17, 2005
    Location:
    London, UK
    #16
    Is it really worth it??

    Mac Pro Quad 2.66Ghz $2499
    Upgrade to 32GB of RAM $1979.99
    Total: $4478.99

    Mac Pro Octo 2.26Ghz $3299
    Upgrade to 32GB of RAM $1199.99
    Total: $4498.99
     
  17. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #17
    I believe what happens is the interleaving is engaged for all three channels, even if only one has both slots filled (it can't selectively interleave just certain channels that have the additional DIMM/s, as some boards have more than a pair of slots per channel).
     
  18. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #18
    Let us restrict the discussion to Mac Pros. They all have one slot each on channels 1 and 2 and two slots on channel 3.

    As I have previously pointed out there is also no multiplier penalty for using two slots per channel because Apple has already castrated the high performance IMCs to 1066 MHz.
     
  19. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #19
    My thinking is that with the second DIMM in channel 3, interleaving is activated on all of the channels (even though there's no second DIMM on the channels for slots 1 & 2).
     
  20. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #20
    I wonder why there should be interleaving at all. The memory controller is addressing exactly the same amount of memory cells with the same interface. The I/O process should be the same without an additional serialization.

    Are you sure your concept of interleaving is actually happening in reality? I do not know enough about the architecture and the protocol of the memory channel to say it interleaves or not.

    Let us say it does, then there may still be enough slack in the protocol to let the total process run in the same time frame. Let us assume we have an HP or a Sun workstation with dual slots per channel. It would not have any bandwidth reduction if all six slots are filled with 1066 MHz memory compared to three slots filled with 1066 MHz memory. Only when it uses 1333 MHz memory the bandwidth would be reduced for six slot use versus three slot use because the controller would step the frequency down to 1066 MHz. At least that is what I read in the Intel literature about the 5500 IMC.
     
  21. VirtualRain macrumors 603

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #21
    I agree that what you say could work, and it's also how I would make it work if it was up to me... but it's not consistent with Intel's (albeit somewhat vague) documentation...

    Intel's own X58 desktop single-socket reference motherboard also uses 4 DIMM slots, and here's how they describe the operation...

    http://downloadmirror.intel.com/18128/eng/DX58SO_TechProdSpec.pdf (pg 16)

    They seem to make it clear that Tri-channel mode is only engaged in the unique case of having three identically matched memory modules in each of three memory channels.

    I'm not sure of the layout of the DIMM slots in the Mac Pro, but isn't it Channel A that has two DIMM slots while B and C have only one?

    Finally, I would use Sisoft Sandra or Everest's memory bandwidth tests (in Windows) to determine the real single, dual, and tri-channel memory bandwidth, and then try mixing them like you do and see what performance you get with that combo... it should match one of the known single, dual or tri-channel measurements, thus removing any ambiguity.
     
  22. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #22
    Leave out the words "only" and "unique" and your statement is true. Intel make no reference to the configuration I use.

    No, the first two channels have one DIMM slot and the third channel has two. Read the manual.

    I will do something along those lines in the next days.
     
  23. gugucom macrumors 68020

    Joined:
    May 21, 2009
    Location:
    Munich, Germany
    #23
    Test carried out with Windows7-64 Lavalys Everest home edition Ver. 2.20.405

    Config1: 2GB UDIMMs in slots 1, 2, 5, 6 and 1 GB UDIMMs in slots 3, 4, 7, 8
    Config2: 2GB UDIMMs in slots 1, 2, 3, 5, 6, 7

    write C1: 6526, 6552, 6545 MB/s Av. 6541
    write C2: 6534, 6615, 6535 MB/s Av. 6561 delta 0.31%

    read C1: 10031, 10026, 10018 MB/s Av. 10025
    read C2: 10076, 10095, 10065 MB/s Av. 10079 delta 0.54%

    latency C1: 12.7 ns
    latency C2: 12.7 ns

    Let's examine those results. I have run the test three times in both configurations. With six 2 GB UDIMMs writing is 0.31% and reading is 0.54% faster compared to a mixed config of four 2 GB and four 1 GB UDIMMs. Both configs have latencies of 12.7 ns.

    A bandwidth difference of half a percent is absolutely negligible under real world conditions. The mixed mode is much better for upgrades because it lets you buy just 2 DIMMs of each kind for successive upgrades. In the case of the RDIMMs I do not expect to see different results for the comparison of respective configurations. With RDIMMs it is particularly useful to buy only two of the expensive 8 GB RDIMMs.
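
    The averages and deltas reported in the test above can be rechecked with a short Python sketch. The readings are copied from the post (the read figures are taken to be roughly ten thousand MB/s); the helper names are made up for illustration.

```python
# Everest readings from the test above, in MB/s (three runs per configuration).
write_c1 = [6526, 6552, 6545]
write_c2 = [6534, 6615, 6535]
read_c1 = [10031, 10026, 10018]
read_c2 = [10076, 10095, 10065]

def mean(values):
    """Arithmetic mean of a list of readings."""
    return sum(values) / len(values)

def delta_percent(baseline, other):
    """Percentage by which the mean of `other` exceeds the mean of `baseline`."""
    return (mean(other) - mean(baseline)) / mean(baseline) * 100

print("write:", round(mean(write_c1)), round(mean(write_c2)),
      f"{delta_percent(write_c1, write_c2):.2f}%")
print("read: ", round(mean(read_c1)), round(mean(read_c2)),
      f"{delta_percent(read_c1, read_c2):.2f}%")
```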
     
  24. nanofrog macrumors G4

    Joined:
    May 6, 2008
    #24
    In this case, the interleaving is nothing more than a switch. But when active (more than 1 DIMM in any of the channels), they all have to engage to keep the data flow correct (properly synchronized). Unfortunately, it adds latency, and is why even in triple channel mode, the memory throughput does slow down as additional DIMMs are added (up to 9 = 3 per channel are actually allowable in the IMC).
     
  25. VirtualRain macrumors 603

    Joined:
    Aug 1, 2008
    Location:
    Vancouver, BC
    #25
    This does look encouraging... but just to remove any last bit of skepticism I would still encourage you to run this test with sticks only in 1, 2, 5, 6 just to make sure that dual-channel performance is 33% less than what you are seeing to ensure the test is accurately reflecting tri-channel performance in the first place.
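
    As a rough yardstick for that check, and assuming (as the 33% figure does) that throughput scales linearly with the number of channels, the dual-channel numbers to look for can be estimated from the tri-channel averages in the Everest test above. Real benchmarks rarely scale this cleanly, so treat these as ballpark targets only; the function name is made up for illustration.

```python
def expected_dual_channel(tri_channel_mb_s):
    """Ideal dual-channel throughput if tri-channel scales linearly (2/3 of the value)."""
    return tri_channel_mb_s * 2 / 3

# Averages from the mixed configuration (C1) in the Everest test above, in MB/s.
for label, tri in (("write", 6541), ("read", 10025)):
    dual = expected_dual_channel(tri)
    print(f"{label}: tri-channel {tri} MB/s -> ideal dual-channel about {dual:.0f} MB/s")
```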

    Here's my result from a few months ago with 3x2GB in my quad. Note that my write is about 50% higher than yours but my latency is a LOT higher than yours. :confused:

    [​IMG]
     
