Are my heatsink retainers broken? cMP 2009

Discussion in 'Mac Pro' started by Horselover Fat, Jun 6, 2018.

  1. Horselover Fat macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #1
    Since I upgraded my 2009 MP from 2x5660s to 2x5690s (and freed the machine from dust) I have been paying attention to its temperatures via Macs Fan Control (MFC). I learned that the Northbridge (IOH Diode) is usually the hottest component, and that the retainers of the NB heatsink can sometimes be broken or loose, so that efficient cooling is impossible and the temperature rises to dangerous levels.

    My questions are:

    1. Which NB temperature (idle, without intervention from MFC) suggests that the retainers are broken? Mine is around 75 °C (one report claimed 127 °C before and 60 °C after the repair).

    2. Which NB temperature should not be exceeded?

    Why I am asking: Recently, I have had occasional freezes, maybe one per day. I could move the mouse pointer, but clicking or pressing a key would not have any effect. I don't think even a beachball appeared. Seconds later, all clicks and key presses played out at once. At that time I used MFC and the intake & exhaust fans to keep the NB temp at around 68 °C. Now MFC is set to a constant 900 rpm (intake & exhaust) and I get 63 °C NB temp. No freezes so far, but 900 rpm is a bit too loud for me. The other temperatures should be fine. What do you think?

    3. Could the freezes have a different cause? I did not change anything substantial in my system apart from swapping the CPUs and upgrading to High Sierra (HS).

    Thank you
     

  2. tu2thepoo, Jun 6, 2018
    Last edited: Jun 6, 2018

    tu2thepoo macrumors member

    Joined:
    Nov 14, 2017
    #2
    My NB diode usually reads about that temperature range (60-75C), so I don't think yours is too hot. Whenever I've had a computer with symptoms like you describe (a pause, then all keys/commands applied at once), it was usually something locking up the disk subsystem.

    To identify a software cause, check Activity Monitor for anything that's periodically scanning or saving large files to disk*. If it's hardware, my first guess would be a loose SATA cable or RAM module (test that by unplugging everything and plugging back in). More rarely it could be a failing hard disk** or bad RAM.

    *Look for anything that's eating up more than ~80-90% of a CPU core, like a photo database or video editing program updating the cache. If it happens right before/after a freeze, you've probably found whatever activity is associated with the lockup.

    **Judging from your screenshot, it looks like you have a handful of older WD Blue/Black drives. If they're more than 3-4 years old, I'd suggest running without each one for a day or two to see if the pauses/freezes stop. Do it one at a time to isolate variables.
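The Activity Monitor check described above can also be scripted, which helps when a freeze makes it impossible to switch windows in time. The following is only a minimal sketch: the function names `busy_processes` and `snapshot` are made up for this illustration, the 80% default simply mirrors the rough figure in the post, and the CPU percentage reported by `ps` is averaged differently from what Activity Monitor shows.

```python
import subprocess


def busy_processes(ps_output: str, threshold: float = 80.0):
    """Return (pid, cpu_percent, command) tuples at or above threshold.

    Expects text in the layout produced by `ps -Ao pid,pcpu,comm`:
    one header line followed by one process per line.
    """
    hits = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # the command itself may contain spaces
        if len(parts) < 3:
            continue
        pid, cpu, command = parts
        try:
            # Some locales print a decimal comma (e.g. "85,0").
            cpu_value = float(cpu.replace(",", "."))
        except ValueError:
            continue
        if cpu_value >= threshold:
            hits.append((int(pid), cpu_value, command.strip()))
    # Busiest process first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def snapshot(threshold: float = 80.0):
    """Take one snapshot of the running system via `ps` (macOS or Linux)."""
    result = subprocess.run(
        ["ps", "-Ao", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return busy_processes(result.stdout, threshold)
```

Calling `snapshot()` every few seconds and appending the result to a log file would leave a record to inspect after a freeze, instead of having to catch the culprit live.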
     
  3. h9826790 macrumors G4

    Joined:
    Apr 3, 2014
    Location:
    Hong Kong
    #3
    1) Your NB temperature is normal. Basically, anything below 90C with the fans at idle is normal.

    2) Anything between 55 and 85C is fine once the Mac has warmed up under normal ambient conditions.

    3) Most likely something else, not NB related.
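As a rough illustration, the rule of thumb above can be written as a small check. This is only a sketch: the function name `classify_nb_temp` and its labels are invented here, and the 55/85/90 figures are the forum estimates from this post, not an Apple or Intel specification.

```python
def classify_nb_temp(temp_c: float) -> str:
    """Classify a warmed-up Northbridge (IOH diode) reading in degrees C.

    Thresholds are the rule-of-thumb values quoted in this thread
    (assumption: idle fans, normal ambient temperature).
    """
    if temp_c >= 90:
        return "investigate"  # at or above the ~90C "normal" ceiling
    if 55 <= temp_c <= 85:
        return "typical"      # usual range after warm-up
    return "normal"           # below 90C but outside the usual band


print(classify_nb_temp(75))   # the thread starter's idle reading
print(classify_nb_temp(127))  # the reported broken-retainer case
```

By this yardstick the 75 °C idle reading lands in the usual band, while the 127 °C report quoted in the first post is well past the point where the heatsink retainers deserve a look.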
     
  4. Horselover Fat, Jun 6, 2018
    Last edited: Jun 6, 2018

    Horselover Fat thread starter macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #4
    Thank you for your detailed reply, tu2thepoo. I'd like to be more specific about the temperatures. 60-75 °C seems ok for you. But I have 75 °C idle and without MFC. This is your max. temp at my lowest CPU use. So it might rise at demanding times. I guess the question is, can I be sure that the fans kick in automatically (without MFC, that is) to keep the temperature at a healthy level and just forget about watching MFC and fiddling around with fan speeds?

    Interesting that you mention the drives. I have just tried SMART Utility and it reports 0 errors of any kind for all of them and gives them a PASS. However, some of my drives are indeed not exactly new. The 640 GB one is still the original from when I bought the machine. Could a drive cause the freezes even when it passes the SMART test?

    You mention the RAM. I use lidded CPUs in a dual CPU MP 2009, so I must not fasten the screws too tight. When I put in the 5690s I had to cautiously readjust the heatsink screws several times because some RAM sticks were not recognized. Could it be that some screw is still not in the right position? However, the freezes did not occur until recently, and the CPU swap was about half a year ago.

    I will definitely keep an eye on Activity Monitor, but I'm afraid I will not have the chance to switch to it during a freeze.
    --- Post Merged, Jun 6, 2018 ---
    Thank you, h9826790. I hope to find out that it's a software issue I can solve.
     
  5. mattspace macrumors 65816

    Joined:
    Jun 5, 2013
    Location:
    Australia
    #5
    I've got dual delidded 5675s in my 2009, 3x16 GB RAM, 4 spinners, an SSD in the second optical bay, Ethernet, Wi-Fi and Bluetooth active, and my I/O Hub Tdiode is rock steady at 77 Celsius, regardless of what the machine is doing. It takes a couple of minutes to come up to temperature, but I've never seen it go over.
     

  6. Horselover Fat thread starter macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #6
    Thank you. Your 77 °C and the fact that I had another freeze two hours ago at 63 °C suggest that it might be something else.

    I switched to Activity Monitor and by chance DID see the services photolibraryd and photoanalysisd using up 90 %, but only for a moment, and that was not during the freeze. The freeze wouldn't allow me to switch to Activity Monitor.
     
  7. h9826790 macrumors G4

    Joined:
    Apr 3, 2014
    Location:
    Hong Kong
    #7
    photolibraryd and photoanalysisd analyze your photos while the Photos app is closed, so it's perfectly normal for them to be working in the background (unless there are no photos at all, or you have never even opened the app).

    Since the only thing you changed is basically the CPU, I think we'd better focus on the CPU first (unless you accidentally damaged or loosened something during the upgrade / dust removal).

    I suggest you open Macs Fan Control so that you can monitor all the temperatures.

    Then run a CPU stress test. Make sure that even when all 12 cores / 24 threads are stressed to 100%, the temperatures stay normal and there is no freeze.
     
  8. Horselover Fat thread starter macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #8
    I have just done a stress test for 10 minutes: 24 instances of the yes command in Terminal. MFC had all fan speeds set to automatic. It was scary to watch CPU A core rise to 99.0 °C before the fans kicked in. I guess that's normal? For the rest of the test CPU A core remained the hotspot at 93 °C. MFC image is attached. Afterwards the temperatures and fan speeds came down quickly. No freezes. Ironically, with all the fans blowing during the test, the NB temp was lower than afterwards, when it was back at idle and 75 °C (in automatic mode).
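For anyone who wants to repeat a test like this without opening 24 Terminal windows, here is a minimal sketch that launches several `yes` processes and stops them after a fixed time. The function name `run_yes_stress` is invented for this example, and the sketch assumes the `yes` command is on the PATH, as it is on macOS and most Linux systems.

```python
import shutil
import subprocess
import time


def run_yes_stress(workers: int, seconds: float) -> int:
    """Run `workers` copies of `yes` for `seconds`, then stop them.

    Each `yes` process keeps one logical CPU busy; its output is discarded.
    Returns the number of processes that were started.
    """
    yes_path = shutil.which("yes")
    if yes_path is None:
        raise RuntimeError("the 'yes' command was not found on PATH")

    procs = [
        subprocess.Popen([yes_path], stdout=subprocess.DEVNULL)
        for _ in range(workers)
    ]
    try:
        time.sleep(seconds)
    finally:
        # Always clean up, even if the run is interrupted with Ctrl-C.
        for proc in procs:
            proc.terminate()
        for proc in procs:
            proc.wait()
    return len(procs)
```

A ten-minute run on all logical CPUs would then be `run_yes_stress(os.cpu_count() or 1, 600)` after `import os`; on a dual six-core machine with Hyper-Threading, `os.cpu_count()` should report 24. Keep a temperature monitor open while it runs.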
    --- Post Merged, Jun 6, 2018 ---
    Can we assume it's not the CPUs?
     

  9. h9826790 macrumors G4

    Joined:
    Apr 3, 2014
    Location:
    Hong Kong
    #9
    Ignore that PECI temperature; MFC doesn't get that right.

    You only need to focus on the Diode temperature.

    The native fan profile usually keeps the CPU between 83-85C under full stress, and lets the fans stay at idle until the CPU is around 80C. So your CPU temperatures and SMC look fine.

    If the temperature is fine and the computer doesn't hang under stress, then I don't think it's a CPU issue.

    The easiest way to check is to boot into another OS. Windows is best; if that's not available, just burn a Linux LiveCD and you are good to go. If there is no freeze in the other OS, then it's almost 100% certain that it's a software issue.
     
  10. Horselover Fat thread starter macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #10
    I have a Windows 10 installation to play some games occasionally and will report back whether a freeze occurs there or not. It's too early to guess potential software reasons, isn't it?
     
  11. h9826790 macrumors G4

    Joined:
    Apr 3, 2014
    Location:
    Hong Kong
    #11
    Hard to tell, but for a 2009 dual processor cMP, I really don't want to ask you to try re-seating the lidded CPUs unless absolutely required.
     
  12. tu2thepoo macrumors member

    Joined:
    Nov 14, 2017
    #12
    SMART isn't a 100% reliable test, because each drive manufacturer sets its own thresholds for a pass/fail parameter, and some measure differently than others. For example, a WD drive may "fail" a test that a Seagate drive would mark "pass", or vice versa, and SMART Utility may not recognize the difference. A "PASS" result usually means "this drive won't immediately burst into flames", but you can't infer much more than that.

    If you see the same pauses in Win10, the simplest (but not cheapest!) way to test would be to buy another hard disk (1TB or whatever) and clone your Mac and Windows disks. Clone and swap one drive at a time, and see if the pauses go away.

    If yes, then great! You found the failing drive.
    If no, then great! You have an extra hard drive :)

    As h9826790 said, if you do NOT see the pauses in Windows 10 then it's very likely a software issue instead.
     
  13. adam9c1 macrumors 68000

    Joined:
    May 2, 2012
    Location:
    Chicagoland
    #13
    Take a look at an app called DriveDX.
    It has a 10 or so day trial.
     
  14. Horselover Fat thread starter macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #14
    It's been a while, but I made a discovery today. A few minutes ago there was a long freeze like never before, maybe half a minute or so. The beachball stopped turning. At the end I managed to quickly start Activity Monitor, and for a split second I could see kernel_task at 110 %. This article (German) attributes that to overheating: when the fans cannot cool the components, the kernel steps in and throttles/halts the CPUs. However, the fans didn't kick in. https://www.macwelt.de/a/systembremse-erklaert-wenn-kernel-task-das-system-bremst,3438750
    On top of that, the attached temperatures are nothing to worry about, as I've learned in the past. No freezes in Win 10 so far, but I'm not using it extensively. So I'm still undecided whether it's a hardware or software problem.
     

  15. h9826790, Aug 24, 2018
    Last edited: Oct 24, 2018

    h9826790 macrumors G4

    Joined:
    Apr 3, 2014
    Location:
    Hong Kong
    #15
    Temperatures are good.

    Sounds like a software issue.

    110% kernel_task usually means some core system process is limited by a single CPU core, and you have to wait for it to complete.
     
  16. Horselover Fat thread starter macrumors regular

    Joined:
    Feb 2, 2012
    Location:
    Germany
    #16
    You’re probably right. There have been no more freezes since the update to Mojave a month ago. Who knows what Apple changed, but I consider this resolved.
     
  17. startergo macrumors 6502a

    Joined:
    Sep 20, 2018
    #17
    I brought this from another similar thread. I control the fans based on the diode temperature, and the temp drops down to 60, which is perfect for me. The intake and exhaust run at 1400 and I don't hear the noise (perhaps I am deaf:))
    --- Post Merged, Oct 25, 2018 ---
    I don't think running the fans at a constant speed is a good idea. What happens when the processing demand requires more power? You have already fixed the fan speed.
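To make the difference between a constant speed and a sensor-based setting concrete, here is a minimal sketch of a linear fan curve. It is an illustration only, not how Macs Fan Control or the SMC actually computes speeds; the function name `fan_rpm` and all numbers in the example calls are hypothetical.

```python
def fan_rpm(temp_c: float, start_c: float, max_c: float,
            min_rpm: int, max_rpm: int) -> int:
    """Map a sensor temperature to a fan speed with a linear ramp.

    Below start_c the fan stays at min_rpm, above max_c it runs at max_rpm,
    and in between the speed rises in proportion to the temperature.
    """
    if max_c <= start_c:
        raise ValueError("max_c must be greater than start_c")
    if temp_c <= start_c:
        return min_rpm
    if temp_c >= max_c:
        return max_rpm
    fraction = (temp_c - start_c) / (max_c - start_c)
    return round(min_rpm + fraction * (max_rpm - min_rpm))


# Hypothetical settings: ramp from 800 rpm at 60 C to 2800 rpm at 80 C.
for temp in (55, 70, 95):
    print(temp, fan_rpm(temp, start_c=60, max_c=80, min_rpm=800, max_rpm=2800))
```

With a curve like this the fans stay quiet at idle but still speed up when the load rises, which a fixed setting cannot do.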
     
