MP 1,1-5,1 cMP 4,1 (5,1) crashing after CPU upgrade

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
My cMP crashes/restarts when the CPU is under heavy stress. I’ve installed two matched sets of X5680s in my machine and both pairs yield the same results. Things appear to work normally until the CPU is under heavy load, like the final 10% of a Geekbench test (although some reboots seem to be random, and some happen immediately after logging in following a crash).

I use Macs Fan Control to monitor fan speeds and temps and even with fans cranked up to max it’s still crashing even though it’s reading a high of 145 F for each CPU and the Northbridge Diode at 113 F.
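Since later replies in the thread work in Celsius, here is a minimal sketch that converts the two Fahrenheit readings quoted above. The dictionary labels are just illustrative names, not fields exported by Macs Fan Control.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Readings quoted in the post; the labels are illustrative only.
readings_f = {"CPU diode (peak)": 145.0, "Northbridge diode": 113.0}

# Round to one decimal place for display.
readings_c = {
    name: round(fahrenheit_to_celsius(temp_f), 1)
    for name, temp_f in readings_f.items()
}

for name, temp_c in readings_c.items():
    print(f"{name}: {temp_c} C")
```

That puts the CPU peak at roughly 63 C and the Northbridge diode at 45 C.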

My current specs:
  • Mac Pro 2009 4,1 flashed to 5,1
  • 2 x X5680 3.33 GHz
  • 32 GB 1066 RAM
  • Nvidia GTX 970 4GB
  • USB 3 card
  • WiFi + BT card
  • 120 GB SSD boot drive
  • 4 TB HD
  • 640 GB HD

Everything works fine when I install the original 2.26 CPUs, so I’m fairly confident it’s an issue with the new CPUs. I’ve gone down a ton of rabbit holes and it seems like it could be that the X5680s draw more power than the original CPUs; some people find that X5675s are more reliable as they draw nearly the same power as the original 2.26s. Does that seem like a likely solution? I’ve sunk quite a bit of money and time into this thing and seem to always be barking up the wrong tree. The other possibilities I’ve heard are replacing the logic board or the power supply, which I’m hoping to avoid.
 

rx78

macrumors member
Nov 10, 2020
68
10
I had a similar problem after I upgraded to dual X5680s. It was great for normal tasks, but under load (e.g., gaming, stress testing) it would crash to a black screen with the fans on max. The fans would have been at 100% with CPU A hitting the 90s!

I had to redo the CPU paste and tighten the heatsinks and it seems to have fixed it. Maybe have a look at the heatsinks and check they are torqued enough.
 

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
Did you check and repaste the NorthBridge heatsink?
The NB rivets are known to break after all these years, resulting in high NB diode Temp, especially after upgrading the CPUs.

Could you post a screenshot of MacsFanControl at idle and at load?
 

amstel78

macrumors 6502
Aug 12, 2018
324
114
X5680s should work fine, as I have 2 X5690s in my 5,1. The TDP for both is 130 W, which is higher than stock. That said, what do the logs report as the previous shutdown cause?
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
How do I tell if the NB rivet has broken down?

Here's a screenshot under stress during a GeekBench 5 run. Note, these are the last readings before the crash.
Screen Shot 2021-02-23 at 9.23.08 AM.png


This screenshot is when the machine is idle, fans running on Automatic settings.
IMG_3807.JPG
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
This is the most recent crash log. It's always showing 9 error-reporting banks, then listing the processors. I've kept record of panic reports during this upgrade and I had the same type of crash logs with the last pair of X5680s that I tried as well.

*** Panic Report ***
Machine-check capabilities: 0x0000000000001c09
family: 6 model: 44 stepping: 2 microcode: 31
signature: 0x206c2
Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
9 error-reporting banks
Processor 12: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 13: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 14: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 15: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 16: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 17: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 18: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 19: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 20: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 21: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 22: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
Processor 23: IA32_MCG_STATUS: 0x0000000000000004
IA32_MC6_STATUS(0x419): 0xfe2000800030014a
IA32_MC6_ADDR(0x41a): 0x00000005ef123940
IA32_MC6_MISC(0x41b): 0x0000000000031814
IA32_MC8_STATUS(0x421): 0xb20000000004008f
mp_kdp_enter(): 12582911, 1, 24 TIMED-OUT WAITING FOR NMI-ACK, PROCEEDING
panic(cpu 22 caller 0xffffff80073871da): "Machine Check at 0x0000000105b67314, registers:\n" "CR0: 0x0000000080010033, CR2: 0x000000015a2a7000, CR3: 0x0000000737b3b014, CR4: 0x00000000000226e0\n" "RAX: 0x000000000001ffff, RBX: 0x0000000000000002, RCX: 0x00000000000002d8, RDX: 0x00007fce2882bb00\n" "RSP: 0x0000700005a1f880, RBP: 0x0000700005a1f960, RSI: 0x00007fce2890a0a0, RDI: 0x00007fce2882bb00\n" "R8: 0x00000000000002d8, R9: 0x000000010a323271, R10: 0x00007fce2882bb00, R11: 0x00007fce2890a0a0\n" "R12: 0x00007fce2882c2b0, R13: 0x00000000000002d8, R14: 0x00000000000002d8, R15: 0x0000000105b68e60\n" "RFL: 0x0000000000000206, RIP: 0x0000000105b67314, CS: 0x000000000000002b, SS: 0x0000000000000023\n" "Error code: 0x0000000000000000\n"@/BuildRoot/Library/Caches/com.apple.xbs/Sources/xnu/xnu-4570.71.82.6/osfmk/i386/trap_native.c:168
Backtrace (CPU 22), Frame : Return Address
0xffffffa3b8a03c00 : 0xffffff800726ae56
0xffffffa3b8a03c50 : 0xffffff8007394434
0xffffffa3b8a03c90 : 0xffffff8007386604
0xffffffa3b8a03d00 : 0xffffff800721ce60
0xffffffa3b8a03d20 : 0xffffff800726a8cc
0xffffffa3b8a03e50 : 0xffffff800726a68c
0xffffffa3b8a03eb0 : 0xffffff80073871da
0xffffffa3b8a03fa0 : 0xffffff800721d6df

BSD process name corresponding to current thread: geekbench_x86_64

Mac OS version:
17G14033

Kernel version:
Darwin Kernel Version 17.7.0: Mon Aug 31 22:11:23 PDT 2020; root:xnu-4570.71.82.6~1/RELEASE_X86_64
Kernel UUID: 92BEC910-BBAA-3192-BB57-39712C7D3342
Kernel slide: 0x0000000007000000
Kernel text base: 0xffffff8007200000
__HIB text base: 0xffffff8007100000
System model name: MacPro5,1 (Mac-F221BEC8)

System uptime in nanoseconds: 711130045577
last loaded kext at 246200992876: com.apple.filesystems.msdosfs 1.10 (addr 0xffffff7f8a754000, size 69632)
last unloaded kext at 307217751888: com.apple.filesystems.msdosfs 1.10 (addr 0xffffff7f8a754000, size 61440)
loaded kexts:
com.nvidia.CUDA 1.1.0
com.nvidia.web.GeForceWeb 10.3.3
com.nvidia.web.NVDAGM100HalWeb 10.3.3
com.nvidia.web.NVDAResmanWeb 10.3.3
com.newer-tech.kext.nwtmem 1.0.0
com.apple.driver.AudioAUUC 1.70
com.apple.driver.AppleTyMCEDriver 1.0.3d2
com.apple.driver.AGPM 110.23.37
com.apple.filesystems.autofs 3.0
com.apple.driver.AppleMikeyHIDDriver 131
com.apple.AGDCPluginDisplayMetrics 3.20.18
com.apple.driver.AppleUpstreamUserClient 3.6.5
com.apple.driver.AppleMCCSControl 1.5.5
com.apple.driver.AppleHV 1
com.apple.driver.AppleHDA 281.52
com.apple.iokit.IOUserEthernet 1.0.1
com.apple.driver.AppleMikeyDriver 281.52
com.apple.iokit.IOBluetoothSerialManager 6.0.7f22
com.apple.driver.pmtelemetry 1
com.apple.Dont_Steal_Mac_OS_X 7.0.0
com.apple.driver.AppleLPC 3.1
com.apple.driver.AppleIntelSlowAdaptiveClocking 4.0.0
com.apple.driver.AppleOSXWatchdog 1
com.apple.driver.ACPI_SMC_PlatformPlugin 1.0.0
com.apple.filesystems.apfs 748.51.0
com.apple.iokit.SCSITaskUserClient 404.30.3
com.apple.filesystems.hfs.kext 407.50.6
com.apple.AppleFSCompression.AppleFSCompressionTypeDataless 1.0.0d1
com.apple.BootCache 40
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0
com.apple.AppleSystemPolicy 1.0
com.apple.driver.AppleFWOHCI 5.5.9
com.apple.driver.Intel82574LEthernet 2.7.2
com.apple.driver.AirPort.Brcm4331 800.21.30
com.apple.driver.AppleAHCIPort 329.50.2
com.apple.driver.AppleHPET 1.8
com.apple.driver.AppleRTC 2.0
com.apple.driver.AppleACPIButtons 6.1
com.apple.driver.AppleSMBIOS 2.1
com.apple.driver.AppleACPIEC 6.1
com.apple.driver.AppleIntelCPUPowerManagementClient 220.50.1
com.apple.driver.AppleAPIC 1.7
com.apple.nke.applicationfirewall 186
com.apple.security.TMSafetyNet 8
com.apple.security.quarantine 3
com.apple.driver.AppleIntelCPUPowerManagement 220.50.1
com.apple.driver.AppleHIDKeyboard 205.1
com.apple.driver.IOBluetoothHIDDriver 6.0.7f22
com.apple.kext.triggers 1.0
com.apple.iokit.IOAVBFamily 683.1
com.apple.plugin.IOgPTPPlugin 680.15
com.apple.iokit.IOEthernetAVBController 1.1.0
com.apple.driver.DspFuncLib 281.52
com.apple.kext.OSvKernDSPLib 526
com.apple.iokit.IOAcceleratorFamily2 378.28
com.apple.driver.AppleSSE 1.0
com.apple.iokit.IOSurface 211.15
com.apple.iokit.IOSerialFamily 11
com.apple.iokit.IONDRVSupport 519.21
com.apple.driver.AppleSMBusController 1.0.18d1
com.apple.driver.AppleHDAController 281.52
com.apple.iokit.IOHDAFamily 281.52
com.apple.iokit.IOAudioFamily 206.5
com.apple.vecLib.kext 1.2.0
com.apple.AppleGPUWrangler 3.20.18
com.apple.AppleGraphicsDeviceControl 3.20.18
com.apple.iokit.IOGraphicsFamily 519.23
com.apple.driver.AppleSMBusPCI 1.0.14d1
com.apple.iokit.IOFireWireIP 2.2.9
com.apple.iokit.IOSlowAdaptiveClockingFamily 1.0.0
com.apple.driver.IOPlatformPluginLegacy 1.0.0
com.apple.driver.IOPlatformPluginFamily 6.0.0d8
com.apple.iokit.IOAHCIBlockStorage 301.40.2
com.apple.iokit.BroadcomBluetoothHostControllerUSBTransport 6.0.7f22
com.apple.iokit.IOBluetoothHostControllerUSBTransport 6.0.7f22
com.apple.iokit.IOBluetoothHostControllerTransport 6.0.7f22
com.apple.iokit.IOBluetoothFamily 6.0.7f22
com.apple.driver.usb.IOUSBHostHIDDevice 1.2
com.apple.driver.usb.networking 5.0.0
com.apple.driver.usb.AppleUSBHostCompositeDevice 1.2
com.apple.driver.usb.AppleUSBHub 1.2
com.apple.iokit.IOSCSIMultimediaCommandsDevice 404.30.3
com.apple.iokit.IOBDStorageFamily 1.8
com.apple.iokit.IODVDStorageFamily 1.8
com.apple.iokit.IOCDStorageFamily 1.8
com.apple.filesystems.hfs.encodings.kext 1
com.apple.iokit.IOAHCISerialATAPI 267.50.1
com.apple.iokit.IOFireWireFamily 4.7.2
com.apple.iokit.IO80211Family 1200.12.2
com.apple.driver.corecapture 1.0.4
com.apple.iokit.IOAHCIFamily 288
com.apple.driver.usb.AppleUSBEHCIPCI 1.2
com.apple.driver.usb.AppleUSBUHCIPCI 1.2
com.apple.driver.usb.AppleUSBUHCI 1.2
com.apple.driver.usb.AppleUSBEHCI 1.2
com.apple.driver.usb.AppleUSBXHCIPCI 1.2
com.apple.driver.usb.AppleUSBXHCI 1.2
com.apple.driver.usb.AppleUSBHostPacketFilter 1.0
com.apple.iokit.IOUSBFamily 900.4.1
com.apple.driver.AppleUSBHostMergeProperties 1.2
com.apple.driver.AppleEFINVRAM 2.1
com.apple.driver.AppleEFIRuntime 2.1
com.apple.iokit.IOSMBusFamily 1.1
com.apple.iokit.IOHIDFamily 2.0.0
com.apple.security.sandbox 300.0
com.apple.kext.AppleMatch 1.0.0d1
com.apple.driver.DiskImages 480.60.3
com.apple.driver.AppleFDEKeyStore 28.30
com.apple.driver.AppleEffaceableStorage 1.0
com.apple.driver.AppleKeyStore 2
com.apple.driver.AppleUSBTDM 439.70.3
com.apple.driver.AppleMobileFileIntegrity 1.0.5
com.apple.iokit.IOUSBMassStorageDriver 140.70.2
com.apple.iokit.IOSCSIBlockCommandsDevice 404.30.3
com.apple.iokit.IOSCSIArchitectureModelFamily 404.30.3
com.apple.iokit.IOStorageFamily 2.1
com.apple.driver.AppleCredentialManager 1.0
com.apple.driver.KernelRelayHost 1
com.apple.iokit.IOUSBHostFamily 1.2
com.apple.driver.usb.AppleUSBCommon 1.0
com.apple.driver.AppleBusPowerController 1.0
com.apple.driver.AppleSEPManager 1.0.1
com.apple.driver.IOSlaveProcessor 1
com.apple.iokit.IOReportFamily 31
com.apple.iokit.IOTimeSyncFamily 680.15
com.apple.iokit.IONetworkingFamily 3.4
com.apple.driver.AppleACPIPlatform 6.1
com.apple.driver.AppleSMC 3.1.9
com.apple.iokit.IOPCIFamily 2.9
com.apple.iokit.IOACPIFamily 1.4
com.apple.kec.pthread 1
com.apple.kec.Libm 1
com.apple.kec.corecrypto 1.0

EOF
Model: MacPro5,1, BootROM 144.0.0.0.0, 12 processors, 6-Core Intel Xeon, 3.33 GHz, 32 GB, SMC 1.39f5
Graphics: NVIDIA GeForce GTX 970, NVIDIA GeForce GTX 970, PCIe
Memory Module: DIMM 1, 8 GB, DDR3 ECC, 1333 MHz, 0x80CE, 0x4D33393342314B37304448302D5948392020
Memory Module: DIMM 2, 8 GB, DDR3 ECC, 1333 MHz, 0x80CE, 0x4D33393342314B37304448302D5948392020
Memory Module: DIMM 5, 8 GB, DDR3 ECC, 1333 MHz, 0x80CE, 0x4D33393342314B37304448302D5948392020
Memory Module: DIMM 6, 8 GB, DDR3 ECC, 1333 MHz, 0x80CE, 0x4D33393342314B37304448302D5948392020
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0x8E), Broadcom BCM43xx 1.0 (5.106.98.102.30)
Bluetooth: Version 6.0.7f22, 3 services, 27 devices, 1 incoming serial ports
Network Service: Wi-Fi, AirPort, en2
PCI Card: NVIDIA GeForce GTX 970, Display Controller, Slot-1
PCI Card: NVIDIA GeForce GTX 970, NVDA,Parent, Slot-1
PCI Card: PXS2, USB eXtensible Host Controller, Slot-2
Serial ATA Device: HL-DT-ST DVD-RW GH41N
Serial ATA Device: OWC Mercury EXTREME Pro 6G SSD, 120.03 GB
Serial ATA Device: TOSHIBA MD04ACA400, 4 TB
Serial ATA Device: WDC WD6400AAKS-41H2B0, 640.14 GB
USB Device: USB Bus
USB Device: USB Optical Mouse
USB Device: USB Bus
USB Device: USB Bus
USB Device: Hub in Apple Pro Keyboard
USB Device: Apple Pro Keyboard
USB Device: USB Bus
USB Device: BRCM2046 Hub
USB Device: Bluetooth USB Host Controller
USB Device: USB Bus
USB Device: USB Bus
USB Device: USB 2.0 Bus
USB Device: USB 2.0 Bus
USB Device: USB 3.0 Bus
FireWire Device: built-in_hub, Up to 800 Mb/sec
Thunderbolt Bus:
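
The two status values that repeat for every processor in the panic report can be unpacked into their architectural fields. This is a rough sketch, assuming the IA32_MCi_STATUS layout from Intel's machine-check architecture documentation (flag bits 63 to 57, MCA error code in bits 15:0, model-specific code in bits 31:16). The function name `decode_mc_status` is made up for illustration, and mapping the error codes to a specific component still requires the Intel manual.

```python
# Architectural flag bits of IA32_MCi_STATUS (assumed from Intel's MCA documentation).
FLAG_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # error reporting was enabled for this bank
    "MISCV": 59,  # the matching MCi_MISC register is valid
    "ADDRV": 58,  # the matching MCi_ADDR register is valid
    "PCC": 57,    # processor context may be corrupt
}


def decode_mc_status(value: int) -> dict:
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return {
        "flags": flags,
        "mca_error_code": value & 0xFFFF,
        "model_specific_code": (value >> 16) & 0xFFFF,
    }


# Values copied from the panic report above.
mc6 = decode_mc_status(0xFE2000800030014A)
mc8 = decode_mc_status(0xB20000000004008F)

for label, decoded in (("MC6", mc6), ("MC8", mc8)):
    set_flags = [name for name, is_set in decoded["flags"].items() if is_set]
    print(label, set_flags, hex(decoded["mca_error_code"]),
          hex(decoded["model_specific_code"]))
```

Both banks report an uncorrected error (UC) with PCC set, which is consistent with the kernel panicking rather than logging a recoverable event.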
 

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
Since this is an international forum, it's common to post temperature readings in Celsius. To be honest, I don't know how to interpret Fahrenheit; maybe you could post another screenshot in Celsius?

If the difference between NB diode and NB heatsink is more than ~12 degrees Celsius, there's poor contact between the heatsink and the die and a repaste with new rivets is strongly recommended.
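
As a quick way to apply that rule of thumb, here is a small sketch. The ~12 C threshold comes from the sentence above; the function name and the sample readings are hypothetical and only there to show the comparison.

```python
def poor_contact(diode_c: float, heatsink_c: float, max_delta_c: float = 12.0) -> bool:
    """Return True when the diode runs more than max_delta_c hotter than its heatsink."""
    return (diode_c - heatsink_c) > max_delta_c


# Hypothetical readings in Celsius, not taken from the screenshots in this thread.
samples = {
    "delta of 12 C": (60.0, 48.0),
    "delta of 20 C": (70.0, 50.0),
}

for label, (diode_c, heatsink_c) in samples.items():
    verdict = "repaste recommended" if poor_contact(diode_c, heatsink_c) else "looks OK"
    print(f"{label}: {verdict}")
```

A delta of exactly 12 C is treated as acceptable here, since the rule says "more than".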
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
Sorry about that, I've edited my post to show Celsius. It looks, though, like the difference tops out at 12 degrees C, whether at idle or with fans at full blast. Is there a way to visually check the status of the NB rivet? I'm not entirely sure where to look.
 

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
The NB sits halfway underneath the CPU A heatsink.
That makes it easy to see when the CPUs are replaced or repasted, but you didn't know that.
Anyway, the NB temps look good to me.

What doesn't look so good is the temperature difference between the CPU heatsinks and the CPU diodes.
Geekbench 5 is not even a stress test and the delta is already 23 degrees (both CPUs).
So the contact is not sufficient right now.

You could very carefully try to turn the heatsink screws a bit more.
If that doesn't improve the temps, you may have to repaste and reseat the heatsinks.
I suppose you installed delidded CPUs (?). If you have to repaste them, make sure all black sealant is removed from the top of the die around the copper center, where the plastic retainer is placed (careful with the capacitors, the small silver blocks on the surface).
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
Yes, the CPUs have been delidded. I will probably try reinstalling tonight when I have more time. I did notice that there was a very small amount of leftover black residue but didn't think that was an issue.

I did try testing the RAM as somebody suggested in another post, and running 2 x 8GB (which is half of the memory I had installed) has allowed me to complete Geekbench 5 in 5 of the 7 runs I tried (2 crashes), with fans maxed out for all but the last run.

Results:
1. Ran GB w/ full fans: Complete
2. Ran GB w/ full fans after 1-2 minutes: Crash & Reboot
3. Ran GB w/ full fans after reboot: Complete
4. Ran GB w/ full fans after 4-5 minutes: Complete
5. Ran GB w/ full fans after 15 minutes: Complete
6. Ran GB w/ full fans after <1 minute: Complete
7. Ran GB w/ Automatic fans after <1 minute: Crash & Reboot

Does additional RAM cause the machine to run hotter? If so, could it be that removing 2/4 sticks lowers the temp just enough with fans maxed out to not trigger the crash?

I've also seen it mentioned that X5680 CPUs require 1333 ram, while I've got 1066 installed. I don't know about that though because if the RAM were incompatible, it probably wouldn't allow me to get that far in the stress test?
 

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
The MP 4,1/5,1 is a very robust machine, able to handle a lot of things that were not even on the market when it was designed back in 2009.

TBH I've never come across posts about heat issues caused by too much RAM.
The dual CPU version can, for example, run 128 GB (8 x 16 GB) without hiccups or heat issues.

Even if reducing the RAM makes the MP finish some benchmarks without shutting down, I don't think this solves the problem.
Unless you can live with 2 x 8 GB RAM installed .... ;)

When you replace the OEM Nehalem CPUs with Westmere CPUs, the RAM can run faster (1333 MHz), provided the installed RAM sticks are rated for 1333 MHz.
The Westmere CPUs also work perfectly fine with 1066 MHz RAM installed; 1333 MHz is an optional bonus.

One other thing, not related to heat: it's common to install the RAM in triple channel, meaning 3 sticks for a single CPU and 6 sticks for dual CPUs.
This is the fastest RAM setup.
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
Thanks for the RAM info, and thank you SO much for the quick replies, I really really appreciate it. I've put so much sweat equity and money into this thing that it would kill me to pull the plug on it and shell out for a new computer.

I re-installed all 4 sticks and initially it's even more unreliable than before. It has now crashed while launching GB5 and then again while scrolling through the crash report. I then uninstalled and reinstalled all RAM and we seem to be back to "square one". I ran GB5 with all RAM reinstalled and as expected, it crashed.

My next move will be to remove that residue from the tops of the processors. If that does not work, I think it's safe to say it's a heat-related issue. I've checked and tightened the screws as much as I can without damaging them, but to no avail. What do you think my next step should be after that? It just seems strange to me that my original CPUs work, but two separate pairs of X5680s from different sources produce the exact same problem. Do you think trying X5675s could help with the heat issue since they have similar power requirements to the originals?

I also noticed something on the CPU tray that I'm not sure is normal or not. Wondering if you can tell me what you think. Is the circled area in this image something to be concerned about?

IMG_3811.JPG
 

tsialex

macrumors G3
Jun 13, 2016
8,767
9,355
Did you inspect the push pins for the northbridge heatsink? Remove the PCB from the tray and look at the bottom; check for any cracks in the nylon heads of the push pins.

You can unbend any bent heatsink fins while you're at it.
 

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
As for the NB heatsink fins: no harm done, you can easily bend them straight.

As suggested by @tsialex (and by me in post #3), it might be a smart move to replace the NB rivets, since you will have to repaste the CPU heatsinks anyway and will have easy access.
Although the delta T (12 C) looks good for now, it might give you some peace of mind in the future.

I'm quite sure the heat issue is not RAM related, but after startup you could take a look at the CPU PCB: there are 8 little LEDs indicating the RAM; if one is lit, that RAM stick is defective.


Comparing different CPUs: I've got X5675s installed in one MP and X5690s in a dual CPU MP. TBH I don't notice any temperature difference between them under normal conditions.
I remember when Mojave came out I installed it on a dual CPU MP and noticed that the CPUs (2.26 GHz Nehalem) were running at ~50 C at idle.
Under medium load these CPUs were at ~65-70 C.
After replacing them with X5690s, I noticed that the CPUs run much cooler at idle and under medium use.
So although an X5690 is more powerful, that doesn't necessarily mean it runs hotter.

As you mentioned (and I understand), you invested money, time and effort to get this magnificent machine upgraded.
I'd take the time to repaste and reseat the CPU heatsinks; it is very well possible and has been done by lots of others.
Note that when you mount the heatsinks, they should stay perfectly level; use a good quality thermal paste (e.g. Arctic Silver, MX-4, TG Kryonaut) in the right amount.

This is a link to the technician guide; there's a chapter about repasting and lots of other practical things:

Hang in there, man!
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
Thanks for the words of encouragement. I’ve ordered the parts for the NB rivet replacement and will report back after making the replacement and reseating the heat sinks. Really hoping that does the trick.
 
  • Like
Reactions: KeesMacPro

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
Update: I reseated the northbridge heatsink, cleaned off the old thermal paste, applied new paste, and replaced the rivets (mine were still intact but I replaced them anyway). I also reseated the heatsinks above the processors and replaced their thermal paste. ...Still no luck. When I first booted up the machine it restarted on its own immediately after logging in. After restarting, I got in and ran GB5 with the fans at full power (if it passed, I was planning to dial the fans back to find the "sweet spot"). It made it to the very end, I'm talking 97% of the test, before it rebooted. When crashing from GB5, it's always at the very end of the test; not sure what that means, if anything.

I'm attaching some pics from the physical process as well as temps when idle and temps right before the crash. Any other thoughts?
 

Attachments

  • IMG_3820.JPG (535.2 KB)
  • IMG_3835.JPG (458.1 KB)
  • IMG_3834.JPG (465.2 KB)
  • IMG_3832.JPG (484.9 KB)
  • IMG_3829.JPG (346.4 KB)
  • IMG_3828.JPG (325.8 KB)
  • IMG_3825.JPG (349.1 KB)
  • IMG_3824.JPG (430.2 KB)
  • Screen Shot 2021-02-25 at 7.59.41 PM.png (2.4 MB)

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
I just ran GB5 on 2 different MP4,1>5,1 machines.
They didn't shut down (it would have surprised me very much if they had).
Look at the fan speeds, just for comparison.

The only thing I can think of is that it looks like you applied too much thermal paste.
It should really be the size of a grain of rice, without spreading it manually (the heatsink will spread it once attached).
Obviously my thoughts are based on the info I gather from the text and pictures...
 

Attachments

  • GB5 Temps Dual X5690.png (482.5 KB)
  • GB5 Temps Single X5675.png (493.6 KB)

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
Can you share with me what your fan settings are?

It's also been suggested to me elsewhere that it could be a power supply issue. I'm going to test by removing one CPU and seeing if I still get the crashes. If everything works fine with each CPU individually, then it looks like I have 3 choices: 1. run with just one CPU, 2. replace the PSU, 3. downgrade to X5675s, which have a lower power draw.

I'm thinking I'd go with the 3rd option, as the seller of the X5680s will accept a return and I can exchange them for X5675s. I'm just wondering if the PSU is starting to fail and I'll run into problems down the road. Any idea if there's a way to check the health of the PSU?
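
For a rough sense of what option 3 would change, here is a minimal sketch that compares Intel's published TDP ratings for the CPUs named in this thread. Two assumptions to flag: the stock 2.26 GHz chip is taken to be the Xeon E5520, which the thread does not state, and TDP is a thermal design rating rather than a measured power draw, so the numbers are only a comparison aid.

```python
# Intel's published TDP ratings in watts. TDP is a thermal design rating,
# not a measured power draw, so this is only a rough comparison.
TDP_WATTS = {
    "E5520": 80,   # assumed to be the stock 2.26 GHz part (not stated in the thread)
    "X5675": 95,
    "X5680": 130,
    "X5690": 130,
}

STOCK = "E5520"


def dual_cpu_tdp(model):
    """Combined TDP of a matched pair of the given CPU model."""
    return 2 * TDP_WATTS[model]


def extra_over_stock(model):
    """Watts of combined TDP that a pair adds over the stock pair."""
    return dual_cpu_tdp(model) - dual_cpu_tdp(STOCK)


if __name__ == "__main__":
    for model in ("X5675", "X5680", "X5690"):
        print(f"2 x {model}: {dual_cpu_tdp(model)} W TDP, "
              f"{extra_over_stock(model):+d} W vs. stock")
```

By these ratings a pair of X5680s adds 100 W of TDP over the stock pair, against 30 W for a pair of X5675s. Whether the PSU is actually the limiting factor is a separate question that the ratings alone cannot answer.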
 

KeesMacPro

macrumors 6502a
Nov 7, 2019
664
187
No problem sharing my fan setup, but TBH I'd look for the cause of this issue first with all fans on auto.
Setting the fans is IMO mainly about finding a balance between noise and temps, depending on the workload and use.
Although I left my fan setup as is while running GB5, the RPMs were all at minimum.

Removing 1 CPU is not recommended: all fans will run at full speed because of fail mode.
Your options: 1) not a viable option for "normal" use; 2) it's impossible to troubleshoot without having all the details about the installed hardware and software, so I wouldn't rule out that the issue is related to something else; 3) I assume an X5675 will not change anything at all.


Perhaps it is the PSU; you could take a look at the Technician Guide I posted (post #14) for troubleshooting.

I'd reduce the MP to a minimum setup (no PCI cards, 1 GPU, only 1 HDD/SSD with a clean installed OS) to start troubleshooting methodically.
 

amstel78

macrumors 6502
Aug 12, 2018
324
114
Have you tested your RAM modules? Try pulling all sticks out save for two (one in each CPU bank) and testing. If it crashes, replace those two sticks with another pair and test again. Odd things can happen with synthetic benchmarks that are memory heavy.
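
The pair-swapping routine above can be sketched as a small test plan. This is only an illustration: the stick labels are hypothetical (the thread does not say how many modules make up the 32 GB), and the crash results have to be filled in by hand after each round.

```python
def pair_rounds(sticks):
    """Split the sticks into consecutive pairs; each pair is one test round
    (one stick per CPU bank, everything else pulled)."""
    if len(sticks) % 2:
        raise ValueError("need an even number of sticks to test in pairs")
    return [(sticks[i], sticks[i + 1]) for i in range(0, len(sticks), 2)]


def suspect_pairs(rounds, crashed):
    """Return the pairs whose round ended in a crash.

    rounds  -- list of pairs from pair_rounds()
    crashed -- one bool per round, True if the machine crashed
    """
    if len(rounds) != len(crashed):
        raise ValueError("need exactly one result per round")
    return [pair for pair, bad in zip(rounds, crashed) if bad]


if __name__ == "__main__":
    # Hypothetical labels for eight modules; adjust to the real count.
    sticks = ["A", "B", "C", "D", "E", "F", "G", "H"]
    rounds = pair_rounds(sticks)
    for number, pair in enumerate(rounds, start=1):
        print(f"round {number}: install {pair[0]} + {pair[1]}, run the benchmark")
```

If only one round crashes, that pair is worth splitting further. If every round crashes, the sticks are probably not the common factor.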

And agreed with @KeesMacPro. Leave fans set to auto. You shouldn't have to blast them to get your system to run stably. I've let Prime95 set to small FFTs go for more than 2 hours with fans set to auto without any issues. Both CPU diode readouts never exceeded thermal throttling thresholds.
 
  • Like
Reactions: KeesMacPro

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
I've got a clean OS install on a separate drive. I'll remove the PCI cards, my Nvidia GPU, and the other drives. I think I've even got the original RAM around here somewhere. I'll do some testing this weekend and report back.

Again, thanks for all your help.
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
I had tested a few days ago but can't hurt to test again. I think I even have the original RAM that I can test.
 
  • Like
Reactions: amstel78

amstel78

macrumors 6502
Aug 12, 2018
324
114
Pull everything except the GPU and 2 sticks of RAM and try testing again. Let us know the results.

Also, just out of curiosity, right after your system crashes, what does the following command in Terminal return: log show --predicate 'eventMessage contains "Previous shutdown cause"' --last 10m? You can change the query window by changing 10m to 1h, for instance (minutes vs. hours).
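
If the crashes keep coming, it may help to tally the codes over several runs. Below is a minimal sketch that pulls the numbers out of saved output from the command above. It assumes the message text has the form "Previous shutdown cause: N"; the sample lines are made up for illustration and are not taken from this thread.

```python
import re
from collections import Counter

# Matches the message text, e.g. "Previous shutdown cause: -128".
CAUSE_RE = re.compile(r"Previous shutdown cause:\s*(-?\d+)")


def shutdown_causes(log_text):
    """Return every shutdown cause code found in the text, in order."""
    return [int(match.group(1)) for match in CAUSE_RE.finditer(log_text)]


def tally(log_text):
    """Count how often each shutdown cause code appears."""
    return Counter(shutdown_causes(log_text))


# Hypothetical sample shaped like `log show` output, for illustration only.
SAMPLE = """\
2021-02-26 19:58:02.000000-0500 0x0 Default 0x0 0 0 kernel: (AppleSMC) Previous shutdown cause: 0
2021-02-26 20:14:37.000000-0500 0x0 Default 0x0 0 0 kernel: (AppleSMC) Previous shutdown cause: -128
"""

if __name__ == "__main__":
    for code, count in sorted(tally(SAMPLE).items()):
        print(f"cause {code}: seen {count} time(s)")
```

To use it on real data, redirect the command's output to a text file and pass the file's contents to tally(). The sketch only counts codes; it does not try to interpret them.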
 

sgentile92

macrumors member
Original poster
Jan 29, 2015
53
5
I've gotten both 0 and -128 in the two times I tried this (separate crashes but triggered from running GB5). Found a resource that helps define what these codes mean. Apparently, 0 indicates power disconnected and -128 is a memory issue? I can try resetting the PRAM and test a few more times and see if I get any different codes. Any reason I wouldn't reliably get the same code, especially if it's a RAM issue? What's your take on the 0 code?
 

amstel78

macrumors 6502
Aug 12, 2018
324
114
The kernel gets mixed up at times as far as shutdown causes go. I was getting a whole bunch of 3's and 0's at one point on my cMP3,1 until I decided to reboot into Snow Leopard. After that, rebooting back into Catalina reset how the kernel identifies a reboot or shutdown, and it reported them properly again. This is likely due to something being updated in NVRAM.

Same applies to my cMP5,1 with Mojave... usually booting into another OS fixes it.

As for -128: that means the OS didn't know how to identify the cause. Internet resources, though, point to RAM:
1614369500924.png
 