Dear massimike,

Your best option would be to call AppleCare and talk to the representative on the phone, explaining your problem. It is definitely something serious, so you are lucky to be covered by AppleCare. Don't be afraid. They will sort everything out for you. Sorry if I don't write Italian very well... I speak it better than I write it (being a Canadian of Italian origin).

(ALL this was originally in ITALIAN, so don't worry ppl lol :D)
 
Your Italian is good, better than my English. I think I need to follow through on this issue, as it is really not good! Thanks mate!
 
Hello thread,

I have also experienced a growing number of KPs on my old Dual 2.3 G5. I ran AHT, Rember, TechTool, and DiskWarrior; all report OK. I swapped out RAM for testing and left out the original 256 MB Apple pair; the 8 GB of OWC RAM remains. I removed all externals and still had panics. On another forum, someone suggested to a fellow KP victim to make sure "processor performance" was set to "highest", not "automatic". It is found under SYSTEM PREF/ENERGY SAVER/OPTIONS.

I set mine on "highest" and have been KP free for a week, a record for this G5. This may be masking the real problem, but so far so good!

Ciao

Do you use your Mac for rendering? I do not. I do not think I am stressing a 2.93 quad-core processor with photo retouching or even streaming from YouTube.
I will try your setting and let you know.
Thanks mate.
 
I'm experiencing the same problems with a Mac Pro Nehalem 2.66 GHz quad-core. I've read on Apple Discussions that the problem might be the processor tray.
I can reproduce the problem by encoding a movie or something similar while doing a memory test with TechTool Pro. TechTool says that the RAM is OK, but when you look at the kernel.log file in Console, you see a whole list of errors. Besides, System Profiler shows "ecc-errors" for the third DIMM (the one closest to the processor, the hottest one).
I can avoid the errors by using SMCfancontrol, but this doesn't look like a very good solution. A strange thing is that the fans always run at the same (slow) speed, even if the computer is running very hot while encoding video. This looks like a firmware bug to me. Unfortunately, resetting the SMC didn't resolve the problem.
I think I'll visit the Apple Premium Reseller tomorrow.
 
Put heat spreaders on your DIMMs and hook up the computer to a UPS to make sure current fluctuations don't hose your thirsty Pro.
 
In 3 panic reports:

Memory Operation: read
Machine-specific error: Read ECC

From those reports:

Uncorrected error
DIMM: 0
Channel: 2

Uncorrected error
DIMM: 1
Channel: 2

Uncorrected error
DIMM: 0
Channel: 2

In one of those reports you have:

COR_ERR_CNT: 31720

Presumably that means "Corrected Error Count".

Google search:

http://www.google.co.uk/search?q="Machine-specific+error:+Read+ECC"+"mac+pro"
 
Put heat spreaders on your DIMMs and hook up the computer to a UPS to make sure current fluctuations don't hose your thirsty Pro.

Thanks for the suggestion, but I'm not going to experiment on my 6-month-old Mac Pro. Such a machine should work fine without modifications. I just went to the Apple Premium Reseller. They told me that I need to get the RAM or the motherboard replaced.


Edit: I think I might have found the problem. I was able to reproduce the errors by running a lot of CPU-intensive processes.
At that moment, I took a look at the temperatures with Hardware Monitor. I saw that the cores of the Nehalem processor reached 78°C (172°F). The maximum temperature of the W3520 processor (as listed on intel.com) is 67.9°C (154°F).
Since the memory controller is integrated in the Nehalem processors, it might be affected by the heat, which can cause memory problems.
It also explains why the problems immediately stop when I use SMCfancontrol.
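
As a quick check on the temperature figures above, here is a minimal Python sketch of the Celsius-to-Fahrenheit conversion. The function and variable names are only illustrative; the two temperatures are the ones quoted in this post.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


# Figures quoted in the post (not measured by this script).
observed_core_c = 78.0  # core temperature read from Hardware Monitor
quoted_limit_c = 67.9   # maximum temperature quoted from intel.com for the W3520

for label, temp_c in (("observed", observed_core_c), ("limit", quoted_limit_c)):
    print(f"{label}: {temp_c:.1f} C = {celsius_to_fahrenheit(temp_c):.1f} F")

# How far the observed reading sits above the quoted limit.
print(f"margin over limit: {observed_core_c - quoted_limit_c:.1f} C")
```

By this arithmetic 78°C is about 172°F and 67.9°C is about 154°F, so the cores were roughly 10°C above the quoted limit.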
 
When I called AppleCare about this, their response was, "Reseat all your RAM."

I asked, "In the user manual, it shows heat spreaders (heat sinks) on all the memory. Do you think it's a good idea if I put heat spreaders at least on DIMM 0, the one reported in my panics?"

Their reply: "Yes."

I put a heat spreader on just DIMM 0, and I have not seen the temp rise over 64°C, even when encoding to AVCHD in Toast 10, which uses all the cores, AND running Second Life in the background with the heat of the GTX 285.

Also, when I reseated my RAM, I checked all the connections inside the machine especially to the optical drives, since I noticed many of us with this problem have multiple optical drives.

Another thing I did was to avoid using any software that might make pre-1.6 Java calls, such as Camino.

I also had the latest seed that came out the day after I reported these panics to Apple so it is possible that had something to do with it, though I cannot be certain since it makes no mention. It could also have just been from zapping PRAM and resetting SMC (when the ram was re-seated).

If a fan's power cable was loose, don't you think that could affect its efficacy? Also, the DIMMs that are closest to the outer door are not going to catch as much draft from the fans. That's why heat spreaders are so effective! Why do you think Apple shows the RAM with heat spreaders in the user manual itself? Y'know?

One last thing: what is the ambient temperature in your room?
 
In one of those reports you have:
Presumably that means "Corrected Error Count".

Google search:

http://www.google.co.uk/search?q="Machine-specific+error:+Read+ECC"+"mac+pro"

The ECC chips on the RAM could also be overheating. I just think the CPUs are much less likely to be the source of this, seeing as how they have giant industrial heat sinks on them while the RAM is sitting there roasting by the fire. I mean it's so cheap to put heat spreaders... Why not at least try it?
 
Today I'll bring the Mac to an Apple Premium Reseller.
I've also thought about overheated DIMMs, but the errors already start at 36°C (97°F). The datasheet of the DIMMs says that they should still work fine at 70°C (158°F). Besides, Hardware Monitor shows the same temperatures for at least 2 of the 3 DIMMs, while only the DIMM in slot 3 causes errors (even when I swap their places).

The CPU indeed has a giant heat sink, but if the fans barely work, it can still overheat, I think, especially when all the cores are being used at 100%.
I also reset the SMC, but it made no difference. All the fans are properly connected, since I can see them work, and I can adjust their speed with SMCfancontrol.

Ambient temperature isn't abnormal. about 20°C when I turn the Mac Pro on, and about 24°C after an hour or two :)

(Besides, I didn't add an optical drive. I use the original configuration, with only 2 extra hard drives of 1TB)
 
Today I went to the Apple reseller. They told me the Mac was fixed, so I asked what the problem was... They had just replaced a RAM module. I had told them before that bank 3 was giving errors, and that moving the modules around didn't solve the problem. So it was clear that the modules are fine...
So I asked if I could test the Mac Pro right away in the shop, which wasn't a problem for them. They even told me, "Look, you're only running the same tests as we did".
After only two minutes, I was able to reproduce the errors. The Mac Pro is now AGAIN in for repair...
 
In my case, reversing the DIMMS changed the error from "Channel 2" to "Channel 0" ... so I am guessing for me, it is likely to be the module (maybe, its ECC chip?).

But one thing is, I am now told there are red LEDs on the daughterboard itself, in the space between the two CPUs, next to each DIMM slot. If one of those lights turns on, it's a problem with the DIMM or that slot. Even if it passes Apple Hardware Test and TechTool Pro, which mine do, the red light indicates a problem not detectable by those, he said, though he could not elaborate past that.

Another question I asked the AppleCare senior tech was: what order do you put different brands of DIMMs in, if you have four of one brand and two of another? I have four that I am guessing are factory RAM (Micron) and two that are from DMS, which is a Better Business Bureau A+ rated company. (The DMS DIMMs are also sporting Micron chips, but the silicon color and wiring look different from the other four modules.)

Currently mine are like this:
(right front banks:)
4-empty
3-Apple
2-Apple
1-DMS
(left rear banks:)
5-DMS
6-Apple
7-Apple
8-empty

As my errors are now in Channel 0, which I *guess* corresponds to Slot 1 and/or 5, then one or both DMS chip(s) seem(s) to be the culprit... but wait. I asked him if the two DMS should be on the same side and he said yes, he reckoned they should both be on the same CPU, like this:

(right front banks:)
4-empty
3-Apple
2-DMS
1-DMS
(left rear banks:)
5-Apple
6-Apple
7-Apple
8-empty

Because if you just have 3 DIMMS it would be:

4-empty
3-Apple
2-DMS
1-DMS
(5-8 side empty)

I haven't had a chance to try it yet, nor to look for the red LEDs. I will check tonight and report back. This is such a weird issue.

Btw: what is the easiest way to cause the panic on yours?
 
I didn't see any LEDs lit next to the DIMM banks. I know there are some on the main board next to the SMC reset button, but they also didn't show any problems.
The problem also wasn't detectable by Apple Hardware Test or TechTool Pro, but if you watch the kernel.log file during that test, you can see the DIMM errors coming in.

I can't easily reproduce the kernel panic, but I can reproduce the ECC errors in the kernel.log file and in System Profiler.
To do that, I play and/or encode a high-definition movie, run one or two benchmark tests at the same time, and do a memory test with TechTool Pro. This is quite CPU-intensive, so the errors start coming after about 2 minutes. The computer doesn't need to have been on for hours in this case.
 
Hmm .. are ALL your DIMMs the same brand/model/size? Or do they vary? If so what positions are you putting them in? What DIMM errors do you get?

Thanks.
 
I'm using the original 3x 1GB from Apple. I have never touched them, and they are placed as they should be (3: empty, 2: Apple DIMM, 1: Apple DIMM, 0: Apple DIMM).
The Apple technician replaced the "defective" DIMM, but it didn't make any difference. So it's clear to me that the problem isn't caused by the DIMMs; the bank itself is defective.
Now they are going to replace the processor tray, but not the processor. They said this was the last option.
So I told them about the built-in memory controller in the Nehalem CPU, and the success stories on the internet about replacing the processor. He said that they hadn't thought about this, and that they should try it. It seems like they don't really know what they're doing; I have to tell them myself what they should do...

I don't know the exact errors in the kernel log (and the computer is now in for repair, so I can't check), but it was something like error on channel 0 dimm 2...
And System profile shows for dimm 3: ECC-errors.
 
Switching my DIMMs around proved it to be just a faulty DIMM, because the Channel # in the panic changed. The Channel # refers to the DIMM slot. Channel 2 is Slots 3-4 and 7-8. Channel 1 is Slots 2 and 6. Channel 0 is Slots 1 and 5. It always reports DIMM 0 unless it was Channel 2 Slot 4 or 8, which I imagine could be DIMM 1 since that channel has two DIMMs per CPU.

The reason Slots 1 & 5 are both DIMM 0 is that it's the processor that is reporting this, and each processor only sees three Channels (0, 1, 2). To isolate whether it's the DIMMs, reverse the order they are in and see if the channel # in the panic changes. You may also have to just swap the DIMMs in Slots 5 and 2, for instance, to see if a Channel 0 error was referring to Slot 1 or Slot 5 (it will change to Channel 1 if the bad DIMM was in Slot 5 and you move it to Slot 2; it will remain Channel 0 if it was in Slot 1).
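
To make the slot numbering above easier to follow, here is a rough Python sketch of that mapping. It assumes the layout exactly as described in this post for an 8-slot machine (Slots 1-4 on one CPU, Slots 5-8 on the other, with only Channel 2 holding a second DIMM); it is not taken from Apple documentation, and the function names are made up for illustration.

```python
from typing import List, Tuple


def slot_to_channel_dimm(slot: int) -> Tuple[int, int]:
    """Map a slot number (1-8) to the (channel, dimm) pair a panic would show,
    following the numbering described in this post (an assumption, not Apple's docs)."""
    if not 1 <= slot <= 8:
        raise ValueError("slot must be between 1 and 8")
    position = (slot - 1) % 4          # position 0-3 within one CPU's bank of four
    channel = min(position, 2)         # positions 2 and 3 both sit on channel 2
    dimm = 1 if position == 3 else 0   # only the fourth slot is the second DIMM
    return channel, dimm


def candidate_slots(channel: int, dimm: int) -> List[int]:
    """Return every slot that a (channel, dimm) pair could refer to.
    The panic does not say which CPU reported it, so there are two candidates."""
    return [s for s in range(1, 9) if slot_to_channel_dimm(s) == (channel, dimm)]


# A "Channel 0, DIMM 0" panic could mean Slot 1 or Slot 5.
print(candidate_slots(0, 0))
# Moving a suspect DIMM from Slot 5 to Slot 2 should change the reported channel.
print(slot_to_channel_dimm(5), "->", slot_to_channel_dimm(2))
```

On a single-CPU machine with four slots, only the first candidate in each pair would apply.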

In my case, once I removed the bad DIMM, the panics went away even under the heaviest load testing. 10 GB of RAM is still plenty for now.
 
I'm using the original 3x 1GB from Apple. I have never touched them, and they are placed as they should be (3: empty, 2: Apple DIMM, 1: Apple DIMM, 0: Apple DIMM).
The Apple technician replaced the "defective" DIMM, but it didn't make any difference. So it's clear to me that the problem isn't caused by the DIMMs; the bank itself is defective.
Now they are going to replace the processor tray, but not the processor. They said this was the last option.
So I told them about the built-in memory controller in the Nehalem CPU, and the success stories on the internet about replacing the processor. He said that they hadn't thought about this, and that they should try it. It seems like they don't really know what they're doing; I have to tell them myself what they should do...

I don't know the exact errors in the kernel log (and the computer is now in for repair, so I can't check), but it was something like error on channel 0 dimm 2...
And System profile shows for dimm 3: ECC-errors.

Well which one did they replace? Because Channel 2 refers to Slot 3, the same one giving you the ECC errors. The text "DIMM 0" does NOT refer to Slot 1; it just refers to the first DIMM in Channel 2. And since you only have one DIMM on Channel 2 (because Slot 4 would be DIMM 1, Channel 2), therefore Slot 3 is where your problem lies.

Does the error happen if you test with just two DIMMs, in Slots 1 & 2? What happens if you test in Slots 1, 2, and 4? Sounds like your Slot 3 is bad if yes, and just let them swap your board. The CPU being defective would seem to me to make a much more random effect than just Channel 2 always where the error is.
 
Today, my Mac Pro was "repaired" for the second time. This was the error:
AppleTyMCEDriver ReadCorrectable: Detected 1 errors on channel 2 dimm 0 package 0

So System Profiler showed ECC errors for DIMM 3.

The first time, they replaced DIMM 3, with no effect. Now they changed the processor board (without the processor, although I insisted they replace it too). I came home with the computer (having driven 50 miles for the third time), and AGAIN the ECC errors showed up after some CPU-intensive tasks.
I'm getting tired of this, but I think I've got no other option than taking it to the Apple Reseller again...
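
For anyone who wants to count these errors instead of scrolling through Console, here is a small Python sketch that tallies them per package/channel/DIMM. It assumes every relevant line looks like the single AppleTyMCEDriver message quoted above; other message formats may exist and would simply be ignored. The function name and the sample list are illustrative only.

```python
import re
from collections import Counter

# Pattern built from the one log line quoted in this post; treat it as an assumption.
LINE_RE = re.compile(
    r"AppleTyMCEDriver\s+(?P<kind>\w+):\s+Detected\s+(?P<count>\d+)\s+errors?\s+"
    r"on\s+channel\s+(?P<channel>\d+)\s+dimm\s+(?P<dimm>\d+)\s+package\s+(?P<package>\d+)"
)


def tally_ecc_errors(lines):
    """Sum the reported error counts per (package, channel, dimm)."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            key = (int(match["package"]), int(match["channel"]), int(match["dimm"]))
            totals[key] += int(match["count"])
    return totals


# Hypothetical sample: the quoted message twice, plus an unrelated line.
sample = [
    "AppleTyMCEDriver ReadCorrectable: Detected 1 errors on channel 2 dimm 0 package 0",
    "some unrelated kernel message",
    "AppleTyMCEDriver ReadCorrectable: Detected 1 errors on channel 2 dimm 0 package 0",
]

for (package, channel, dimm), count in tally_ecc_errors(sample).items():
    print(f"package {package} channel {channel} dimm {dimm}: {count} error(s)")
```

To use it on a real log, pass an open file object for kernel.log instead of the sample list; the exact location of that file on a given system is not stated in this thread.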
 
Flubber, I think we are in the same boat :(.
What I tried yesterday was testing the MP with a fresh OS X installation. Only Apple software installed, plus HandBrake to stress the CPU.
After 5 minutes of encoding, plus Safari playing a YouTube video, the KP appeared again.
The only thing I cannot understand is that I ran the test again and the system appeared to be OK. As a matter of fact, I have noticed that KPs usually happen once and not a second time in the same day's session.
This problem will be pretty tough to sort out, even for the Apple repairer!!! :(:(:(
I really feel frightened. I would like my MP to be replaced :rolleyes:
 
I think you have to go through a lot of trouble before you receive a new Mac Pro...
You can look in the console (utilities) at the kernel.log file. In my case, I can see ECC errors showing up if the computer runs hot. So you might be able to see the KP coming.

I went to the Apple Reseller for the third time today. They actually just don't know what to do anymore (last time they said that a new processor board was the last possibility). So they decided to blame me. They never test the Mac thoroughly, and if I test it (with movie encoding and benchmark tests), I can reproduce the errors after a few minutes. They told me that Geekbench probably generates these errors and that I should come back when I have a real problem, or call Apple. The guy said that if he ran those heavy tests on his own MacBook, it would also give some errors and that this was completely normal.
They also told me that the only thing they hadn't replaced was the processor. So I told them again about the built-in memory controller, and the guy just said that he "doesn't give a f*ck" about technical information because he wouldn't understand it :s
Anyway, I asked if I could test the Mac Pro in the shop, and this time with only movies, and not benchmark tests. After 15 minutes, the processor was at 82°C and the ECC errors showed up again, so I could prove them wrong for the third time. When I said that I had already driven 200 miles for a computer that still isn't fixed, the guy became angry. Quite funny, because he shouted in front of the whole shop that it wasn't his fault that they couldn't fix my broken Mac. Looks like they're getting quite desperate. The Mac is now in for repair again... I think it deserves its own soap opera :p
 
Hi guys,
Keep your fingers crossed: it seems I have solved this problem by replacing the CPU. I ran some tests and the machine now seems stable. After having read some posts around, I now think that if the RAM is OK, the fault must lie with the CPU.
I will keep you informed. I really hope to be out of the nightmare.
Thanks to you all for your support and advice.
Cheers, Max.
 
Try ...

... resetting the SMC: Turn off your Mac Pro, disconnect it from mains power and hold the power button for about 10 seconds. Plug it back in and restart.

Also do careful and thorough testing of your RAM slots. Start with one RAM bank and test each of your modules in it; it takes some time, but it's worth it.

If that doesn't work, call Apple for help, because this is definitely a hardware problem - what remains is to figure out whether it's temporary or permanent.
 
Quad G5 2.5 GHz - KP with 10.5.8

I recently upgraded to Leopard 10.5.8 using the 10.5.1 retail disk (archive and install) and added a Time Capsule. Since that day I have had an average of 2-3 KPs per week (mostly during the night when the machine is in sleep mode ?!? - the fans make a lot of noise when it gets a KP).

I have tried:
1) Reinstall a clean 10.5.8 (archive and install)
2) Eliminate all start-up items
3) Reset PRAM
4) Turn the Time Machine off
5) Do all tests with TechTool Deluxe and Rember (loop test on RAMs)
6) Repair permissions (using Disk Utility on the install disk), but there is always a lot of stuff reported... According to Apple you can safely ignore those messages :confused:

At this point I have 2 options in my mind:
a. Going back to Tiger :(
b. Looking at a new Intel machine :confused:

One more thing: since I upgraded to Leopard, it seems that the machine is not really going into sleep mode (I still hear the fans). I have to force the machine into sleep mode using the Sleep command from the main menu.

Thanks for any suggestions !
 
I bought my Mac Pro in June/July last year and all was well until I upgraded to Snow Leopard. I have since installed all the updates and would still get a KP about once, maybe twice, a week. The panic always pointed towards DIMM 0, Channel 2, which is Slot 3.

I have done all the memory testing and hardware testing and nothing has been thrown up, so I assume that all my memory sticks are OK.
Somebody did mention it could be a specific memory call that causes the KP, and maybe it wouldn't show up under test conditions.

I have since removed the memory stick in Slot 3, and for the last three weeks there have been no KPs.
 