Homy

macrumors 68030
Original poster
Jan 14, 2006
2,771
2,803
Sweden
Intel seems to be in big trouble. Gamers, developers and server providers are switching to AMD and Intel is not fully acknowledging the problem. Software fixes aren't solving the problems.

Alderon Games statement

[Image: Screenshot 2024-07-13 at 19.00.21.png]

"Out of the 1,584 problems found in the last ninety days, 1,431 were from CPUs made by Intel’s 13th and 14th generations, with AMD only showing up in 4 instances of identical errors."

"Regarding Oodle’s decompression mistakes, AMD CPUs accounted for 30% of the faults and Intel CPUs for 70%. This is already an issue on gaming PCs, but it’s a far bigger one on game servers."
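To put the first quote's numbers in proportion, here is a quick sketch of the arithmetic. The counts come straight from the quote; the `share` helper and the "not broken down" bucket are my own additions, since the statement does not say what the remaining reports were.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


TOTAL_REPORTS = 1584      # problems found in the last ninety days (from the quote)
INTEL_13_14_GEN = 1431    # attributed to Intel 13th/14th gen CPUs (from the quote)
AMD = 4                   # identical errors seen on AMD CPUs (from the quote)

# The quote does not say what the rest were; this bucket is an assumption.
not_broken_down = TOTAL_REPORTS - INTEL_13_14_GEN - AMD

for label, count in [
    ("Intel 13th/14th gen", INTEL_13_14_GEN),
    ("AMD", AMD),
    ("not broken down in the quote", not_broken_down),
]:
    print(f"{label}: {count} ({share(count, TOTAL_REPORTS):.1f}%)")
```

So roughly nine out of ten of those reports involve the affected Intel generations, while AMD accounts for well under one percent.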

 
I am glad I am deep in $AMD. I did consider opening a position in $INTC last week, since they seemed to have bottomed, but this could turn out to be really, really bad for them.

I don't think it will have such a big impact on the client business, as we haven't heard anything about this before now. But servers, oh boy. They require close to full uptime, and AMD is already beating them on performance, price, and perf/watt. This may be the final nail in the coffin. AMD is currently at 25% of the server market, up from 2% (!) in 2018. There's still a LOT to eat from Intel!
 
Apparently the problems start after several months of usage, so it can take a long time before people notice them and connect the errors to the CPUs.
 
Yup, that series of Intel CPUs is totally broken. For a year or even more we have been receiving a lot of reports of HandBrake crashing that are totally unreproducible. In the end it always turns out the CPU is one of those affected Intel CPUs.
 
There's a couple of reasons for this. The big one is that Intel allowed motherboard manufacturers to exceed Intel's recommended specifications regarding CPU power draw, so there are cases where motherboards are (on paper) allowing unlimited wattage to the CPUs. That can lead to excessive heat generation, which is the mortal enemy of any computer system. Fortunately, this can be addressed by going into the BIOS/UEFI settings and reestablishing those limits.

Jayz2Cents - Motherboard Defaults May Be Cooking Your CPU

The second reason (in my opinion) is that Intel has completely botched their implementation of P and E cores, which is causing additional issues under the hood of the CPU itself that may not be fixable with a microcode update. Since the 14th gen parts were just spec bumps of the 13th generation parts, the internals have not changed significantly enough to change that implementation. This would also help explain why Arrow Lake will be pushed so heavily by Intel upon release.
 
Recently, after a motherboard failure (5 years old), I upgraded to another Gigabyte board and put a 13th gen i5 into it. So far it's running nicely. I don't overclock. CPU cores seem under control at 40ish °C. 😳

Update: Make that 50-60 ish…
 
Have you watched Level 1 Tech's video on it? It doesn't seem like power limits are the problem and disabling E cores doesn't fix it either.

 
Are the Cove cores busted too?
Seems like at least some of the chips have stability problems in general that seem to get worse over time. The cause doesn't seem to be completely clear but you're not immune from problems even with a server W680 board or disabling E cores.
 
People running servers don’t run overclocked. Nor do boards meant for servers.
 
This has nothing to do with clock speeds or overclocking anything in the systems in question. For some reason, 13th and 14th gen motherboards often have default settings (often called optimized defaults) which allow for far more voltage than is recommended under Intel's own guidance, effectively removing both the power and current limits for the CPU. Where this causes issues is that when the CPU asks for more power, the motherboards will simply provide it, no questions asked. This is where the heat generation issues come from. The fix for many of these systems is to go into the BIOS and change those settings to Intel's defaults, which follow the power and current limits as designed. There are multiple motherboards from ASUS, MSI, ASRock, and Gigabyte which are known to have defaults which can cause these issues.
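For anyone who wants to check which limits are actually in effect after changing the BIOS settings, here is a minimal sketch. It assumes a Linux system with the intel-rapl powercap driver loaded, which exposes each package's limits as `constraint_N_name` and `constraint_N_power_limit_uw` files (values in microwatts). The helper name `read_power_limits` is my own, and the reported values should be compared against Intel's published figures for your specific CPU rather than taken as proof of anything.

```python
from pathlib import Path


def read_power_limits(zone_dir):
    """Return {constraint_name: watts} for one powercap zone directory."""
    zone = Path(zone_dir)
    limits = {}
    # Each constraint N is described by a pair of files:
    # constraint_N_name and constraint_N_power_limit_uw (microwatts).
    for name_file in sorted(zone.glob("constraint_*_name")):
        index = name_file.name.split("_")[1]
        limit_file = zone / f"constraint_{index}_power_limit_uw"
        if not limit_file.exists():
            # Skip constraints that have a name but no readable limit.
            continue
        name = name_file.read_text().strip()
        limits[name] = int(limit_file.read_text().strip()) / 1_000_000
    return limits
```

On a typical system the first CPU package would be read with `read_power_limits("/sys/class/powercap/intel-rapl:0")`; that path is an assumption and may differ or be absent on your machine. A long-term or short-term limit far above the CPU's rated power suggests the board is still running with the limits effectively lifted.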
 
This probably isn't true for server boards and is discussed in the Level 1 Tech Video. 13900K and 14900K CPUs are compatible with W680 server motherboards. Server motherboards are not doing overclocking. They'd be insane to. And yet they're still seeing crashes.
 
I saw a tweet that talked about the issue probably being tied to ILM and Thermals.
 
Yeah. It appears that some want to excuse Intel’s practices despite years of obvious engineering problems.
 
Again, I'm not sure why you keep talking about overclocking when the issue I'm describing is related to lifting/removing the power limits, completely independently of (and separate from) clock speeds. I have no idea why Intel chose not to enforce their own guidelines (that's on them), but there's a reason a lot of the motherboards in question have had recent BIOS updates to address this issue specifically - that part is entirely on the motherboard manufacturers for allowing that to happen in the first place.
 
Every time I have used AMD I had nothing but problems. Both CPU and GPU. I might get an AMD CPU soon though and try again. I’m keeping NVIDIA for my GPU.
 
What exactly do you think the increased power limits are doing? The consumer motherboards are giving the CPUs more power to go faster.
 
I have had relatively few issues with my AMD system, other than self-inflicted ones caused by it being the first PC I had built in almost 15 years.
 
When the processor asks for more power, the motherboards are delivering more power. Keep in mind that it's the motherboard defaults which are allowing this, and the defaults are only increasing power limits, not the core clock speeds or clock multipliers. The CPUs may be trying to run at their peak (burst) speed more often, but that's still not overclocking the processors.
 
Intel Turbo Boost is basically Intel-sanctioned automatic overclocking. They don't even guarantee you can hit the boost frequency. Increasing the power limit to extend boost time, so the CPU runs faster than the chip maker specified, is de facto overclocking. Core clock speeds are close to meaningless these days since CPUs throttle down at idle and boost under load. You can't tell me with a straight face that a PC that has had its "core" clock speed set 100 MHz faster but still limits how long all-core boost is active is overclocked, while a PC that had its core clock speed left alone but its power limits increased to make sustained all-core boost possible is not.

Overclocking does not just mean changing the clock multipliers directly.

And again, it doesn't seem to be power limits because the W680 boards don't seem to do that.
 