rMBP throttling and overheating

Discussion in 'MacBook Pro' started by theSeb, Dec 24, 2013.

  1. theSeb, Dec 24, 2013
    Last edited: Dec 24, 2013

    theSeb macrumors 604

    theSeb

    Joined:
    Aug 10, 2010
    Location:
    Poole, England
    #1
    Apologies for the dramatic title, but I have always found this to be an interesting topic. A week does not go by without a thread created by a concerned owner seeing his MacBook Pro hit around 100 degrees Celsius under load, which is pretty much the maximum operating temperature set by Intel.

    There are two camps that always reply in these threads. One says that it is ok, since the CPU is designed to handle these temperatures and as long as it does not shut down, the Mac is not overheating. The other camp says that running so close to the maximum operating temperature could impact reliability and also performance.

    I have been fascinated by the discussion, but wanted to bring something tangible and empirical to the table. I have in the past tried to find ways to see the current CPU clock rate in OS X, with varying degrees of success. By installing kexts and using FreeSMC with plugins, I managed to do this on some older Macs, like my 2011 MBA, but trying the same technique would cause the rMBP to crash. I then recently discovered a very handy little tool - Intel Power Gadget. Do note, if you run it yourself, that watching the live display does not give the most accurate results. It is advisable to use the tool's option to log its output to a file instead.


    In addition, I have long been aware of the pmset terminal command, but I never really spent much time looking at its output properly until now.

    Code:
    pmset -g thermlog
    The output when the CPU is "idle":
    Code:
    Note: No thermal warning level has been recorded
    24/12/2013 17:41:54 GMT  CPU Power notify
    CPU_Scheduler_Limit  = 100
    CPU_Available_CPUs  = 8
    
    CPU_Speed_Limit  = 100
    
    Keep the window open and run a CPU intensive application, like Handbrake. You will see notifications popping up like

    Code:
    24/12/2013 13:52:53 GMT       CPU_Scheduler_Limit      = 100
         CPU_Available_CPUs      = 8
         CPU_Speed_Limit      = 85
    I found this interesting and wanted to see if there is a simple formula to work out the clock rate using CPU_Speed_Limit. I found that it was as simple as it could be. If we assume that CPU_Speed_Limit is just a percentage, we can work out the current clock rate: clock rate = maximum clock rate x CPU_Speed_Limit / 100.

    I ran a Handbrake encode and piped pmset -g thermlog output to a file. I also enabled logging from Intel Power Gadget. I then analysed both logs and correlated the events.
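    A minimal sketch of how the saved pmset output could be turned into a list of timestamped events for that kind of correlation. It assumes the events look exactly like the excerpts quoted in this post (a DD/MM/YYYY timestamp followed by KEY = VALUE pairs); the format may differ on other OS X versions or locales, and parse_thermlog is a made-up helper name, not part of pmset.

```python
import re
from datetime import datetime

# Assumption: each event starts with a "DD/MM/YYYY HH:MM:SS GMT" timestamp and
# is followed by "CPU_... = number" pairs, as in the excerpts quoted in this post.
TIMESTAMP = re.compile(r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) GMT")
PAIR = re.compile(r"(CPU_\w+)\s*=\s*(\d+)")


def parse_thermlog(text):
    """Return a list of (timestamp, {key: value}) events from saved thermlog text."""
    events = []
    current = None
    for line in text.splitlines():
        stamp = TIMESTAMP.search(line)
        if stamp:
            # A new timestamp starts a new event.
            when = datetime.strptime(stamp.group(1), "%d/%m/%Y %H:%M:%S")
            current = (when, {})
            events.append(current)
        if current is not None:
            # Attach every KEY = VALUE pair on this line to the open event.
            for key, value in PAIR.findall(line):
                current[1][key] = int(value)
    return events


sample = """\
24/12/2013 13:52:55 GMT       CPU_Scheduler_Limit      = 100
     CPU_Available_CPUs      = 8
     CPU_Speed_Limit      = 82
24/12/2013 13:52:56 GMT       CPU_Scheduler_Limit      = 100
     CPU_Available_CPUs      = 8
     CPU_Speed_Limit      = 85
"""

for when, values in parse_thermlog(sample):
    print(when.time(), values["CPU_Speed_Limit"])
```

    Each event's timestamp can then be matched by hand or by script against the nearest samples in the Intel Power Gadget log.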

    In the case of a workload taxing all 4 physical cores, 100% should correspond to 3.4 GHz (on this CPU, Turbo Boost adds 0.8 GHz to the 2.6 GHz base clock with 4 cores active).
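    The percentage idea can be sketched in a few lines. This assumes that CPU_Speed_Limit is a plain percentage of the 3.4 GHz all-core turbo clock and that Intel Power Gadget reports the clock in 100 MHz steps; both are working assumptions of this experiment, not documented behaviour, and expected_clock_mhz is just an illustrative name.

```python
def expected_clock_mhz(speed_limit_percent, max_turbo_mhz=3400, step_mhz=100):
    """Estimate the clock rate implied by a CPU_Speed_Limit value.

    Assumptions: CPU_Speed_Limit is a plain percentage of the all-core turbo
    clock (3400 MHz for this CPU), and the result is rounded to the nearest
    100 MHz step, matching the granularity of the Power Gadget log.
    """
    raw_mhz_x100 = max_turbo_mhz * speed_limit_percent  # 100 times the raw MHz value
    # Integer rounding to the nearest step avoids floating-point surprises.
    return (raw_mhz_x100 + 50 * step_mhz) // (100 * step_mhz) * step_mhz


for limit in (100, 88, 85, 82):
    print(limit, expected_clock_mhz(limit))
```

    For the CPU_Speed_Limit values quoted in this post (82, 85 and 88) this gives 2800, 2900 and 3000 MHz.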


    I confirmed that the formula works by checking a couple of data points against the output from Intel Power Gadget

    Code:
    24/12/2013 13:52:55 GMT       CPU_Scheduler_Limit      = 100
         CPU_Available_CPUs      = 8
         CPU_Speed_Limit      = 82
    3.4 x 0.82 = 2.788 (rounded up = 2.8 GHz) which corresponds to the log from Intel Power Gadget.

    Then we see this

    Code:
    24/12/2013 13:52:56 GMT       CPU_Scheduler_Limit      = 100
         CPU_Available_CPUs      = 8
         CPU_Speed_Limit      = 85
    3.4 x 0.85 = 2.89 (rounded up to 2.9 GHz)

    So at 13:52:56 we expect to see a change from 2.8 to 2.9, which we do

    Code:
    13:52:56:110     2800
    13:52:56:160     2800
    13:52:56:209     2800
    13:52:56:259     2800
    13:52:56:310     2900
    13:52:56:360     2900
    13:52:56:410     2900
    13:52:56:460     2900
    13:52:56:510     2900
    13:52:56:560     2900
    13:52:56:610     2900
    13:52:56:659     2900
    Then from this point until 13:52:59 it ran at 2.9 GHz according to Intel Power Gadget. Correspondingly there were no events in pmset until this

    Code:
    24/12/2013 13:52:59 GMT       CPU_Scheduler_Limit      = 100
         CPU_Available_CPUs      = 8
         CPU_Speed_Limit      = 88
    So calculating what to expect in Intel Power Gadget

    3.4 x 0.88 = 2.992 (rounded up to 3.0 GHz)

    Here is the corresponding Intel Power Gadget log

    Code:
    13:52:59:760     2900
    13:52:59:810     2900
    13:52:59:859     2900
    13:52:59:909     3000
    13:52:59:959     3000
    13:53:00:010     3000
    So what happened during the test?

    [​IMG]

    The Handbrake encode had 3 different clips in the queue. For the purposes of average/min/max I removed the statistics when one encode finished and the next started to not skew the data (you can see those stop/start events in the chart above clearly)

    Encode 1
    Code:
    Average Frequency (MHz) 2859 
    Max Frequency (MHz) 3300 
    Min Frequency (MHz) 2600 
    Average Temperature (Celsius) 101 
    Max Temperature (Celsius) 105 
    Min Temperature (Celsius) 89  
    
    
    Encode 2
    Code:
    Average Frequency (MHz) 2727 
    Max Frequency (MHz) 2800 
    Min Frequency (MHz) 2500 
    Average Temperature (Celsius) 101 
    Max Temperature (Celsius) 104 
    Min Temperature (Celsius) 96  
    
    
    Encode 3
    Code:
    Average Frequency (MHz) 2678 
    Max Frequency (MHz) 2800 
    Min Frequency (MHz) 2500 
    Average Temperature (Celsius) 101 
    Max Temperature (Celsius) 105 
    Min Temperature (Celsius) 96  
    
    
    We see the average frequency going down over time. The maximum clock rate (3.4 GHz) was only hit right at the beginning of an encode, and those data points were deleted, as I mentioned already.
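    For anyone who wants to repeat the statistics, here is a rough sketch of the average/min/max calculation with an optional way to drop the stop/start samples. It only handles the two-column "time, MHz" form quoted in this post; the real Intel Power Gadget log file has more columns, and the function names and the skip rule are illustrative assumptions.

```python
from statistics import mean


def parse_samples(text):
    """Parse 'HH:MM:SS:mmm  MHz' lines, the two-column form quoted in this post."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            samples.append((parts[0], int(parts[1])))
    return samples


def summarise(samples, skip=lambda stamp: False):
    """Average/max/min frequency, ignoring samples whose timestamp `skip` rejects."""
    kept = [mhz for stamp, mhz in samples if not skip(stamp)]
    return {"average": round(mean(kept)), "max": max(kept), "min": min(kept)}


excerpt = """\
13:52:59:760     2900
13:52:59:810     2900
13:52:59:859     2900
13:52:59:909     3000
13:52:59:959     3000
13:53:00:010     3000
"""

print(summarise(parse_samples(excerpt)))
# Hypothetical exclusion window: drop everything logged during 13:53:00.
print(summarise(parse_samples(excerpt), skip=lambda s: s.startswith("13:53:00")))
```

    The skip function stands in for whatever rule is used to cut out the transition between two encodes.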

    Conclusions

    In the past Apple has sometimes permanently run CPUs below their base clock to reduce internal temperatures. Intel makes no guarantees about Turbo Boost - in other words, you should expect at least the base clock rate under load, which is what we mostly see in this experiment. Thermal throttling does happen, but even on a thermally constrained laptop, we see it only very seldom, and the clock only drops a small amount below the base clock rate.

    The operating system believes that everything was running fine, with no abnormal temperatures recorded, even though we were hitting 100-105 degrees Celsius. Pmset confirms this:

    Code:
    Note: No thermal warning level has been recorded
    It is possible that with better cooling Turbo Boost could be sustained for longer periods.

    A side note about Geekbench

    Armed with all of these tools and knowledge we can look at benchmark applications, like Geekbench. It has become the most popular and often quoted benchmark on these forums. People use it for various arguments. Intel has been concentrating a lot on the performance of the mobile CPUs and has, arguably, been ignoring the desktop performance. We see this trend in Geekbench, with mobile CPUs in the MacBook Pros snapping at the heels of the top of the range iMac and the iMac snapping at the heels of the Mac Pros. However, as I have always said, Geekbench is a sprint and shows mainly how fast the CPU is with maximum Intel Turbo Boost. It does not show how well the system will perform in real use.

    The problem is that pmset -g thermlog clearly shows that when we run Geekbench, we receive no events, therefore the CPU is running at 100% during this time with no speed limit. Geekbench’s workload is simply not strenuous enough to see the impact of thermal throttling and makes comparisons skewed towards CPUs with the maximum Turbo Boost clock rate. Basing purchasing decisions on this is not a good idea in my opinion.
     
  2. MacSumo macrumors regular

    Joined:
    Nov 26, 2013
    #2
    Thanks for the data. Now there is more (concrete) evidence that the rMBP is indeed throttling and overheating, as that's exactly what it's doing when under load.

    In conclusion, the rMBP:

    -overheats
    -throttles the CPU when actually under load (they should just market the i7 as a 2ghz chip)
    -suffers from yellow tinting
    -suffers from display uniformity issues
    -suffers from bad body design making the keyboard warm to touch, and hot to touch if near the number row and below the vent area.
     
  3. theSeb, Dec 24, 2013
    Last edited: Dec 24, 2013

    theSeb thread starter macrumors 604

    theSeb

    Joined:
    Aug 10, 2010
    Location:
    Poole, England
    #3
    I disagree. The CPU is running mainly as it should be. It is not limited to running at 2 GHz under load. The data proves that.

    Completely irrelevant to this. You have other threads to discuss this, so let's try and stick to the topic. :)

    I disagree with this completely. My 2009 MBP had this issue and it would make typing uncomfortable. The current MBPs do not suffer from this, based on what my hands tell me and what the temperature sensors report.

    [​IMG]
     
  4. Swampus macrumors 6502

    Swampus

    Joined:
    Jun 20, 2013
    Location:
    Winterfell
    #4
    MacSumo, did you actually read this post?

    theSeb, very thoughtful and well-executed experiment. Thanks for taking the time to share in such detail.
     
  5. MacSumo macrumors regular

    Joined:
    Nov 26, 2013
    #5
    Look at encode 3 - average frequency is almost down to 2.5ghz, and would likely go lower over time. If it's this bad right now, it's not going to magically get better after further wear.

    What part of the CPU throttling to not exceed 100C is hard to agree with? That's throttling. It's done because the CPU will shut off. I define this as overheating. Your CPU die is at 102C. That's okay with you? I wouldn't accept that. That's like driving a car near its max tolerable temperature and claiming it's not overheating since it isn't on fire yet.

    The rMBP is warm to touch (keyboard). You may disagree, but I tested over a dozen of these laptops (15" rMBP best spec).
     
  6. theSeb thread starter macrumors 604

    theSeb

    Joined:
    Aug 10, 2010
    Location:
    Poole, England
    #6
    I'll test again, but this time will encode a much longer BR mkv to see what happens over longer period. As long as the CPU frequency remains close to the base, then I would not say that the CPU is being dramatically throttled. I think you would be hard pressed to find a laptop that allows turbo boost to run at max frequency. It would be interesting to compare with a competitor in a similar test.
     
  7. Quu macrumors 68020

    Quu

    Joined:
    Apr 2, 2007
    #7
    I'm really happy with the performance of mine no complaints under heavy load. It's fast fast fast.
     
  8. sjinsjca macrumors 68000

    sjinsjca

    Joined:
    Oct 30, 2008
    #8
    A key bit of missing information is the ambient room temperature in which the tests were performed.

    This time of year, I heat my house to 64-66, maximum. We are used to this and find anything warmer to be stifling. So I'm uncomfortably aware that many/most people and businesses (and Lord knows, airlines) heat to 72-74 degrees or even more.

    I'd wager that a given machine will throttle less in my house than in some of those other places!
     
  9. xxcysxx macrumors 6502

    Joined:
    Oct 12, 2011
    #9
    I have the mid 2012 rMBP and I can verify what MacSumo said: the number row area gets hot to the touch. The keys themselves do not get hot to the touch, but the aluminum gets very hot and feels quite warm when hovering my fingers above the number keys area.
     
  10. theSeb thread starter macrumors 604

    theSeb

    Joined:
    Aug 10, 2010
    Location:
    Poole, England
    #10
    Heated house. 21 degrees Celsius.
     
  11. Swampus macrumors 6502

    Swampus

    Joined:
    Jun 20, 2013
    Location:
    Winterfell
    #11
    Good point, and even when trying to control for as many variables as possible, you'd probably still get some variations from machine to machine. Still, this was a neat experiment.
     
  12. leman macrumors 604

    Joined:
    Oct 14, 2008
    #12
    Thanks for running the tests! I am looking forward to seeing results with a longer stress phase. I assume you have the i7-4960HQ model? It would also be interesting to test other CPUs of the same model; after all, all CPUs are different...

    You are being quite stubborn and silly about this. The data clearly shows that the CPU runs at its nominal frequency or above most of the time. That is not throttling. Sure, there is not enough thermal headroom for Turbo Boost, but that should have been expected.
     
  13. theSeb thread starter macrumors 604

    theSeb

    Joined:
    Aug 10, 2010
    Location:
    Poole, England
    #13
    Yes, it would be fascinating to repeat this for a larger selection of CPUs. The model under test is actually the 2012 rMBP with the i7-3720QM.
     
  14. john123 macrumors 68000

    john123

    Joined:
    Jul 20, 2001
    #14
    Wow. You took an impressive, thoughtful, empirically rigorous set of data and completely distorted it to suit your (now-proved-invalid) agenda. Wow.
     
  15. ha1o2surfer macrumors 6502

    Joined:
    Sep 24, 2013
    #15
    I'm glad you took the time to find out this information. This excessive throttling gets worse when the dGPU is active (if your MacBook has one) since the heatsinks are combined.

    I think the next test to try is to take off the bottom cover and cool it as much as possible to avoid throttling and create a baseline. Then see how much it hurts performance when it's throttling.

    ----------

    I would be happy to run some tests on a similar CPU, a 3840QM (100 MHz higher base and boost clock). I run full Turbo Boost without throttling, plus a massive overclock, on a machine that is much, much smaller: a W110ER. Don't get me wrong, I don't want to start a war. I like clear-cut data like this and it's easier for people to see with a baseline.

    I also will run Intel XTU which will show when the CPU is throttling. I will create a throttling scenario to show as well.
     
  16. ha1o2surfer, Dec 24, 2013
    Last edited: Dec 24, 2013

    ha1o2surfer macrumors 6502

    Joined:
    Sep 24, 2013
    #16
    Here are my results for the baseline. Please keep in mind that this is not a MacBook Pro, but this computer has a very similar CPU whose Turbo Boost and non-Turbo clocks are 100 MHz higher. This is just to show how the CPU responds when not throttling and when throttling, and at which temps it starts to throttle. Below is a photo of my cooling system setup.

    [​IMG]


    These next two images show Intel's XTU running with some logging enabled.

    This image shows my CPU throttling. (I had to hold the fan back and block the vents to get this much throttling.)

    [​IMG]

    The max temp hit was 205F (about 96C) and the lowest clock it hit was 3.2 GHz, which ironically is still turbo boosting lol, but it is indeed throttling.

    This image shows a CPU that is being properly cooled.

    [​IMG]

    This is showing the wattage draw of the CPU. Notice that at its peak it used 60 watts!!
    [​IMG]


    When the MacBook Pro throttles into the 2.xx GHz range, the CPU power draw drops to around 25-30 watts, as shown below. C0% means the CPU is experiencing full load.
    [​IMG]

    So we can conclude that the heatsink is designed to dissipate the heat produced at the CPU's non-Turbo Boost clock, which is actually below Intel's spec. Maybe I am wrong, and if I am and someone has more information on the cooling system requirements Intel has for mobile CPUs, I would love to know!! :) I know tcase is not the same as tjunction. I am having trouble finding Intel's official explanation of how tjunction is handled, so this will have to do for now (as seen below).

    [​IMG]
    Link to Document: http://www.intel.com/content/dam/doc/white-paper/resources-xeon-measuring-processor-power-paper.pdf

    I would like to run the similar tests that OP did but I don't have the same 3 video files he used. But that shouldn't matter really, we are just talking about a CPU under load which is pretty generic. I will post videos below for anybody who is interested in what I did to load up the CPU and see the graph and data happen in real time.

    As far as performance hits go, the throttling shown in the videos reduces LinX throughput by around 20-30 Gflops. I usually get around 90 Gflops without throttling, so I would call that a noticeable hit in performance. Dropping it down to 2.5 GHz gets me 55 Gflops.

    3.4GHz: 85Gflops (Stock CPU Performance)
    3.7GHz: 93Gflops
    2.5GHz: 55Gflops
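
    To put the three LinX figures above side by side, here is a small sketch that expresses each one relative to the stock 3.4 GHz row. The numbers are copied from this post; the choice of metrics and the helper name relative_to_stock are assumptions made for illustration only.

```python
# Figures quoted in this post: clock (GHz) -> LinX throughput (Gflops).
results = {3.4: 85, 3.7: 93, 2.5: 55}
stock_ghz = 3.4  # the "Stock CPU Performance" row


def relative_to_stock(ghz):
    """Return (clock change %, throughput change %) against the stock row."""
    clock = (ghz / stock_ghz - 1) * 100
    flops = (results[ghz] / results[stock_ghz] - 1) * 100
    return clock, flops


for ghz in sorted(results):
    clock, flops = relative_to_stock(ghz)
    print(f"{ghz} GHz: clock {clock:+.1f}%, throughput {flops:+.1f}%, "
          f"{results[ghz] / ghz:.1f} Gflops per GHz")
```

    On these three data points, the drop to 2.5 GHz costs roughly 35% of the throughput for a clock that is roughly 26% lower, so throughput falls somewhat faster than the clock does.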


    Anyways, I have nothing against Apple products, and don't tell my boss this, but I'd buy one if they fixed these issues (he loves Macs and I love PC software). I love their iPads, iPhones and Mac OS X and would recommend them to family and friends, but I, personally, just can't use one until I know I can use the performance I paid for.

    EDIT: Videos

    Throttling Video: http://www.youtube.com/watch?v=kJC2GzYqG28&feature=youtu.be
    Non Throttling Video: http://www.youtube.com/watch?v=rYDhVzH7Okg&feature=youtu.be
     
  17. niblet macrumors newbie

    Joined:
    Dec 24, 2013
    #17
    My rMBP does get warm, when using CPU and GPU intensive tasks, the body gets very warm, and the keyboard does too, but the keyboard has never been "hot to the touch" in my experience, even when the CPU is running all four cores at 95 degrees celsius.
     
  18. AirThis macrumors 6502a

    Joined:
    Mar 6, 2012
    #18
    Nice little experiment you did there. I tried on my side and obtained some interesting results. I'm using a 3720QM clocked at 2.6Ghz.

    To test if your hypothesis was globally correct, I took a Blender model with a high mesh density and then set the Catmull-Clark subsurface render/view levels to 5. I then checked Blender's memory usage to ensure that the subsurface modifier had effectively interpolated a large number of vertices. Blender's memory footprint jumped from 600 MB to 10.5 GB, thereby confirming that my model was going to require much more computational resources than usual.

    After that, I opened 2 viewports and rendered them simultaneously with Cycles. After letting Blender run for 20 minutes, I checked the CPU idle time, user time, and individual CPU usage of Blender. The idle time was consistently 0.0%, showing that the CPU resources were being driven to the limit. The CPU user time was consistently over 90% and the individual CPU usage of Blender was consistently over 750%.

    I monitored global CPU stats like this:

    top -l 30000 | grep -i "CPU usage"

    I monitored Blender's individual CPU usage with this:

    top -s5 -o cpu -n 15 -l 30000 | grep -i blend

    Both commands gave me readings at regular intervals.

    SMC Fan Control was reporting a temperature of 103C, and the fans were spinning at 5500 and 5900 RPM respectively. The values I got for CPU_Speed_Limit were an alternation of 94s and 97s. This was extremely consistent, even after running the test for 1 hour. After that, I decided to encode a 720p video with Handbrake (in addition to the render in progress), but CPU_Speed_Limit still only jumped down to values ranging between 91 and 94.

    If we use the calculations given above, we obtain a machine running at a solid 3.2Ghz worst case scenario. The Turbo boost for the 2.6Ghz processor is 3.6, so we obtain the following:

    3.6 x 0.91 = 3.276 Ghz

    For the Blender-only test, we had a lowest value of 94, and so the result is closer to 3.3 GHz. This is confirmed by the Intel Power Gadget as shown in the illustration below. More than anything else, I'm extremely impressed by the results. We see minor throttling on a test which is probably one of the most computationally heavy tasks you could throw at a computer, games excepted. And the keyboard stayed lukewarm during the entire test. Earlier I ran the exact same test on my cousin's Dell Precision Mobile Workstation and it was definitely hot to the touch.

    What I'm wondering here is how you managed to get lower values for the CPU_Speed_Limit test. What types of files did you encode?
     

    Attached Files:

  19. Doward macrumors 6502a

    Joined:
    Feb 21, 2013
    #19
    More data to add to this discussion:

    Yes, your MacBook Pro is running too hot

    30 Days Post Lapping/Arctic Silver 5

    You've found mostly the same information I had:

    1. Apple's thermal profile for their laptops could be improved

    2. By keeping temperatures lower, Intel's turbo boost is allowed to run at a higher clock rate, delivering higher performance, for longer

    3. Post lapping / AS5, no machine I've performed this 'surgery' on has had any issue with maintaining a 'turbo boost' situation above stock frequency for any length of time. As an example, my own 2.5Ghz system maintains 2.8-3.0ghz indefinitely (have tested over 3 hours encoding video).

    4. This thermal deviation (due to poor thermal transfer from CPU/GPU die to the heatsink assembly) is also a very likely cause for the dead logic boards found after a few years in many MacBook Pros. Increasing the thermal transfer rate to the heatsink allows for a more even distribution of heat across the logic board, reducing the strain on the solder joints.

    Now, I understand you are stating that your results show that throttling does happen, but not often.

    I maintain my contention that running at 100% load, at stock or slightly under stock frequencies and at 100-105C, is simply not healthy for the system when compared to running up to 20% faster than stock (maintaining Turbo Boost) with a maximum temperature under 90C. Most systems I've worked on stay under 85C under maximum load, but what can I say - the 2.5 GHz Sandy Bridge was no cool running chip ;)

    All in all, nice work, theSeb!
     
  20. sjinsjca macrumors 68000

    sjinsjca

    Joined:
    Oct 30, 2008
    #20
    Yes, agreed. Just, let's be clear that there are un-obvious contributors. In that vein, I should add: altitude will make a difference as well. Your laptop will run hotter, spin up its fans sooner and throttle earlier at cruise in an airliner than it will at sea level at the same ambient temperature, because the air is thinner.

    One thing I'd like to see: if one spins up the fans preemptively via a utility such as smcFanControl (https://www.macupdate.com/app/mac/23049/smcfancontrol), how much does that extend the unthrottled operation of the machine? That might be valuable to know and would speak to the thermal conductivity and efficiency of the CPU/GPU packaging and heat sink arrangements. Perhaps spinning up the fans prophylactically ahead of a big CPU/GPU-intensive task might help it finish faster?

    Another issue: Apple is often criticized for putting too much (or the wrong kind of) thermal paste between the CPU/GPU and the heat sinks. [EDIT: I see Doward has raised this issue here.] Some folks go in (at risk to their warranty, I'd imagine) and skim off the excess and/or replace the paste with another type, then bolt everything back together, with reported profound benefits in CPU temperature. So... how does that impact the heating-and-throttling behavior in severe tests such as these?

    ...So many questions... ;-)
     
  21. ha1o2surfer macrumors 6502

    Joined:
    Sep 24, 2013
    #21
    Your machine seems to be performing better than most MacBook Pros out there. Very nice!! Notice the power draw is around 40 watts, that is a good sign!
     
  22. Doward macrumors 6502a

    Joined:
    Feb 21, 2013
    #22
    I'll be happy to answer that question.

    Attached are screenshots and the Intel Power Gadget log of a 30 minute encode I just completed.

    My 2.5 GHz quad-core i7 Sandy Bridge held CPU_Speed_Limit at 100% and NEVER dropped below it, which is 3.30 GHz in my case.

    In other words, yes, profound effects, and the impact is immense.

    Over a long encode, my 3 year old Sandy Bridge is faster than a brand new 2013 system.
     

    Attached Files:

  23. john123 macrumors 68000

    john123

    Joined:
    Jul 20, 2001
    #23
    Look, I think you did excellent empirical work in your thread, but I'm still not seeing any evidence to support this claim. Is there anything tangible you can provide here, or is this really just speculation?
     
  24. theSeb, Dec 26, 2013
    Last edited: Dec 26, 2013

    theSeb thread starter macrumors 604

    theSeb

    Joined:
    Aug 10, 2010
    Location:
    Poole, England
    #24
    I am using the exact same model CPU. The turbo boost is only 3.4 when all 4 cores are active.

    Here is an old post of mine about Turbo Boost

    I was using fairly typical clips for the encode (576p). However, I did have a couple of apps running in the background, like Chrome. I am going to try again with a 1080p BR quality clip, but with nothing running in the background except the monitoring stuff.

    Can you pipe your pmset -g thermlog output to a file and share it? It would be interesting to analyse.

    Edit to add: I simply cannot reproduce the much better results that you have posted. Since we're using the same CPU, the only other conclusion is that your thermal paste was applied better than mine.

    I suppose I could reset the SMC and try again, or I could resort to Intel HD graphics (I run with the discrete 650M at all times). That should make a difference, especially since I see your CPU uses more watts.
     

    Attached Files:

  25. ha1o2surfer macrumors 6502

    Joined:
    Sep 24, 2013
    #25
    His CPU could be using more power because, as you stated, you use your dGPU full time and the adapter is only 85 watts. I would try it with your dGPU off, since running the dGPU makes the temps way worse.
     
