Much has been said here about the performance (or lack thereof) of the 2016 MBP models. Many tests (e.g. Geekbench, which has sadly become the gold standard of Mac performance benchmarks) suggest that the Skylake CPUs are barely faster than the older Haswells, and in some cases even slower. Now that I have finally received my new 2016 model to replace the 2015 one, I was curious whether there would be any difference in CPU performance for what I do (spoiler: oh boy, there is, and how). As I spend most of my time doing statistical simulations in R, that appeared to be a good domain for testing. I therefore ran a computationally intensive randomised statistical test and evaluated the results. Read below for details.

Machines: I am testing a mid 2015 MBP equipped with the 2.5 GHz i7-4870HQ CPU and a late 2016 MBP equipped with the 2.9 GHz i7-6920HQ CPU. Yes, this is an old mid-tier versus the newest top-tier, but that's what I have. Of course we expect the 6920HQ to be faster. However, if we look at Geekbench results, the difference between the two CPUs doesn't appear to be that large: in the ballpark of 15% for single-core scores and 7% for multi-core scores. Also, the max turbo boost of both CPUs is fairly comparable (a difference of only 100 MHz).

Test: I ran a simple randomised chi-squared test with 50000 Monte Carlo replicates in R. The implementation of this test is single-threaded, so it uses only one core and therefore measures single-core performance. To estimate multi-core performance, I ran 4 copies of this test at the same time, on 4 different cores. Doing this of course won't complete a single run faster, but it should show the penalty the CPU imposes for heavy multi-threaded work: as we know, if all cores are loaded, the maximum clock will be lower than when only one core is loaded.
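For readers unfamiliar with this kind of test, here is a minimal sketch of a Monte Carlo chi-squared test of independence. It is written in Python purely for illustration and is not the R code used for the benchmark. The function names `chi2_stat` and `mc_chi2_test`, the toy data and the seeds are all made up for this example. It shuffles one variable to generate tables with fixed margins, which is an assumption about how such a test can be built rather than a description of R's internals.

```python
import random
from collections import Counter


def chi2_stat(rows, cols):
    """Pearson chi-squared statistic for two paired label sequences."""
    n = len(rows)
    row_tot = Counter(rows)
    col_tot = Counter(cols)
    observed = Counter(zip(rows, cols))  # missing cells count as 0
    stat = 0.0
    for r, rt in row_tot.items():
        for c, ct in col_tot.items():
            expected = rt * ct / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def mc_chi2_test(rows, cols, replicates=50_000, seed=1):
    """Monte Carlo p-value: shuffle one variable, keeping both margins fixed."""
    rng = random.Random(seed)
    observed = chi2_stat(rows, cols)
    shuffled = list(cols)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(shuffled)  # breaks any association between rows and cols
        if chi2_stat(rows, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (1 + hits) / (replicates + 1)


if __name__ == "__main__":
    data_rng = random.Random(42)  # toy data, unrelated to the benchmark
    rows = [data_rng.choice("abc") for _ in range(200)]
    cols = [data_rng.choice("xyz") for _ in range(200)]
    stat, p = mc_chi2_test(rows, cols, replicates=2_000)
    print(f"chi2 = {stat:.3f}, Monte Carlo p = {p:.4f}")
```

Each replicate rebuilds a contingency table from shuffled data, which is where the scattered memory accesses and unpredictable branches come from.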
Basically, when running the same task on 4 cores, we expect each of the cores to be slightly slower than when running it on a single core only. But of course, at the same time, the 4 cores together will do a much larger amount of work in the same timespan.

Note: this test is fairly close to what is used in real-world analysis. Despite its simplicity, it involves complex memory access patterns, cache misses, branch mispredictions etc. Also, it only exercises the CPU and RAM subsystems; disk speed and GPU are not performance factors. This work is very different from image processing (which most other tests seem to focus on), because it does not involve linear memory access and highly optimised vector code.

Methodology: both computers ran the same version of R and OS X, and they sat in the same room on the same desk (to control for ambient temperature). Wi-Fi and Bluetooth were off. Other apps were closed. Each test (per CPU and single/multi-core) was run 100 times in succession, to make sure that we get reliable results and to see whether any throttling is going on. The random number generator was reset to a constant seed before each run so that every run does the same work. I did not monitor the temperature or the CPU clocks.

Results: the graph with runtimes is attached. First of all, the 2016 system ran the entire benchmark in 1718 seconds, while the 2015 system ran it in 2158 seconds. In other words, the 2015 system took 1.26 times as long to perform the same task. The single-core time is 8.08 s on average for the 2016 system and 9.92 s on average for the 2015 system. The 2015 system therefore needs 1.23 times as long to run the single-threaded R randomised chi-squared test on average. Results for both CPUs show some minor fluctuations between runs. The density distributions of the times appear to be normal mixtures, with a major component around the sample mean and a small number of significantly slower runs; nothing alarming here.
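The measurement procedure described under Methodology (repeat the same task many times, reset the random number generator to a constant seed before each run, record the wall-clock time) can be sketched roughly as follows. This is a Python illustration, not the R script behind the numbers above. `workload` is a hypothetical stand-in task, and `benchmark`, the seed value and the run counts are assumptions made for the example.

```python
import random
import statistics
import time


def workload(rng, n=20_000):
    """Hypothetical stand-in for a CPU-bound task; not the author's R test."""
    total = 0.0
    for _ in range(n):
        total += rng.random() ** 2
    return total


def benchmark(task, runs=100, seed=12345):
    """Time `runs` repetitions, reseeding before each so the work is identical."""
    timings, results = [], []
    for _ in range(runs):
        rng = random.Random(seed)  # constant seed: every run does the same work
        start = time.perf_counter()
        results.append(task(rng))
        timings.append(time.perf_counter() - start)
    return timings, results


if __name__ == "__main__":
    timings, results = benchmark(workload, runs=20)
    assert len(set(results)) == 1  # same seed, so every run returns the same value
    print(f"mean {statistics.mean(timings):.4f} s, "
          f"sd {statistics.stdev(timings):.4f} s, "
          f"slowest/fastest {max(timings) / min(timings):.2f}")
```

Keeping the per-run timings rather than only the total is what makes it possible to look at their distribution and to spot throttling as a drift over successive runs.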
Multi-threaded results are very interesting. First, as expected, the runs are generally slower if all 4 cores are utilised. That's because the CPU can't maintain high turbo if all cores are loaded. On average, Skylake needs 9.03 seconds per core per run in this scenario, while the Haswell needs 11.58 seconds. The full-load penalty is thus 1.12 times as long for Skylake and 1.17 times as long for Haswell on average, compared to the time needed in the single-core scenario. In addition, Haswell on average needed 1.28 times as much time as the Skylake in the multi-core scenario.

Discussion: for this particular task, the top-tier Skylake outperforms the mid-tier Haswell by a factor of 1.2-1.3. This is a very substantial increase in performance. While it might appear unfair to compare mid-tier to top-tier, I'd like to note that the measured performance differences are much more dramatic than what synthetic mixed tests such as Geekbench suggest. I would guess that the difference between same-tier CPUs is still most likely in the ballpark of 1.1-1.2. In addition, Skylake is more efficient at running code that takes advantage of multiple cores: its full-load penalty is smaller (1.12 versus 1.17), which suggests it can maintain relatively higher turbo boost frequencies than the Haswell. In fact, with these two CPUs, the Skylake per-core performance under full 4-core load is still higher than what Haswell manages in single-threaded mode (9.03 s versus 9.92 s per run). Also, neither laptop shows any sign of substantial thermal throttling under prolonged load (there is some heavy fluctuation over the first few runs; I guess that's where the CPU tries to find its "comfortable" spot). Subjectively, the 2016 MBP was cooler and quieter over the entire ordeal.

Conclusions: for CPU-based work with complex memory access and branching patterns, which goes beyond image processing (which other tests seem to focus on), the Skylake CPU offers some very noticeable performance improvements over last year's model. It is also more efficient when all cores are under heavy load.
In addition, I have to say that I am very impressed by the new cooling system. With the laptop being impossibly thin, I was worried that it would be prone to throttling under load. No chance. This thing runs relatively cool and quiet even when the CPU and GPU are heavily taxed, and does so for hours.
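For anyone who wants to check the ratios quoted in the results, they can be re-derived from the reported averages with a few lines of Python. The input numbers are the ones from the post. The `ratio` helper and the dictionary names are just for this example, and the last two figures (aggregate throughput of 4 concurrent copies relative to a single copy) are derived here and were not measured separately.

```python
# Average seconds per run and total benchmark time, as reported in the post.
single = {"skylake": 8.08, "haswell": 9.92}
multi = {"skylake": 9.03, "haswell": 11.58}
total = {"skylake": 1718, "haswell": 2158}


def ratio(a, b):
    """Ratio of two times, rounded to two decimals like the figures in the post."""
    return round(a / b, 2)


# Haswell time relative to Skylake time.
print("whole benchmark:", ratio(total["haswell"], total["skylake"]))
print("single-core:    ", ratio(single["haswell"], single["skylake"]))
print("multi-core:     ", ratio(multi["haswell"], multi["skylake"]))

# Full-load penalty: per-run time with 4 copies running vs a single copy.
for cpu in ("skylake", "haswell"):
    print(f"{cpu} full-load penalty:", ratio(multi[cpu], single[cpu]))

# Derived, not measured: work done by 4 concurrent copies relative to 1 copy.
for cpu in ("skylake", "haswell"):
    print(f"{cpu} 4-copy throughput gain:", ratio(4 * single[cpu], multi[cpu]))
```

The derived throughput figures put a number on the earlier remark that 4 loaded cores are each a bit slower but together get far more work done.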