whitedragon101 (original poster):
The quoted figure for Broadwell is a 40% graphics performance improvement over Haswell. However, this little titbit has been floating around the internet:

"At 1 GHz, Broadwell is projected to have 2 TFLOPS of shader performance. Haswell's 40 EU (160 ALUs) GT3/GT3e, by comparison, has 832 GFLOPS of shader performance (SP) at 1.3 GHz."

http://hardware.forumsee.com/a/m/s/p12-29521-0104887--broadwell-gt4-info.html

That seems pretty amazing. Not sure exactly how to read this, though. Does shader performance = total performance? 2 TFLOPS vs 832 GFLOPS is about 2.4x, i.e. a 140% improvement. Is this really a 140% performance improvement, or are there other areas of the GPU, for tessellation, physics etc., that could affect frame rate?
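
(Side note: the 832 GFLOPS figure checks out as a theoretical peak if each Haswell EU can issue two 4-wide FMAs per clock, i.e. 16 FLOPs/EU/clock - that per-EU number is my assumption, not from the article. A quick Python sketch of the arithmetic, and of the actual percentage gain:)

```python
# Theoretical peak single-precision throughput: EUs x FLOPs per EU per clock x clock (GHz)
def peak_gflops(eus, flops_per_eu_per_clock, clock_ghz):
    return eus * flops_per_eu_per_clock * clock_ghz

haswell_gt3 = peak_gflops(40, 16, 1.3)          # 832.0 - matches the quoted figure
broadwell_gt4 = 2000.0                          # the rumoured 2 TFLOPS

print(haswell_gt3)                              # 832.0
print((broadwell_gt4 / haswell_gt3 - 1) * 100)  # ~140.4 - about 2.4x, a +140% jump
```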
 
That seems pretty amazing. Not sure exactly how to read this, though. Does shader performance = total performance? [...]

This is a very complex topic... First of all, the total shader performance is a theoretical peak of computation throughput. For example, the Haswell CPU is also capable of several hundred GFLOPS - but only when using a specific instruction under specific circumstances; no real-world algorithm can achieve such efficiency. Second, the weak point of the iGPU is memory bandwidth. Again, the iGPUs used in Haswell are already on par with or even more powerful than current mid-range GPUs like the 750M - but paired with slow DDR3, which they have to share with the CPU, real-world performance quickly falls off. You can see it very well in the benchmarks: the iGPU does a very good job at lower resolutions and then quickly degrades as the resolution increases (higher resolutions mean more pixel data to copy). The L4 eDRAM cache helps, but it does not solve the problem.

So yes, it's quite difficult to see what the impact will be. I agree with VanillaCracker that 30%-50% sounds realistic, depending on how memory-bandwidth-intensive the task is. For computation (OpenCL), though, this will be a beast. If Broadwell can indeed deliver that performance without increasing the thermal envelope - and if Intel is able to further improve the memory situation - this could indeed spell the death of mid-range mobile graphics. Low-end mobile graphics is already dead.
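
To put rough peak numbers on the bandwidth point (standard spec-sheet figures, and I'm assuming the GDDR5 variant of the 750M here):

```python
# Peak memory bandwidth = bus width in bytes x transfer rate in GT/s
def peak_bandwidth_gbs(bus_bytes, gt_per_s):
    return bus_bytes * gt_per_s

ddr3_1600_dual = peak_bandwidth_gbs(16, 1.6)  # 25.6 GB/s, shared between CPU and iGPU
gddr5_750m     = peak_bandwidth_gbs(16, 5.0)  # 80.0 GB/s, all for the dGPU

print(ddr3_1600_dual, gddr5_750m)             # 25.6 80.0
```

So even before any sharing with the CPU, the iGPU starts with roughly a third of the bandwidth.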
 
This is a very complex topic... First of all, the total shader performance is a theoretical peak of computation throughput. [...]
That isn't really true. A lot of these benchmarks increased the detail settings along with the resolution, so you can't really conclude whether it is the settings or the resolution that troubles Iris Pro.
My own testing in StarCraft 2 suggests that Iris Pro has no issue with high resolutions or high-res textures. There seems to be plenty of ROP performance, and textures are mostly VRAM-dependent but they are fed into the GPU fast enough. Raising the resolution at the same medium settings leads to the same relative performance degradation on each card (the 750M is always about 30% faster). It does not suddenly drop lower, which it should if there were a bottleneck.
Some shader settings take a bigger hit on the Iris Pro than on the 750M; resolution doesn't appear to be an issue. Nvidia, with their architecture and drivers, are simply more efficient at handling some of the higher-quality settings.
It has been the same with AMD: some settings take a bigger or smaller hit on AMD or on Nvidia GPUs, and likewise on GPUs from different generations of each company.
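
Here is the sanity check I mean, in Python - the fps numbers are made up purely to illustrate the method, not measurements:

```python
# If the 750M-to-Iris-Pro fps ratio stays flat as resolution rises, both chips are
# scaling the same way; a ratio that widens at higher resolutions would point to a
# bandwidth or ROP bottleneck on the iGPU. (Invented numbers for illustration.)
results = {
    "1366x768":  {"iris_pro": 60, "gt750m": 78},
    "1600x900":  {"iris_pro": 47, "gt750m": 61},
    "1920x1080": {"iris_pro": 36, "gt750m": 47},
}

for res, fps in results.items():
    print(res, round(fps["gt750m"] / fps["iris_pro"], 2))  # ~1.3 at every resolution
```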
 
This is a very complex topic... [...] if Intel is able to further improve the memory situation, this could indeed spell the death of mid-range mobile graphics. [...]
Right, and hopefully DDR4 will help do the trick, or at least narrow the gap to the point where it doesn't really matter when paired with the L4 cache.
 
Bottom line - whether by physics or shrewd business sense, Broadwell isn't going to make Haswell feel much more than a single generation older.
 
Is it confirmed we will get DDR4 with Broadwell?
No, Broadwell won't have DDR4; it is still DDR3. Intel intends to include eDRAM on more CPUs, probably including many of the lower-power dual cores.

Many other sources about that GT4 Broadwell also say that the GT4 is only for all-in-one PCs - stuff like the iMac, not mobile notebooks. It will probably have a 65W or higher TDP.
The GT3 for mobile will probably stick to more reasonable compute performance. You usually get 30% efficiency gains from a new process; with a bit of architectural improvement on top, they hit 40% easily. Gains will be bigger on all the GPUs that used to come without eDRAM, but for Iris Pro in a 47W quad I wouldn't keep my hopes up for a 2 TFLOPS GPU. BTW, that would be more than the PS4 (whose GPU is about 1.84 TFLOPS).
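
To put that skepticism into numbers (the 30% process / 8% architecture split is just the rule of thumb above, nothing official):

```python
haswell_gt3e = 832.0        # GFLOPS: 40 EUs at 1.3 GHz
rumoured_gt4 = 2000.0       # GFLOPS: the 2 TFLOPS rumour

typical_gain = 1.30 * 1.08  # ~1.40: process shrink plus modest architecture work

print(haswell_gt3e * typical_gain)  # ~1168 GFLOPS - what a realistic ~40% gain buys
print(rumoured_gt4 / haswell_gt3e)  # ~2.4x - what the rumour would demand at 47W
```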
 
Bottom line - whether by physics or shrewd business sense, Broadwell isn't going to make Haswell feel much more than a single generation older.

You make a strong point here. In the end, how often do we really see "ground-breaking" improvements/updates that really make the previous generation "obsolete"? Almost never.

You make me want to stop waiting for what's next and jump on a Haswell rMBP :'(
 
[...] You make me want to stop waiting for what's next and jump on a Haswell rMBP :'(

I hear ya, man. But yeah, if the market is willing to pay for a 10% improvement, why develop a 50% improvement at the same or greater cost?

Either way, if there's a killer feature on the next rMBP, I'll sell my Haswell to get it.
 
I hear ya, man. But yeah, if the market is willing to pay for a 10% improvement, why develop a 50% improvement at the same or greater cost? [...]

Well, there's also the issue that if they bring out that 50% improvement and someone writes software to take advantage of it, they're cutting themselves off from the huge majority who won't upgrade their hardware for a few years regardless.
 
Well, there's also the issue that if they bring out that 50% improvement and someone writes software to take advantage of it [...]

Absolutely... it's all about value, demand, and profit.
 
[...] Either way, if there's a killer feature on the next rMBP, I'll sell my Haswell to get it.

You've got a Haswell Mac(Book)? It's early in Europe, but I'm pretty sure your signature doesn't show anything Haswell. What did you buy? :eek:

Well, there's also the issue that if they bring out that 50% improvement and someone writes software to take advantage of it [...]

Very good point!
 
You've got a Haswell Mac(Book)? [...] What did you buy? :eek:

It's an early Christmas present that I (had to) know about. :D :D

2.3/512/16/750m
 
Again, the iGPUs used in Haswell are already on par with or even more powerful than current mid-range GPUs like the 750M

What?

Uhm... no... please...

Iris Pro has a hard time catching up even to the 650M with DDR3 memory.

So it's not a memory bandwidth issue. It's an architecture and driver issue.

I thought we've been over this before...
 
The quoted figure for Broadwell is a 40% graphics performance improvement over Haswell. [...]

This is old news.
 
Iris Pro has a hard time catching up even to the 650M with DDR3 memory. So it's not a memory bandwidth issue; it's an architecture and driver issue.

Well, the only 'reliable' benchmarks I have access to are the AnandTech ones. In those, the 47W Iris Pro is roughly comparable to the 640 (DDR3 RAM) on 5 of the 8 tests conducted. It is significantly below the 640 in BioShock Infinite, Sleeping Dogs and BF3; I don't know what those games do that Iris Pro doesn't like. In Crysis Warhead, the Intel part is actually significantly faster.

On the other hand, Intel is clearly leading in the OpenCL benchmarks. I think this shows that it is more efficient at utilising its computational resources where execution does not strictly follow the SIMD pattern. Maybe the lower latency of the eDRAM is also playing a role here; e.g. I believe the raytracing benchmark does not access memory in a very GPU-friendly way. Then again, my understanding of GPU architecture is very limited, so I really shouldn't talk much about it :)

That isn't really true. A lot of these benchmarks increased the detail settings along with the resolution, so you can't really conclude whether it is the settings or the resolution that troubles Iris Pro.

Yes, I agree, the data is sparse. I did have the impression, however, that the performance gap between the 650M and Iris Pro at higher settings was more or less proportional to the gap between the 650M and the 640 (slower RAM). One would need the benchmark data in table format, preferably over multiple runs, to do some quick comparisons - something like the sketch below.
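
This is the comparison I have in mind - the fps values are invented just to show the shape of it:

```python
from math import prod

# Hypothetical per-game average fps: (Iris Pro 5200, GT 650M, GT 640 DDR3)
games = {
    "Game A": (24, 31, 23),
    "Game B": (29, 44, 30),
    "Game C": (37, 49, 36),
}

def geomean(ratios):
    return prod(ratios) ** (1 / len(ratios))

iris_vs_650m = geomean([iris / m650 for iris, m650, _ in games.values()])
g640_vs_650m = geomean([g640 / m650 for _, m650, g640 in games.values()])

# If these two ratios track each other, the Iris Pro gap looks like a RAM-speed gap;
# if Iris Pro falls well below the 640, something else is going on.
print(round(iris_vs_650m, 2), round(g640_vs_650m, 2))
```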

I do agree with the two of you that it's quite tricky; my post was too simplistic and naive.
 
More interested in Maxwell, tbh

I'm hoping for either Broadwell or Maxwell to knock it out of the park :)

Either will be fine :)

The variability of the information - and of how to interpret it - puts both somewhere between a really good boost in mobile performance and epic PlayStation 4-like performance.

If I wish very hard, I would like the latter ;)
 
Yes, I agree, the data is sparse. I did have the impression, however, that the performance gap between the 650M and Iris Pro at higher settings was more or less proportional to the gap between the 650M and the 640 (slower RAM). [...]
I think the AnandTech benches have another problem: they ran on sample hardware before it was even sold, with old drivers and no bug fixes in any of the games, so they offer a vague picture.
I find it odd that no big site has since done an in-depth test on hardware you can now actually buy. There was one notebook on Notebookcheck and that is it (and I am not a fan of those Clevo barebones).
The only halfway reliable numbers come from people in forums who tested their own hardware, not from sites that just get paid for writing articles and summing up such info.
Unless you have two MBPs, you cannot even compare both GPUs under Windows yourself.
 
I think the AnandTech benches have another problem: they ran on sample hardware before it was even sold, with old drivers and no bug fixes in any of the games, so they offer a vague picture. [...]

Yes, this is very much true! It's so difficult to find concise, correctly performed benchmarks. Most of what I find is methodologically flawed or simply very lazily executed.
 
This is a very complex topic... [...] The L4 eDRAM cache helps, but it does not solve the problem.

The L4 eDRAM cache was only just introduced and is just 128 MB for now. Within a few years, it will be 1 GB or more.
 