And the raw number of execution units in a GPU shouldn't matter anyway. If we go by that logic, the 650M with 384 shader units should completely demolish the HD 5200 with only 40 units.
Since Intel's EUs are 16-wide vector units, one would have to compare 384 vs. 640 (40 × 16) to make a somewhat fair comparison.
Nvidia is better at keeping its units loaded at maximum efficiency, but it has fewer of them and they don't even clock as high under turbo. So by that simplistic unit-counting logic, the Iris is the more powerful one.
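To put rough numbers on that comparison, here is a back-of-the-envelope peak-throughput sketch (assuming the commonly quoted boost/turbo clocks for the MBP's 650M and the Iris Pro; real performance depends on how well each architecture keeps its units fed):

```python
# Back-of-the-envelope peak FP32 throughput (GFLOPS).
# Clock speeds are the commonly quoted boost/turbo figures -- treat them as assumptions.

def peak_gflops(units, flops_per_unit_per_clock, clock_ghz):
    """Theoretical peak = units x FLOPs per unit per clock x clock (GHz)."""
    return units * flops_per_unit_per_clock * clock_ghz

# GT 650M (MBP config): 384 CUDA cores, 2 FLOPs/core/clock (FMA), ~0.9 GHz boost
gt650m = peak_gflops(384, 2, 0.9)

# Iris Pro 5200: 40 EUs, 16 FLOPs/EU/clock (2x 4-wide ALUs with MAD), ~1.3 GHz turbo
iris_pro = peak_gflops(40, 16, 1.3)

print(f"GT 650M:       ~{gt650m:.0f} GFLOPS")   # ~691
print(f"Iris Pro 5200: ~{iris_pro:.0f} GFLOPS") # ~832
```

On paper the two land in the same ballpark, which is the whole point: unit counts alone tell you very little.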
Adding some nearby cache like that L4 is mostly a power-saving feature. It is simply the cheapest way to handle the bandwidth requirements with minimal power use. It is not supposed to be as big as a full 2 GB of VRAM; it only needs to be large enough to reduce the load on the two 64-bit DDR3 channels. There is a reason smartphone GPUs have long worked this way: keeping data close saves a lot of power.
Intel seems to like that solution so much that DDR4 doesn't even seem to make it onto the Broadwell roadmap.
They obviously cannot use GDDR5 like the PS4 does, because that would hurt CPU performance: GDDR is optimized for bandwidth at the expense of latency. Game developers may be willing to program around that problem, but for everyday desktop applications it wouldn't be good.
Putting that L4 in place adds a lot of bandwidth for a third of the power cost of any alternative.
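To put a rough number on the bandwidth gap the eDRAM is meant to close, here is a quick sketch using the commonly quoted peak figures (treat the exact values as assumptions, not measurements):

```python
# Rough peak-bandwidth comparison: dual-channel DDR3 vs the on-package eDRAM (L4).
# Figures are the commonly quoted peak numbers, not measurements.

def ddr_bandwidth_gbs(channels, bus_width_bits, transfer_rate_mts):
    """Peak GB/s = channels x bus width (bytes) x transfers per second."""
    return channels * (bus_width_bits / 8) * transfer_rate_mts / 1000

ddr3 = ddr_bandwidth_gbs(channels=2, bus_width_bits=64, transfer_rate_mts=1600)
edram = 50 + 50  # ~50 GB/s read + ~50 GB/s write quoted for the 128 MB eDRAM

print(f"Dual-channel DDR3-1600: ~{ddr3:.1f} GB/s")        # ~25.6 GB/s
print(f"Crystalwell eDRAM:      ~{edram} GB/s aggregate")  # ~100 GB/s
```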
The point of all this is that iGPUs simply have more options for squeezing out power efficiency. Today they offer similar to somewhat lower performance, but each generation it will get harder for Nvidia to keep up. They will be pushed more and more into the higher TDP classes to make a case for their GPUs.
The 2010 MBP was basically a 73 W notebook. Now we are at 100 W. With Haswell it will go down to about 60-70 W (10 W screen, plus Turbo). A dGPU that adds 40 W to that budget should deliver more than just somewhat faster performance, especially for Apple with their poor automatic switching that kicks in unnecessarily far too often.
If one complains about OpenGL performance, I would really look more at OS X performance. Drivers are a big part of that, and Apple will probably try to make sure there isn't a big difference (compared to the 650M) to complain about.
People in this thread are comparing the GMA 950, which is light-years behind even an HD 3000 in performance. GPU performance comes down to four variables:
- How many transistors are dedicated to the GPU (a huge 500 mm² GPU is usually a lot faster than a much smaller one).
- Bandwidth (all that processing power is no use if it can't be fed)
- Architecture
- Drivers
- The transistor budget for the GPU has changed a lot. iGPUs used to be tiny, built on older processes. AMD's APUs were the first to devote almost half the die to the GPU; Intel is now at the same level.
- Bandwidth is only an issue for the faster GPUs. Intel added the L4 cache to break through the limits of slow DDR3.
- The architecture has changed a lot. About 5-7 years ago Intel started hiring lots of GPU experts. It takes a while to see results, but this new GPU architecture was built by people who used to work at Nvidia/AMD/others. It has nothing in common with the old GMA-style afterthought of a GPU.
- Drivers have also picked up, and they don't matter as much on OS X anyway. The HD 4600 does significantly better in games than the HD 4000, while the optimized synthetic benchmarks are more in line with the small increase in EU count, so they obviously have people working on this. It's not at the level of Nvidia, which even helps game developers during development, but on Intel's end it is not as bad as it used to be.
The HD 5200 will definitely not be as fast as the current generation of dedicated alternatives, but the real question is whether that performance difference justifies the power difference. Nobody complained about the 6750M being crap, and the Iris Pro will definitely be better than that.
I think this generation, a notebook would need a 765M or faster to show a big enough performance difference to really be worth it.