You also have to consider that the m-series of Intel processors is still very young and is quickly climbing the initial performance curve through successive optimizations.
The dynamic power dissipated is roughly the total effective junction capacitance times the square of the CPU voltage times the operating frequency. You cannot get around the fact that instructions processed per second is roughly proportional to the number of transistors switching times the operating frequency. This means that to get more processing done, you need to pump more current. With each node-size decrease you lower the effective junction capacitance, so less current is needed per instruction processed. With better binning you reduce the operating voltage. Both of these cut the total power per instruction, i.e. improve performance per watt.
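To make that relationship concrete, here is a minimal sketch of the dynamic-power formula (P ≈ C·V²·f). The capacitance, voltage, and frequency values are made-up illustrative numbers, not specs for any real chip:

```python
def dynamic_power(c_eff_farads, voltage, freq_hz):
    """Dynamic switching power in watts: C_eff * V^2 * f."""
    return c_eff_farads * voltage**2 * freq_hz

# Hypothetical baseline part: 1 nF effective capacitance, 1.2 V, 3 GHz
base = dynamic_power(1e-9, 1.2, 3e9)      # 4.32 W

# A node shrink trims C_eff by ~30%; better binning allows 1.0 V
shrunk = dynamic_power(0.7e-9, 1.0, 3e9)  # 2.10 W

print(f"baseline: {base:.2f} W, shrunk + binned: {shrunk:.2f} W")
print(f"power reduction at the same clock: {100 * (1 - shrunk / base):.0f}%")
```

Note that because voltage enters squared, the binning-driven voltage drop contributes more of the savings here than the capacitance reduction does, which is why aggressive voltage binning matters so much for performance per watt.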
No matter what bin you choose, you pay for the grade. The bins that run at lower voltage for the m-series are probably also binned for least leakage. The i7s of the 45-watt variety come from higher bins as well, but those are usually binned for clock speed at some voltage target. Higher bins mean different things for different form factors.
That said, perceived performance is a balance of load and capability. To project future perceived performance, you would need to understand where software load is headed as well as where processing capability is headed. The latter is the easier guess: a few percent from architecture optimizations, a few percent from clock-speed increases, and maybe some bigger gains from new instructions like AVX2. The former is much more difficult to assess. The biggest thing on the horizon is AI, and that is going to necessitate a whole different kind of thinking. For one, when AIs start writing efficient code (and they are working on this), who knows the fate of the consumer developer? This is coming far faster than we can adapt.