Those benchmarks are impressive (a little surprising too), but they don't tell the whole story. Seeing the Surface Pro 3 above the SP2 already demonstrates the problem: short benchmarks like these don't take thermal throttling into account. In prolonged tests (more than ~5 minutes), the SP3 becomes significantly slower than an SP2.
This 4.5W (6W as tested) TDP chip is going to have to throttle even more quickly and aggressively because of its cooling situation. In the real world it is going to be a worse sustained performer.
Of course, now that we've seen what the CPU is capable of in burst performance (better than the 4300U), maybe that's an acceptable trade if you're Apple, considering how the MBA line is typically used. I guess the normal MBA user isn't doing anything that involves sustained heavy CPU use, so maybe I'll need to rethink my opinion.