This thread will summarize the enormous effort and complex steps taken by Intel and manufacturers to get the power savings Haswell will achieve.
(IMPORTANT!!: Most of the power-saving techniques apply only to Ultrabooks, meaning U-series and Y-series chips)
A few years ago, information about a future chip called Haswell was released, and one of the statements was that it had REVOLUTIONARY power management.
3 steps taken to save power in Haswell(ULT/Y):
1. New super low CPU and Package C-states, C8/C9/C10, only for (ULT/Y)
2. An Intel-created framework called Power Optimizer to manage interrupts between devices
3. Collaboration with numerous hardware vendors to achieve lower power, and enable low power states
#1 explained: When current CPUs are said to be "idle", it really means only the critical parts are idle. Modern chips like Ivy Bridge and Haswell don't contain only CPU cores. On Haswell, there are the CPU cores, the GPU subsystem, the L3 cache, the memory controller, and the System Agent (Power Control Unit or PCU, I/O connections, router), all connected by the Ring Bus.
In Ivy Bridge, basically only the CPU and GPU can go idle. They'll consume very little power (milliwatts), but the rest of the chip stays on. The reason? Various devices in the computer and I/O have to wake up the chip once in a while, which means part of the chip has to be "ready". All Haswell chips decouple the Ring Bus and L3 cache from the cores, so the cores can stay asleep when, for example, the GPU needs the ring bus. In Haswell ULT, C8/C9/C10 allow it to turn off everything.
#2 will explain how #1 is done. Contrary to what most people think, the lowest power state on the 17W Ivy Bridge CPU still draws 2.2W. That is in the C7 power state.
Basically, the software/firmware/OS handles interrupts in "bursts", saving them up and servicing them all at once rather than waking up the CPU very often for just one device. It's called "Interrupt Coalescing". Interrupts from every device are handled at the same time if possible. Intel created a range of specifications and hardware called Power Optimizer to achieve this. Every device is required to follow "LTR", or Latency Tolerance Reporting. Basically, each device tells the Power Optimizer how long it can wait, which determines how long the platform can sleep until the next interrupt. Every device really means every device: touchscreen controllers, keyboard controllers, CPU, GPU, PCH, sensors (GPS, NFC, cameras, etc.), system memory, hard drive, PCI Express, USB 2/3, etc.
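To make the coalescing idea concrete, here is a rough Python sketch. It is NOT Intel's actual Power Optimizer logic, just a toy model under my own assumptions: each device request carries an LTR-style tolerance (how long it says it can wait), and the platform greedily batches requests so nobody waits past their tolerance. The device names, timings, and the `Event`/`coalesce` names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    device: str          # hypothetical device name
    time_ms: float       # when the device raises its request
    tolerance_ms: float  # how long the device says it can wait (LTR-style value)


def coalesce(events):
    """Group events into wakeups so that no event waits past its tolerance.

    Greedy rule: open a batch at the earliest pending event and keep adding
    later events as long as they arrive before the tightest deadline in the
    batch. The wakeup happens at that tightest deadline.
    Returns a list of (wake_time_ms, [device names]).
    """
    wakeups = []
    batch = []
    deadline = None
    for ev in sorted(events, key=lambda e: e.time_ms):
        ev_deadline = ev.time_ms + ev.tolerance_ms
        if batch and ev.time_ms <= deadline:
            # Event arrives before the planned wakeup: service it in the same burst.
            batch.append(ev)
            deadline = min(deadline, ev_deadline)
        else:
            # Event arrives after the planned wakeup: close the batch, start a new one.
            if batch:
                wakeups.append((deadline, [e.device for e in batch]))
            batch = [ev]
            deadline = ev_deadline
    if batch:
        wakeups.append((deadline, [e.device for e in batch]))
    return wakeups


# Made-up example: five requests from five devices.
EXAMPLE_EVENTS = [
    Event("touch", 0.0, 10.0),
    Event("usb", 4.0, 20.0),
    Event("sata", 9.0, 3.0),
    Event("nic", 30.0, 5.0),
    Event("sensor", 33.0, 1.0),
]

if __name__ == "__main__":
    for wake_time, devices in coalesce(EXAMPLE_EVENTS):
        print(f"wake at {wake_time} ms for {devices}")
```

With these made-up numbers, five separate requests collapse into two wakeups, so the chip gets two long idle stretches instead of five short ones.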
Even the Operating System takes part, and this is where Windows 8 comes in. Windows 7 periodically polled for interrupts; Windows 8 takes that away and polls only when a device needs it. Because the time between interrupts is longer, the CPU (and the rest of the devices) can go into deeper power states. This is so important because going in and out of different C-states actually takes time. Frequent transitions may even cause the chip to use more power.
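Here is a quick Python sketch of why frequent transitions can cost more than they save. The model is my own simplification: a state costs its idle power times the idle time, plus a fixed energy penalty for entering and leaving it. The two states and all the wattage/joule figures are placeholders I picked to show the shape of the trade-off, not measured Haswell numbers.

```python
# (name, idle power in watts, energy in joules for one enter+exit round trip)
# All figures are made-up placeholders, not real C-state data.
STATES = [
    ("shallow", 1.0, 0.0001),
    ("deep", 0.1, 0.0100),
]


def idle_energy(power_w, transition_j, idle_s):
    """Energy spent sitting in a state for idle_s seconds, including the round trip."""
    return power_w * idle_s + transition_j


def best_state(idle_s):
    """Name of the state that uses the least energy for an idle period of idle_s seconds."""
    return min(STATES, key=lambda s: idle_energy(s[1], s[2], idle_s))[0]


def break_even_s(shallow, deep):
    """Idle length at which the deep state starts to pay off against the shallow one."""
    return (deep[2] - shallow[2]) / (shallow[1] - deep[1])


if __name__ == "__main__":
    for idle_s in (0.001, 0.1):
        print(f"idle for {idle_s} s -> best state: {best_state(idle_s)}")
    print(f"break-even idle time: {break_even_s(STATES[0], STATES[1]):.4f} s")
```

In this toy model the deep state only wins once the idle period is longer than the break-even time (about 11 ms with these placeholder numbers). If interrupts arrive more often than that, diving into the deep state wastes energy, which is exactly why stretching the time between interrupts matters.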
#3: Devices get new power states as well. Storage subsystems like SATA SSDs get Runtime D3, which is effectively "off". You have intermediate states like Slumber, which wakes up faster and uses considerably more power, but still much less than traditional SATA sleep. Again, ALL devices get more, and lower, power states. The power delivery system will get better as well, with much better efficiency in the low power region.
A smaller part of the power reduction comes from the integrated Voltage Regulator, which will make switching between states and frequencies faster, and from a TDP that goes from the current 17W + 3-3.6W PCH to 15W (and from 13W + 3W to 11.5W on Y). There's also Panel Self Refresh (PSR), which allows the display to be refreshed without requiring a signal from the platform. That will save power when the display isn't changing much.
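For anyone who wants the TDP numbers above spelled out, here is the arithmetic as a tiny Python sketch. The function names are mine, and keep in mind TDP is a thermal design figure, not measured power draw, so treat the percentages as a rough comparison only.

```python
def combined_tdp(cpu_w, pch_w):
    """Total TDP of a two-chip platform: CPU plus separate PCH."""
    return cpu_w + pch_w


def reduction_pct(old_w, new_w):
    """Percentage drop going from old_w to new_w."""
    return (old_w - new_w) / old_w * 100


if __name__ == "__main__":
    # U-series: 17W CPU + 3W to 3.6W PCH, versus 15W for the Haswell part.
    for pch_w in (3.0, 3.6):
        old = combined_tdp(17.0, pch_w)
        print(f"U: {old:.1f}W -> 15.0W ({reduction_pct(old, 15.0):.1f}% lower)")
    # Y-series: 13W CPU + 3W PCH, versus 11.5W for the Haswell part.
    old_y = combined_tdp(13.0, 3.0)
    print(f"Y: {old_y:.1f}W -> 11.5W ({reduction_pct(old_y, 11.5):.1f}% lower)")
```

So on paper the platform TDP drops by roughly a quarter on both U and Y.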