The new H.265 codec is just not established enough to justify the hardware spec update. The box would need far more CPU power and RAM to hold and decode reference frames.
Of course it's established. There are also HW decoders that are power-efficient enough for even Apple to use, as evidenced by the iPhone 6 series release.
Also, consider that if Apple released a 4K-capable ATV, people would assume that iTunes content would be available for it.
What would keep this from happening is content owners not making the content available, even though they've been filming in UHD for many years now. That doesn't mean the HW won't be able to support 2160p@60Hz, even if Apple doesn't enable it until there is content via iTunes, as per Apple's desire to have symmetry.
You seem to think that HEVC/H.265 and 4K UHD have to come at the same time, but they don't. The ideal solution is for universal HW H.265 support to come before 4K UHD content, so that when they flip the switch on their backend, several years of devices will already support the codec.
This may or may not mean Apple will re-encode their other content to use H.265 instead of H.264 (perhaps even adding more audio channels, language support, etc.).
Personally, I think a company like Apple that likes simplicity will use H.265 and UHD content as a demarcation point to make it easier for customers to differentiate. A more complex backend with double the content (for 480p, 720p, and 1080p) would give devices that support H.265 smaller files that download/queue/stream faster and use less space on the device. Maybe 1080p could have this option, because there is no benefit to watching 2160p on your iPhone 6.
Also, even though the iPhone 6 series supports the H.265 codec and there are tests showing 4K playback, albeit with the less efficient H.264 codec, I would wager that when Apple flips the backend switch for 2160p content, the iPhone 6 series isn't likely to be supported because it's too close to the edge.