I would actually be interested in hearing why @macfacts assumed this, if you don't mind?
But the fact is this… Taking a surround sound source and using it as part of "spatial audio" is, to force your comparison into it, like taking stereo sound and NOT converting it to mono. That's it; the information needed is already there, the player just needs to NOT destroy it.
There's nothing special about Apple's spatial audio as far as formats go; instead they are doing exactly what @SpringKid said: they take the existing multi-channel sources and calculate, ridiculously fast, where each channel should appear to originate relative to how your head is positioned.
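To make the "where is each channel relative to your head" part concrete, here is a rough sketch of just the geometry. This is not Apple's code or algorithm; the speaker angles are the commonly cited 5.1 placements (my assumption), the function name is made up, and a real renderer would go on to filter each channel for each ear rather than just report an angle.

```python
# Assumed nominal 5.1 speaker azimuths in degrees.
# 0 = straight ahead of the screen, positive = to the listener's right.
# The LFE channel is left out because bass is not treated as directional here.
SPEAKER_AZIMUTHS = {"L": -30.0, "R": 30.0, "C": 0.0, "Ls": -110.0, "Rs": 110.0}


def relative_azimuths(head_yaw_deg, speakers=SPEAKER_AZIMUTHS):
    """Hypothetical helper: angle of each virtual speaker relative to the head.

    head_yaw_deg is how far the head has turned to the right, in degrees.
    Results are wrapped into the range [-180, 180).
    """
    result = {}
    for name, azimuth in speakers.items():
        # Turning the head right moves every source to the left by the same
        # amount, so subtract the yaw and wrap the result.
        result[name] = (azimuth - head_yaw_deg + 180.0) % 360.0 - 180.0
    return result


if __name__ == "__main__":
    # Head turned 30 degrees to the right: the right speaker is now dead ahead.
    print(relative_azimuths(30.0))
```

The source channels never change; only the angle each one is rendered from does, and that gets recomputed every time the head moves.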