On a technical level, I’m not so sure how plausible that is. Remember, the Mac apps aren’t running on Vision Pro and have no conception whatsoever of what Vision Pro is. From the app’s perspective (and from SystemUIServer’s or LoginWindow’s perspective), it’s just outputting to a display. It has no idea (or maybe only a minimal idea) that the display is Vision Pro; it might as well be an HDMI connection.
In order to, say, display Finder windows in Vision Pro, WindowManager (macOS’s answer to, say, twm, or KWin as used by KDE Plasma 5 on Linux) would have to be able to pass its individual windows as objects to Vision Pro, which would then draw them on its own display. I’m not sure it could do that; it seems to have largely been designed for displaying windows on a locally connected output device.
Apple has done somewhat similar things before. The closest analogy in the Apple world is either the Touch Bar or first-gen Apple Watch apps, where the UI executes on a separate device from the device running the backend code*. But that was strictly an opt-in thing for macOS (or iOS) applications. It’s likely that macOS support for Vision Pro would work something along those lines, and I’m not sure how many macOS-exclusive developers are out there these days (if they’ve got the same app on, say, the iPad, they might as well just make a native Vision Pro app).
Windows/apps don't need to be constrained to a boxed display/window at all, especially with spatial computing. Let Mac apps and windows roam free like all other Vision Pro apps. Either that or just allow resizable aspect ratios.
X11 was designed primarily as a networked window system. Its windows were meant to be drawn on what amounts to an updated glass terminal, with drawing requests and input events passed over the network between that terminal and the machine hosting the applications. (Strictly speaking, X11 calls the display side the “server” and the applications the “clients”, which is the reverse of what you’d expect.) In other words, X11 GUIs were intended to be displayed on separate machines from the ones running the applications, which incidentally is how the X11 apps on Android (or Mocha X11 on iOS, for that matter) work to this day. WindowManager was probably designed to work on the local machine without any sort of remote output capability.
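To make the difference concrete, here’s a toy sketch of the networked model in Python. It is not the real X11 protocol, just an illustration of the idea that the application sends a description of a window over a connection and the display side decides how to draw it. The message format and the function names (`app_send_window`, `display_receive`) are made up for this example, and `socket.socketpair()` stands in for the network link between the two machines.

```python
import json
import socket


def app_send_window(conn, title, width, height):
    """Application side: describe a window as a message instead of drawing pixels locally."""
    # Hypothetical message format, invented for this sketch (not the X11 wire protocol).
    request = {"op": "create_window", "title": title, "width": width, "height": height}
    conn.sendall((json.dumps(request) + "\n").encode("utf-8"))


def display_receive(conn):
    """Display side: read one request and decide locally how to present the window."""
    with conn.makefile("r", encoding="utf-8") as stream:
        request = json.loads(stream.readline())
    return f"drawing '{request['title']}' at {request['width']}x{request['height']}"


# socketpair() stands in for the network link between the app host and the display.
app_end, display_end = socket.socketpair()
app_send_window(app_end, "Finder", 800, 600)
result = display_receive(display_end)
print(result)
app_end.close()
display_end.close()
```

The point is that the application never touches the display’s framebuffer. A window system built around a locally attached output has no equivalent of that message step, which is what would have to be added for individual Mac windows to show up as separate objects on another device.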
* On Macs equipped with a Touch Bar, the Touch Bar itself is basically an embedded Apple Watch, with its own separate S or T series chip (I can’t remember the exact details) as its processor, while the application providing the functionality runs on the Mac’s main processor, either an Intel x86-64 chip or an Apple Silicon ARM chip.