Foveated rendering means rendering the image at a higher resolution where the eye is focused, and is not about the physical or visual positions of the hardware pixels. 4000 pixels across isn't enough to cover the whole directly visible FOV at Retina resolution.

The Vision Pro uses foveated rendering, meaning that the pixel density is not uniform and is instead higher in the central foveal area than in the periphery, matching the eye's visual acuity. In this paper, 24 PPD was achieved at the center using a 1440-pixel display with a pancake lens. Scaling linearly with pixel count, that would translate to about 67 PPD for a 4000-pixel panel, which meets Apple's definition of a Retina display (57 PPD and up) as well as the conventional definition of 20/20 vision (60 PPD).
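For anyone who wants to check the 67 PPD figure, here is a quick sketch of the arithmetic. It assumes central PPD scales linearly with horizontal pixel count when the optics and field of view stay the same, which is a simplification; the function name `scaled_ppd` is just made up for illustration.

```python
def scaled_ppd(measured_ppd: float, measured_pixels: int, target_pixels: int) -> float:
    """Scale central pixels-per-degree (PPD) to a different panel width.

    Assumption: same lens and field of view, so PPD is proportional
    to the number of pixels across the panel.
    """
    return measured_ppd * target_pixels / measured_pixels


# Figures quoted in the comment: 24 PPD at the center on a 1440-pixel display.
ppd = scaled_ppd(24, 1440, 4000)
print(round(ppd))            # 67
print(ppd >= 57, ppd >= 60)  # True True (Retina threshold, 20/20 threshold)
```

The exact value is 66.67 PPD, so the result clears both thresholds only if the linear-scaling assumption roughly holds for the actual lens.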
The only other factor I could think of that would make 4000 VR pixels better than 4000 monitor pixels is that your head is always moving subtly, so there could be a temporal element where your brain fuses multiple frames together to infer a higher resolution.