It's what they've done since the iPhone 15. Basically, there are three types of zoom right now: optical (pure hardware), 'optical quality zoom' (what you're referring to) and digital zoom (upscaling).
Digital zoom is when you take, for example, a 12 MP sensor, zoom in digitally and then create an upscaled picture that's again 12 MP. The quality is often mediocre because the upscaling isn't very good: the device literally makes up pixels to get back to that 12 MP.
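A toy sketch of that idea, with a grayscale "sensor" as a nested list and a nearest-neighbour upscale (the function name and method are mine for illustration, not any vendor's actual pipeline):

```python
# Toy digital zoom: crop the centre of the frame, then upscale back to the
# original resolution by repeating pixels. The repeated pixels are the
# "made up" ones -- they carry no extra detail.

def digital_zoom(pixels, factor):
    """Crop the centre 1/factor of the frame, then blow it back up to the
    original size with nearest-neighbour repetition."""
    h, w = len(pixels), len(pixels[0])
    ch, cw = h // factor, w // factor            # size of the cropped region
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = [row[left:left + cw] for row in pixels[top:top + ch]]
    # Every cropped pixel is duplicated factor*factor times on the way out.
    return [[crop[y // factor][x // factor] for x in range(w)]
            for y in range(h)]

frame = [[y * 4 + x for x in range(4)] for y in range(4)]
zoomed = digital_zoom(frame, 2)   # same 4x4 size out, but only 2x2 real pixels
```

Note that the output has the same pixel count as the input even though only a quarter of the sensor data went into it, which is exactly why heavy digital zoom looks soft.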
Optical quality zoom is what Apple does with their 48 MP sensor. When you zoom in, they basically grab the middle 24 MP. They don't upscale it, so no new pixels are 'made up'. You truly get what the sensor is capturing, so in that sense it's optical quality. That's not all, though: they actually downscale to 12 MP. Using their ISP (Image Signal Processor) and some algorithms, they try to improve picture quality even further.
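The two steps can be sketched like this, with simple 2x2 averaging standing in for the downscale (the real ISP does far more sophisticated processing; function names are mine):

```python
# Toy "optical quality" zoom: a centre crop that keeps only genuine sensor
# pixels (nothing invented), followed by a downscale that averages 2x2
# blocks into one cleaner output pixel.

def center_crop(pixels, factor):
    """Keep the middle 1/factor of the frame; every output pixel is a
    real sensor pixel."""
    h, w = len(pixels), len(pixels[0])
    ch, cw = h // factor, w // factor
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in pixels[top:top + ch]]

def downscale_2x(pixels):
    """Average each 2x2 block into a single pixel (simple binning)."""
    return [[(pixels[2 * y][2 * x] + pixels[2 * y][2 * x + 1] +
              pixels[2 * y + 1][2 * x] + pixels[2 * y + 1][2 * x + 1]) / 4
             for x in range(len(pixels[0]) // 2)]
            for y in range(len(pixels) // 2)]

sensor = [[10] * 8 for _ in range(8)]   # tiny stand-in for the 48 MP sensor
crop = center_crop(sensor, 2)           # middle region, real pixels only
final = downscale_2x(crop)              # fewer, averaged (cleaner) pixels
```

Unlike the digital-zoom sketch above, the output here never contains a pixel the sensor didn't actually capture.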
Pure optical zoom involves no up- or downscaling at all, like the 5x lens on the iPhone 16 Pro.
Regardless of the type of zoom, make no mistake: images are always processed by the ISP and its algorithms. That's also absolutely necessary. Let's say you take a picture outside: the ISP is what makes it possible that the sky shows a beautiful blue in HDR while you can still make out the details and faces of people sitting in the shade of a tree in the same picture. Every manufacturer has integrated an ISP, and over the past decade it has arguably contributed more to image quality than upgraded lenses and sensors (although it's obviously an interplay between the two).
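One tiny piece of what such processing does can be shown with a simple gamma tone curve, which lifts dark shadow values far more than bright sky values so both stay readable in one frame (the gamma value is arbitrary and purely illustrative; real ISPs use much more elaborate local tone mapping):

```python
# Toy tone curve: map a linear brightness in [0.0, 1.0] through a gamma
# function that brightens shadows a lot while barely touching highlights.

def tone_map(value, gamma=0.5):
    """Apply a simple gamma curve to a normalized brightness value."""
    return value ** gamma

shadow, sky = 0.04, 0.81
lifted_shadow = tone_map(shadow)   # shadows boosted strongly (5x here)
lifted_sky = tone_map(sky)         # highlights only mildly compressed
```

The point is the ratio: the dark value gains much more than the bright one, which is how faces in the shade become visible without blowing out the sky.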