Well, you'll need a webcam for video conferencing - and many of those have perfectly good built-in microphones. If you've got a USB-C display then it's one lead from the Studio to the display and one lead from the display's downstream USB to the webcam. There are also plenty of USB-C, DisplayPort and HDMI displays with built-in speakers that are good enough for video conferencing or hearing voices on YouTube. But, yeah, the iMac (and presumably the Studio Display) is the only thing I've seen with speakers built into a display that you'd want to listen to music etc. on.
If we're talking about a computer as powerful/expensive as the Mac Studio or the old, higher-end Intel iMacs, a lot of buyers will want them for prosumer video/audio. While the iMac speakers and mics may be really impressive for something built into a display, that's a very, very low bar, and a half-decent pair of external speakers or a desktop mic will blow them out of the water. If you listen to a lot of music or watch TV/movies, you'll often want something better. If you do any amount of video/audio production beyond casual YouTubing then you'll pretty much need monitor speakers, the right microphone for the job (properly isolated from the desk) and probably an external audio interface to connect them.
I agree that the Studio Display is expensive and lacks some features you'd expect at that price (a decent stand, detachable mains lead, additional video inputs, after-market VESA mounting), but it's worth remembering that a Mac Studio + Studio Display costs about the same as the old i9 iMac with 32GB of RAM, which would have been the comparable system. Personally, I welcome the ability to have a pair of matched 4K displays for less than the price of a single 5K Studio Display.