For the same reasons pro equipment has always been of higher quality than typical consumer gear, regardless of the field.
1) Minimize cumulative degradation during the production process. A little dirt here, a little dirt there, and soon you have a real mess. (The cumulative effect of multiple out-of-tolerance conditions has to remain within the desired final quality.) Every digital processing stage - gain change, EQ, mixdown - adds its own rounding error, which is why tracking and mixing happen at higher bit depths than the delivery format; there's a quick back-of-the-envelope sketch after point 2 below. Per Wikipedia (http://en.wikipedia.org/wiki/Audio_bit_depth): "The fact that such additional precision is needed does not necessarily mean that the difference can be discerned by humans in an A/B comparison of two clean tracks."
Not that it's an issue in this discussion, but it also means studio environments that are sufficiently isolated from noise sources (traffic rumble transmitted through the building structure, velocity noise from the HVAC ducts, humming fluorescent light ballasts, chairs that don't squeak beneath the butts of the string section...). There's no living room as noise-free as a well-built studio, but if we're just laying down a handful of acoustic tracks, close miking in the living room has been good enough for many commercial releases.
2) Marketing. You can't charge pro rates unless you have gear to match. An iPhone 6 on a $30 tripod may be good enough to shoot the job, but don't let the client see it.
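Here's the sketch promised in point 1. It's a minimal back-of-the-envelope calculation of my own (the function name and stage counts are mine, not from any standard), using the textbook model where each independent rounding stage contributes noise power of one LSB squared over 12, so the noise powers simply add across stages:

    # Rough sketch: how ideal SNR erodes as independent quantization
    # stages accumulate. Assumes the textbook model where each stage
    # adds independent noise of power q^2/12 (q = one LSB), so total
    # noise power scales linearly with the number of stages.
    import math

    def snr_db(bits: int, stages: int = 1) -> float:
        """Ideal SNR of a full-scale sine after `stages` independent
        quantizations to `bits` bits: 6.02*bits + 1.76 - 10*log10(stages)."""
        return 6.02 * bits + 1.76 - 10 * math.log10(stages)

    for bits in (16, 24):
        for stages in (1, 10, 100):
            print(f"{bits}-bit, {stages:3d} stages: {snr_db(bits, stages):6.1f} dB SNR")

Even after 100 independent rounding stages, the 24-bit chain's noise floor sits about 28 dB below a single 16-bit quantization - which is the whole argument for 24-bit production feeding a 16-bit release.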
I've been witness to 40 years of the "How good is good enough?" debate over digital audio, video, and still photography. In the beginning it was over the difference between 44.1 kHz and 48 kHz at 16 bits, linear. And, of course, over how much "better" black vinyl sounded - I'd cringe at how bad a pressing sounded compared to my digital master, while the "golden ear" beside me waxed rhapsodic over the "warmth" of the disc (let's see... additional compression to prevent cutter-head over-excursion, added bass from rumble, intermod, and phono-cartridge hum pickup... yeah, definitely a fatter bottom end). And don't get me started on data destruction (lossy "compression")...
Owning high-end equipment does not magically make the owner capable of discerning the difference. I'd wager that, for at least 90% of them, it's "The Emperor's New Clothes." Musicians, engineers, and producers who can most certainly detect off-pitch, off-beat performances and instantly tell a fresh set of bronze-wound strings from a set strung a few days earlier have a level of ear training that few others will ever attain. And even they (thanks to inevitable hearing loss, I'm no longer one of them) can be fooled into hearing a difference simply because they know the makes and models of the equipment or instruments, or the identity of the performers. There are countless double-blind tests that prove it.
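For the curious, this is roughly how those double-blind (ABX) results get scored - a sketch of my own, not any particular test's protocol. If the listener is guessing, correct answers follow a fair-coin binomial distribution, and only a run well above chance counts as actually hearing a difference:

    # Scoring an ABX double-blind listening test. Under the null
    # hypothesis "the listener is guessing", correct answers are
    # Binomial(trials, 0.5); a small one-sided p-value is evidence
    # the difference is genuinely audible.
    from math import comb

    def abx_p_value(correct: int, trials: int) -> float:
        """P(at least `correct` right out of `trials`) by pure guessing."""
        return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

    print(abx_p_value(12, 16))  # 12/16 correct -> p ~= 0.038
    print(abx_p_value(9, 16))   # 9/16 correct  -> p ~= 0.40 (chance level)

Twelve right out of sixteen clears the usual 5% convention; nine out of sixteen is indistinguishable from coin-flipping, no matter how golden the ears.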
The brain can be trained to discern the slightest variations captured by our eyes and ears, but once the signal falls outside the capabilities of those organs, the brain starts filling in the gaps. And the brain can't always be trusted.
Inevitably, the cost of technology drops, and today's impractical (say, 24-bit/192 kHz linear audio streamed over a cellular network and/or maintained in flash storage on a mobile device) will become practical. But will that necessarily be the best use for the bandwidth? John Woram (speaking at an AES convention back around 1975) cited "Foobini's Law" (not to be confused with Fubini's Law): "Not everything that can be done should be done."
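To put numbers on that bandwidth question - a quick arithmetic sketch, with format labels of my own choosing, assuming plain uncompressed stereo PCM:

    # Raw bit rate of uncompressed stereo PCM at a few format choices:
    # sample rate x bits per sample x channels.
    def pcm_kbps(sample_rate_hz: int, bits: int, channels: int = 2) -> float:
        return sample_rate_hz * bits * channels / 1000

    for label, rate, bits in [("CD (16/44.1)", 44_100, 16),
                              ("Studio (24/48)", 48_000, 24),
                              ("Hi-res (24/192)", 192_000, 24)]:
        print(f"{label}: {pcm_kbps(rate, bits):,.0f} kbps")

That's roughly 9.2 Mbps for 24/192 versus 1.4 Mbps for CD quality - more than six times the data for a difference that, per the double-blind results above, almost nobody can hear.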