Audio processing, especially in realtime, is very CPU intensive. Bear in mind that a single small processing plugin might take up a whole 1% of your CPU power, and then you want to use it on 120 tracks... alongside other plugins that use more, or lots of others that use less. The alternative is to render the audio offline, which takes valuable time and only gets quicker with faster CPUs. The more cores the better, as decent audio software is written to take advantage of every core it can get its hands on and will spread the load as far and wide as it can (so turbo cycles aren't a great deal of use, which is one of the reasons dual-Xeon systems still compare well to the newer and supposedly faster i7 rigs for this sort of thing).
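To put rough numbers on the plugin example, here is a back-of-envelope sketch in Python. The 1% per instance and the 120 tracks are the figures from the paragraph above; the function name and the assumption that cost scales linearly per instance are mine, purely for illustration. Real hosts won't behave this neatly.

```python
def total_plugin_load(per_instance_pct, instances):
    """Total CPU load in percent.

    Assumption (not from any real DAW): every instance costs the same
    and the cost simply adds up.
    """
    return per_instance_pct * instances


# One small plugin at 1% of total CPU, inserted on 120 tracks.
load = total_plugin_load(1.0, 120)

# Anything over 100% cannot be processed in realtime on this machine.
fits_realtime = load <= 100.0

print(f"Total load: {load:.0f}% of CPU, fits in realtime: {fits_realtime}")
```

Under those assumptions a single "cheap" plugin already overshoots the machine before anything else is loaded, which is why the choice ends up being more cores or rendering offline.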
Additionally, it takes processor cycles to run the audio engine itself, and also to support the audio interfaces. Running a stereo or even 5.1 output to speakers for games and music is child's play. Running 384 channels of digital audio in and out of the computer, synchronised to HD video with sample-level accuracy (for HD, that'd be 48,000 Hz at 24 fps / 24-48 sub frames, or 96 kHz if you're downsampling to 48 for Blu-ray or keeping it at 96 for SACD mastering)... that is a big, heavy load.
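For a sense of scale, the sketch below works out how many samples fall inside one video frame and how much raw audio data 384 channels represent per second, in one direction. The channel count, the 48,000 Hz rate and 24 fps come from the paragraph above; the 24-bit sample size (3 bytes) is my assumption, as are the function names. It ignores driver and engine overhead entirely.

```python
def samples_per_video_frame(sample_rate_hz, fps):
    """Audio samples that elapse during one video frame."""
    return sample_rate_hz / fps


def raw_rate_bytes_per_sec(channels, sample_rate_hz, bytes_per_sample):
    """Uncompressed audio data rate for one direction (in OR out)."""
    return channels * sample_rate_hz * bytes_per_sample


# Figures from the post: 48 kHz audio against 24 fps picture.
spf = samples_per_video_frame(48_000, 24)

# ASSUMPTION: 24-bit samples, i.e. 3 bytes each (not stated in the post).
rate = raw_rate_bytes_per_sec(384, 48_000, 3)

print(f"{spf:.0f} samples per video frame")
print(f"{rate / 1_000_000:.1f} MB/s per direction")
```

With those numbers there are 2,000 samples to keep aligned inside every frame, and roughly 55 MB of raw audio moving each way every second, before any plugin has touched it.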
My main audio machine is a 12-core Mac Pro with 64 GB of RAM, and I have two i7 PC slaves with 24 GB of RAM each sat alongside it, so when the Mac Pro is out of power I can offload work to those and keep going. Mine is a small facility, but a usual cue can easily run to 80% CPU usage without really trying too hard. I have only 128 channels coming in and out, not the 384 in the example above, yet my machine sits at around 8-10% CPU usage even at idle.