It's both. If Apple manages to make a powerful AI chip that enables Siri to do the processing locally on the iPhone/iPad/Mac, then Siri doesn't have to send everything off to a server to have the voice decoded there. It's a data issue (privacy policy)
It's not a hardware issue
The problem is probably not constructing such a chip, but making a chip that doesn't deplete the iPhone's battery in 10 minutes. I'm a developer, so I watched the sessions from last year's WWDC (not only the keynote, like most people). There was more detailed information on the AI image processing that tags images with metadata. When you take a photo with the iPhone, it makes more than 11 billion calculations to do this tagging, so that you can search for your cat and dog images without having to tag them manually. 11 billion calculations take a lot of processing power. Powerful AI demands many heavy calculations like this, and that is why Siri currently sends the sound off to a server to be decoded. Apple also collects (anonymised) data to make Siri better. I guess such information will still be sent to Apple, but Siri won't depend on sending audio to the server for decoding in the future. I'm sure it'll be handled locally on the device.