All a malicious actor has to do is apply the colliding noise as a mask over legal pornographic material, and the user is screwed. Apple's human reviewer isn't going to do a full CSI-style analysis to determine whether the people depicted in the photo (or their body parts) were underage at the time it was taken.
They'll ask themselves one question: could this be CSAM? If the answer is yes, the account gets blocked and a report is filed.
As for his statement that "it would require the production of over 30 colliding images", that's just intellectually dishonest. They don't have to be 30 unique images; they could be 30 copies of the same one. And even if unique images were required, generating a colliding image is trivial in both effort and time, as has already been demonstrated (the sketch below shows the general idea), and applying that collision to legal porn is even less of a feat.
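For anyone wondering what "trivial" means in practice, here's a minimal sketch of the attack under stated assumptions: the attacker has a differentiable stand-in for the perceptual-hash network (the `TinyHashNet` below is a toy placeholder I made up, not Apple's NeuralHash), and simply runs gradient descent on a small perturbation of a benign image until its hash bits match a target hash. This is not Apple's code or the published collision tooling, just an illustration of why the technique is cheap.

```python
# Sketch only: TinyHashNet is a made-up stand-in for a perceptual-hash model.
import torch
import torch.nn as nn


class TinyHashNet(nn.Module):
    """Toy placeholder for a perceptual-hash network (NOT the real NeuralHash)."""
    def __init__(self, bits=96):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(32, bits)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.proj(h)  # real-valued logits; sign() gives the hash bits


def collide(benign_img, target_bits, model, steps=500, eps=8 / 255, lr=1e-2):
    """Nudge benign_img so that sign(model(img)) matches target_bits (values +/-1)."""
    delta = torch.zeros_like(benign_img, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model((benign_img + delta).clamp(0, 1))
        # Hinge-style loss: push every logit to the correct side of zero.
        loss = torch.relu(0.1 - target_bits * logits).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the perturbation visually small (L-infinity ball of radius eps).
        with torch.no_grad():
            delta.clamp_(-eps, eps)
    return (benign_img + delta).clamp(0, 1).detach()


if __name__ == "__main__":
    model = TinyHashNet().eval()
    benign = torch.rand(1, 3, 224, 224)  # stands in for any legal image
    target = torch.sign(model(torch.rand(1, 3, 224, 224))).detach()  # hash to collide with
    adv = collide(benign, target, model)
    matched = (torch.sign(model(adv)) == target).float().mean().item()
    print(f"fraction of hash bits matched: {matched:.2f}")
```

The published NeuralHash collisions used essentially this kind of optimization against the extracted model, and the perturbation budget keeps the result looking like the original image, which is exactly why layering it over legal porn is so easy.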
He also said "until they implement a filter", but Apple already has such a filter in place as part of the system.