I suspect you're missing one of the big concerns: it's not just this specific implementation that's the problem, although there has been some evidence to suggest it is, or could become, problematic. It's also the precedent of doing any on-device scanning for the purpose of ferreting out illegal activity.
If doing on-device scanning for CSAM is OK, then why not on-device scanning for prohibited <name your thing>? Weapons? Political gatherings? If scanning images that people plan to upload to cloud storage is OK, then why not scan images regardless of whether they're to be uploaded anywhere? If scanning for image matches is OK, then why not scanning for "hate speech"? (Some countries do have "hate speech" laws, and there are people, right here in "the land of the free," who would like to see them here, too.) Or planned protests? Or...?
Yes, this is a slippery slope argument. But that doesn't
necessarily make it fallacious, as some here are wont to claim.
Bottom line: many people, including numerous security researchers and privacy advocates, feel this crosses a line that should not be crossed. I agree. Emphatically.
Besides: viscerally, having some kind of scanner on my devices that is not under my direct control is... icky.