Nowhere do they say it's Hamming distance, which would be a pretty poor implementation of this -- they're likely doing fuzzy hashing on the embedded vectors.
Sort of. Technically, it’s the Hamming distance. The difference between nearly identical and merely similar is most likely a few bits at most. The threshold is entirely configurable by Apple, and the matching is not likely to be very accurate, based on:
1) Mobile device processing/battery limitations. Apple has a good neural engine, but it’s still edge processing.
2) The fact that they need to open it up to 30 matches to meet their false positive requirements. I ran some quick probability estimates, but there are still too many assumptions, and statistical estimation of perceptual hashes is not my area of expertise, so I’m likely missing something. I’d like to see the real math direct from Apple, but the performance does not inspire confidence.
3) Because of the backlash and multiple AI experts calling, most likely, BS on the 1 in a trillion claim, Apple added a middle step between on-device hash matching and human review. There’s now a larger, higher-performing, independent (at least so they claim) perceptual hash algorithm that runs in the cloud (i.e. big, traditional servers) and verifies the edge hashing. Only if that also matches will a human review it. If your edge processing is good enough, you don’t need that; it should be obvious Apple’s is not.
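To make the "few bits" point concrete, here is a rough sketch of what a Hamming-distance match between two perceptual hashes would look like. Whether Apple's matching actually works this way is exactly what's disputed in this thread, so treat it as an illustration only: the 16-bit hash values and the `max_bits` threshold are made up for the example and are not anything Apple has published.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two equal-length hashes differ."""
    return bin(a ^ b).count("1")


def is_match(a: int, b: int, max_bits: int = 2) -> bool:
    """Treat two hashes as the same image if they differ in at most max_bits bits.

    max_bits is a hypothetical threshold chosen for this example.
    """
    return hamming_distance(a, b) <= max_bits


# Toy 16-bit hashes (real perceptual hashes are longer).
known = 0b1011_0110_1100_0011
near = known ^ 0b0000_0000_0000_0101  # same hash with 2 bits flipped
far = known ^ 0b1111_0000_1111_0000   # same hash with 8 bits flipped

print(hamming_distance(known, near), is_match(known, near))
print(hamming_distance(known, far), is_match(known, far))
```

Raising or lowering `max_bits` is the knob that trades missed matches against false positives, which is why a threshold of only a few bits matters so much.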
I posted in another thread here with some dissection of the language they're using in their documents, along with some probability work to show the 1 in a trillion claim seems more than reasonable.
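For anyone who wants to play with the numbers, below is a minimal sketch of the kind of estimate involved: the chance that an account racks up at least 30 false matches, treating each photo as an independent coin flip. The independence assumption is a strong one, and the per-image false positive rate and library size in the example are placeholders I picked, not figures from Apple. `prob_at_least` is just a name for this example.

```python
import math


def prob_at_least(n: int, p: float, threshold: int) -> float:
    """P(X >= threshold) for X ~ Binomial(n, p), with 0 < p < 1.

    Each term is computed in log space so that huge binomial
    coefficients and tiny powers of p do not overflow or underflow early.
    """
    log_p = math.log(p)
    log_q = math.log1p(-p)  # log(1 - p), accurate for very small p
    total = 0.0
    for k in range(threshold, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p
            + (n - k) * log_q
        )
        total += math.exp(log_term)
    return total


# Placeholder inputs: 10,000 photos, 1-in-a-million false positive per photo.
print(prob_at_least(n=10_000, p=1e-6, threshold=30))
```

The result is extremely sensitive to `p`: plug in a much worse per-image rate and the account-level probability climbs fast, which is where the two readings of the 1 in a trillion claim in this thread part ways.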