No, the probability of 30 false matches with high correlation between events would be much more than p^30.
That was exactly the reasoning used in the Sally Clark case. Even there, nobody claimed the chance of two random deaths was exactly 1 in 73 million (the square of the chance of one death) - they just held that figure up as an example of how overwhelmingly unlikely a coincidence it was.
obviously not:
"Each flip of a fair coin has a 50% false positive rate"
This is not about whether your example was internally consistent - it is about whether your example was applicable to the real situation. You can safely extend your example to scenarios involving fair dice, radioactive decay etc., even though the probabilities are vastly different, because those all generate independent events. Apply it to correlated events, though, and you have "garbage in, garbage out". That's not academic pedantry, it's the fundamental assumption behind the formula you are using for chaining probabilities.
It's misplaced because the answer to the question has nothing to do with whether the probability of 30 consecutive tails is exactly p^30; the question is if the probability of 30 consecutive tails is greater than p^1.
In your example you specifically calculated p^30 and even compared that to Apple's claimed rate (...and I'm not claiming that Apple calculated that from p^30 - it's posters here who are coming up with that calculation).
The question is whether the probability of 30 consecutive false matches is an order of magnitude greater than p^1 - which is what you need to avoid the (general) base rate fallacy/prosecutor's fallacy. Yes, the chance of 30 false matches is going to be less than the chance of a single match - but unless you know that each further match is independent of the first one you can't just keep multiplying by p. Your example assumes that by making your 'false positive' event a toss of a fair coin, which is widely accepted as independent from previous tosses.
To put it another way, in your specific example, 'p' is the same whether you're talking about the chances of 30 people in the general population throwing a single tail or a specific person throwing their 30th sequential tail. In the real world case, p is the probability of a random photo from the entire population of photos generating a single false match - but the next 29 matches involve the probability of someone's personal collection of photos - mostly featuring the same people, houses, objects or types of subject - containing a second, third,... false match. You can't assume that those are the same p without knowing more about how false matches arise. If the matches are triggered by some characteristic present in one of your photos, the probability of two matching photos in that collection could be closer to one. So the true 30-false-match probability will be less than p but might be significantly more than the p^30 estimate you're giving.
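For anyone who wants to see the size of the gap, here is a rough sketch of that argument in Python. Everything in it is made up for illustration: the two-group model, the names q, r_hi and r_lo, and the numbers are my assumptions, not anything Apple has published. It also simplifies "30 matches somewhere in a library" down to "30 given photos from one collection all match". The point is only that the same overall single-photo rate p can sit alongside a 30-match probability that is nowhere near p^30.

```python
def single_photo_rate(q, r_hi, r_lo):
    """Overall chance that one random photo false-matches (the 'p' in this thread)."""
    return q * r_hi + (1 - q) * r_lo


def independent_estimate(p, k):
    """Chance that k photos all false-match IF every match is independent."""
    return p ** k


def correlated_estimate(q, r_hi, r_lo, k):
    """Chance that k photos from ONE collection all false-match under a two-group model.

    q    : hypothetical fraction of collections with some match-triggering trait
    r_hi : hypothetical per-photo false-match rate inside such a collection
    r_lo : hypothetical per-photo false-match rate in every other collection
    """
    return q * r_hi ** k + (1 - q) * r_lo ** k


# Illustrative numbers only, not measurements of any real system.
q, r_hi, r_lo, k = 1e-6, 0.5, 1e-9, 30

p = single_photo_rate(q, r_hi, r_lo)
naive = independent_estimate(p, k)
mixed = correlated_estimate(q, r_hi, r_lo, k)

print(f"single-photo rate p        : {p:.3e}")
print(f"p^{k} (independence assumed) : {naive:.3e}")
print(f"two-group model, {k} matches : {mixed:.3e}")
```

With these toy numbers the two-group figure comes out below p but far above p^30, which is the whole point: without knowing how false matches cluster within one person's photos, p^30 is only a lower bound on how bad things could be.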
Nobody in either case is going to jail because of a chain of false positives.
...but at that point you're beyond mathematics and falling back on your trust in human judgement. It's good to know that Apple physically can't look at (what people are assuming to be) thumbnails until there are 30 hash matches - but after that you're back to Apple reporting the user at their discretion. It still depends a lot on what the nature of any false matches is (if they turn out to be 30 landscapes and vases of flowers, duh, but what if they're 30 small, hard-to-make-out pictures of kids?) and what level of certainty the checkers are instructed to require... and once the incident is reported, the next step is a police raid to get the original images (and those things never go south). By the time your public defender gets involved, the accused's life will already have been turned upside down.