No test has, to my knowledge, been carried out to establish how large a difference needs to be to be consistently “discernible”. I read on diyaudio about someone trying to blind test speaker drivers, and they had to cancel because all they learned was that they couldn’t consistently hear ANY difference between ANY of the drivers… and I don’t think anyone will argue that speaker drivers don’t make a difference… All they learned was that blind testing is hard. So, I still claim that a blind test only works one way. If you can tell a difference in a scientifically valid (!) blind test, then I 100% agree that you have now proven there is a difference. If not, then you have ONLY proven that under the given test conditions you were not able to prove a difference. That does NOT mean that you have proven that there is no discernible difference, since you don’t know what the result under different test conditions would be.

The ABX method is explicitly designed to test whether there are discernible differences between two inputs, and it's considered valid if performed properly.
The main issue is that it often just isn't performed properly; the problem is not in the method itself.
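To put some numbers on what "performed properly" can mean statistically, here is a minimal sketch of the usual way an ABX run is scored: a one-sided binomial test against pure guessing. The function names, the 5% significance level, and the 60% "true hit rate" in the usage note are my own assumptions for illustration, not part of any ABX standard.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right by guessing alone (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def min_correct_to_pass(trials: int, alpha: float = 0.05) -> int:
    """Smallest score whose guessing probability is at or below alpha (assumed 5%)."""
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) <= alpha:
            return correct
    return trials + 1  # too few trials: no score can pass


def power(trials: int, true_hit_rate: float, alpha: float = 0.05) -> float:
    """Chance that a listener with the given (hypothetical) hit rate passes the test."""
    threshold = min_correct_to_pass(trials, alpha)
    p = true_hit_rate
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(threshold, trials + 1)
    )


if __name__ == "__main__":
    print(min_correct_to_pass(16))  # 12
    print(power(16, 0.6))
```

With 16 trials you need 12 correct to pass at the 5% level. A listener who genuinely hears the difference but only gets it right 60% of the time should pass such a run only about one time in six, so a failed short run says little either way. Running more trials is the usual way to raise that chance.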
My favourite analogy: Look at a bowl with 100 M&M’s. Now look at a bowl with 99 M&M’s. Can you tell which is which? Probably not. Now eat 100 M&M’s, and compare with eating 99 M&M’s. Can you tell the difference? Probably not. Now find 100 5-year-olds and tell them they can each have one M&M. You should now get a very noticeable auditory response letting you know whether you had 100 or only 99… you changed the test conditions, so now the difference matters.
I am NOT saying that there exists a setup where you can blind test and spot the difference scientifically. I’m claiming that such a setup doesn’t exist, BUT that this does not render the difference irrelevant. Because if you add up 10 or even 5 or 2 “imperceptible” differences, they may very well add up to a perceptible one. But how are you going to test that?