And when they said 50%, I assume they were referring to 50% of max volume. What else could it be?
You're missing my point: how loud is "max volume" on each device? By setting each one at 50%, they are likely not comparing them at equal volume. And even if each device had the same max volume, they are still assuming that each device uses the same volume scale, i.e. that 50% on the slider maps to the same fraction of maximum loudness on every device. It's kind of the same principle as this.
Basically, it's a typical blog test that makes a back-of-an-envelope attempt to be 'scientific' but hasn't quite thought it through.
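To put rough numbers on the 50% point, here's a quick Python sketch. Everything in it is made up for illustration: the `level_db` helper, the two volume curves, the 60 dB slider range and the max levels are my assumptions, not measurements of any real device.

```python
import math

def level_db(max_db, setting, curve):
    """Output level in dB for a volume setting in (0, 1].

    max_db  -- hypothetical level at 100% volume
    setting -- slider position as a fraction of full scale
    curve   -- how the (hypothetical) device maps slider to output
    """
    if curve == "linear":
        # Slider scales amplitude directly, so 50% is about 6 dB below max.
        return max_db + 20 * math.log10(setting)
    if curve == "db":
        # Slider moves evenly across an assumed 60 dB range.
        return max_db - 60.0 * (1 - setting)
    raise ValueError(f"unknown curve: {curve}")

# Two imaginary devices, both set to "50%".
device_a = level_db(100.0, 0.5, "linear")  # roughly 94 dB
device_b = level_db(110.0, 0.5, "db")      # 80 dB

# Same max volume, different volume scale.
same_max = level_db(100.0, 0.5, "db")      # 70 dB

print(f"A at 50%: {device_a:.1f} dB")
print(f"B at 50%: {device_b:.1f} dB")
print(f"A's max with B's curve at 50%: {same_max:.1f} dB")
```

With these invented numbers, the device with the louder maximum is actually the quieter one at 50%, and two devices with identical maximums still end up well over 20 dB apart. Unless the levels are matched with a meter first, "both at 50%" doesn't tell you much.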