But I'm not doing that. I'm looking at all of the results and condemning their overall test procedures. And I do believe a lot of the issue should be placed with CR. Unless you believe reported data points of 19.5, 18.5, 16, and 12 hours are reasonable results within the realm of reality for their battery life tests.
I respectfully disagree. Based on your posts, you're fixated on the high end of the results. You've made no mention of the low results, nor of the variability in the results. It's like you've drawn a conclusion and are laser-focused only on the data that supports it. I've looked at every post you've made in this thread. You question the intellectual curiosity of CR's review staff, but you're exhibiting a lack of it yourself. Not once in any post have you even mentioned the possibility that the MBP might have an issue. If you can't acknowledge that, you're not looking for the truth, you're looking for validation. That's not the same thing.
If that aspect of the overall test was flawed, why should we have any confidence in their overall results?
This is a hypothetical question. It's valid as a question, but it's irrelevant as evidence, and its equally valid converse is irrelevant as well.
If that aspect of the overall test wasn't flawed, why shouldn't we have any confidence in their overall results? To date, there's no evidence, pro or con, regarding the validity of their results. To pose a hypothetical as evidence, as if it has relevancy... you're better than that.
As a design engineer (one who has written and conducted a lot of acceptance test procedures for products in the past), I wouldn't. Before releasing any results, I would investigate to understand what happened, to ensure that the test procedures and measuring equipment were sound and that the procedures were properly followed. And I wouldn't be satisfied until that was understood. Once it was understood, I'd start again from scratch with better procedures. The abnormally high numbers cast a ton of suspicion that the overall testing was flawed. Those are numbers you can't simply choose to ignore.
This is full of assumptions, citysnaps. Chief among them is the assumption that they didn't check their equipment. I'm not going to assume they did, but I'm not going to assume they didn't either. There's no evidence either way. They did check their results, since they ran the tests multiple times, before and after updating the OS. I'll again note your fixation on the high portion of the data set. You even bolded it this time.

It still ignores the larger question of the high variability of the results.
The abnormally high numbers can also cast a ton of suspicion that the MBP could be flawed.
There was no curiosity on their part to understand or determine what happened. Was it an issue with their suite of tests? Did they accidentally let the laptop sleep for a while unnoticed? Did their screen luminance measuring device report faulty values in some cases? Did test personnel set the display brightness to the correct value for every test, or were they lax? Along with many other possibilities. With any rigor or intellectual curiosity to investigate, a flaw leading to one of those (or other) issues should have scrapped the whole test.
With respect, you don't know this to be true. It fits with the conclusion you've already drawn, so it seems you've convinced yourself it actually happened. I agree that intellectual curiosity is important, but no more important than intellectual honesty. A lot of assumptions and hypothetical scenarios don't make your conclusion intellectually honest. Also, offering up their data for review goes a long way toward expressing that intellectual curiosity you claim they didn't exhibit. Their willingness to retest does as well. CR could have easily said screw it, it is what it is.
Btw, thanks for the discourse. When I joined MR, this is what I imagined it would be.

I wish we got more of this. Instead of, well you know...