Discussion in 'Politics, Religion, Social Issues' started by Herdfan, Oct 29, 2016.

There has been a lot in the news lately about polling and how the samples are adjusted based on what the makeup of the electorate will actually be.

I have a degree in Accounting, and both in our Auditing class and while studying for the CPA exam, random sampling was drilled into our heads. So let's say you are going to audit a big company that has millions of transactions. You can't look at them all, so you take random samples.

And random does not mean you go into a file cabinet and just grab a bunch of transactions to examine. You use a random number generator to select them. This provides a truly random sample.
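Here is a minimal sketch of that selection step in Python. It assumes the transactions are simply numbered 1 through N; the population size, sample size and function name are made up for illustration.

```python
import random

def pick_audit_sample(num_transactions, sample_size, seed=None):
    """Pick transaction numbers uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible for review
    # Every transaction number from 1 to num_transactions is equally likely.
    chosen = rng.sample(range(1, num_transactions + 1), sample_size)
    return sorted(chosen)

# Hypothetical audit: 2,000,000 transactions, examine 60 of them.
sample = pick_audit_sample(num_transactions=2_000_000, sample_size=60, seed=2016)
print(sample[:5])
```

The point is that the generator, not the auditor, decides which items get pulled.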

So why not do this for polls? Have a number generator provide call lists, and whoever answers the phone is part of the sample, no matter their party, race or whatever. You could limit it to likely voters, but no other qualifying conditions. Sample 300 people every day and use a 7-day rolling average. I think this would provide as good a result as what they are doing now.
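For a rough sense of what 300 calls a day buys, here is a sketch of the textbook 95% margin of error for a proportion. It assumes a simple random sample in which everyone who is called responds, and it treats the 7-day window as one pooled sample of 2,100; the function name is just for illustration.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    # p=0.5 is the worst case; z=1.96 is the usual 95% normal multiplier.
    return z * sqrt(p * (1 - p) / n)

one_day = margin_of_error(300)        # a single day's 300 interviews
seven_days = margin_of_error(7 * 300) # seven days pooled together

print(f"300 interviews:   +/- {one_day * 100:.1f} points")
print(f"2,100 interviews: +/- {seven_days * 100:.1f} points")
```

Under those assumptions a single day is good to a bit under 6 points and the pooled week to a bit over 2.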

Which is something that nobody knows until after the election. Until then, it's all guesswork as to what the makeup will actually be: educated guesswork, but still guesswork.

--- Post Merged, Oct 29, 2016 ---
Incidentally, what do you mean by random number? Drawn from which distribution?

This is essentially what's done to conduct the poll. However, we do know basic demographics of the overall population and of those who are likely voters. If the random sample of, say, 1,000 people polled oversamples or undersamples a demographic component, the poll will be skewed, so the results from those demographic groups in the sample are adjusted to reflect actual demographics.

Oversampling or undersampling can frequently occur in smaller segments of the polled population. Consider it in the abstract: Flip a coin 100 times and the result should be close to 50/50 almost every time. However, flip a coin 10 times and you'll get 6/4, 7/3, 4/6 and 3/7 a non-trivial number of times.
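The coin-flip comparison can be checked exactly with the binomial formula. This is a small sketch assuming a fair coin; "close to 50/50" is taken here to mean between 40% and 60% heads, which is my own cutoff.

```python
from math import comb

def prob_heads_between(n, lo, hi):
    """Exact probability that n fair coin flips give between lo and hi heads, inclusive."""
    favorable = sum(comb(n, k) for k in range(lo, hi + 1))
    return favorable / 2**n

p10 = prob_heads_between(10, 4, 6)      # 4 to 6 heads out of 10
p100 = prob_heads_between(100, 40, 60)  # 40 to 60 heads out of 100

print(f"10 flips within 40-60% heads:  {p10:.3f}")
print(f"100 flips within 40-60% heads: {p100:.3f}")
```

With 10 flips you land in that band only about two times in three, while with 100 flips you land in it roughly 96% of the time.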

Now, look at this in a polling context. 1,000 people polled should get the M/F ratio right almost every time. OTOH, consider smaller, but significant segments. Suppose Black voters are 11% of the voting population but the polled sample is only 7% Black - a shortfall of about 36%. If you don't weight the results properly the poll will be inaccurate.
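Here is a rough sketch of how that weighting works, using the 11% and 7% figures from the example. The support numbers for "Candidate A" are invented purely to show the arithmetic, and real pollsters weight on several variables at once, so treat this as a simplified illustration rather than any pollster's actual method.

```python
def poststratify(sample_share, population_share, support):
    """Reweight each group so its share of the sample matches its population share."""
    # Weight = how much each respondent in the group should count.
    weights = {g: population_share[g] / sample_share[g] for g in sample_share}
    unweighted = sum(sample_share[g] * support[g] for g in sample_share)
    weighted = sum(sample_share[g] * weights[g] * support[g] for g in sample_share)
    return weights, unweighted, weighted

sample_share = {"Black": 0.07, "Other": 0.93}      # who actually got polled
population_share = {"Black": 0.11, "Other": 0.89}  # who is in the voting population
support = {"Black": 0.90, "Other": 0.45}           # INVENTED support for Candidate A

weights, unweighted, weighted = poststratify(sample_share, population_share, support)
print(f"weights: {weights}")
print(f"unweighted: {unweighted:.4f}, weighted: {weighted:.4f}")
```

With these made-up inputs the undersampled group gets a weight of about 1.57, and the topline moves from roughly 48% to roughly 50%, which is the kind of gap that separates a lead from a tie.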

They don't teach that concept until one shells out for the Master's degree. Not for the cheapskate undergraduates (or anything costing less than the first $100k).

This is where the strategy fails. You start off with a random sample (those selected by the random number generator) and then take a non-random subset of the sample (those who answer the phone and choose to answer your questions). The transactions in your other example don't have free will and don't suffer from this problem.

In polling of human beings for their opinions, it is known that the set of responders never represents a truly random sample, and various techniques are applied to repair those defects in the data. It's a difficult problem and causes big variations between polls.
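A quick simulation shows how this happens even when the call list itself is perfectly random. Every number here is invented for illustration: a population split exactly 50/50, with supporters of Candidate A answering 12% of calls and supporters of Candidate B answering 8%.

```python
import random

def simulate_poll(num_dialed, true_support, answer_rate_a, answer_rate_b, seed=0):
    """Dial random people, keep only those who answer, return A's share among responders."""
    rng = random.Random(seed)
    responders_a = 0
    responders_b = 0
    for _ in range(num_dialed):
        supports_a = rng.random() < true_support  # random draw from the population
        answer_rate = answer_rate_a if supports_a else answer_rate_b
        if rng.random() < answer_rate:            # the non-random step: who picks up
            if supports_a:
                responders_a += 1
            else:
                responders_b += 1
    return responders_a / (responders_a + responders_b)

# INVENTED inputs: true support is 50%, but A's supporters pick up more often.
observed = simulate_poll(num_dialed=100_000, true_support=0.5,
                         answer_rate_a=0.12, answer_rate_b=0.08, seed=2016)
print(f"true support: 0.50, observed among responders: {observed:.3f}")
```

The raw result lands near 60% for A even though the true figure is 50%, and dialing more numbers doesn't fix it, because the bias comes from who answers rather than from the sample size.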

Good questions. Thinking about it more, it would take much bigger sample sizes than 300 per day. Maybe 5,000 per day on a 7-day rolling average. Good point about the free will of the people.

The polls in 2012 were wrong because they used the 2010 voting demographics, thinking Obama would not generate quite the same effect on turnout that he did in 2008. That makes sense, as he was already the first black President, so the excitement could easily have been down. It wasn't, but the pollsters didn't see it.

So pollsters using 2012 demographics have Hillary winning easily. But using the 2014 voter turnout model, Trump is either tied or leading. I think even those who plan to vote for her would say she doesn't generate the excitement that Obama did among her supporters. But Trump does. Is this accounted for in the polls?

It could be that neither model is valid this cycle, and then the real surprise comes.