# Is there something odd about the way statistics are used?

From the little that I know about statistics there seems to be something not quite right in the way they are used. For example, suppose that a poll is taken to find out whether people are in favor of a certain issue. A sample is taken and it is found that 30% of the sample backed the issue.

The statistician will now give a confidence interval for the results. Let’s say that in the present case the 95% confidence interval is plus or minus 3%. You have to understand that this does not mean that there is a 95% chance that the actual percentage is between 27% and 33%. What it means is that if the poll were repeated many times, about 95% of the intervals computed this way would contain the actual percentage. In other words, the probability describes how the data would behave given the true value, not how likely the true value is given the data. This seems to be doing things backward.
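One way to see what the 95% refers to is to simulate many polls and count how often the interval captures the true value. This is only a sketch: the true support of 30%, the sample size of 900 (chosen so the margin comes out near 3%), the normal-approximation interval, and the function names are all assumptions made for illustration.

```python
import random
from math import sqrt

def poll_interval(sample, z=1.96):
    """Normal-approximation 95% interval for a proportion (assumed method)."""
    n = len(sample)
    p_hat = sum(sample) / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def coverage(true_p, n, polls, seed=0):
    """Fraction of simulated polls whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(polls):
        # Each respondent backs the issue with probability true_p.
        sample = [rng.random() < true_p for _ in range(n)]
        low, high = poll_interval(sample)
        hits += low <= true_p <= high
    return hits / polls

rate = coverage(true_p=0.30, n=900, polls=1000)
print(f"{rate:.1%} of the simulated intervals contained the true 30%")
```

The true percentage never moves in this simulation. Only the intervals vary from poll to poll, which is why the 95% is a statement about the procedure rather than about any single interval.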

By way of analogy, suppose a farmer is trying to figure out why his pig got out of its pen. He might reason that if the pig was able to fly there would be a 100% chance of it escaping, so he adopts that as the explanation.

I am not suggesting that we stop using statistics, just saying that there is something about the whole idea behind the way it is used that I find a little unsettling.


My issue with math in general is not the numbers or the logic behind them, but the terminology. For that reason, I feel that degrees in English and Law are required to advance beyond Algebra I. Especially Law, as the main thing statistics are used for is to argue a viewpoint.

jerv (31027)

@LostInParadise Sounds like you are a Bayesian! What you want is the “credible interval” in Bayesian statistics, which is an interval that contains the mean with a stated probability, not the convoluted probability you described from classical statistics. We seem to be in the middle of a shift from classical statistics to Bayesian statistics… not sure how long it will take for that to show in political polling. It will “likely” be a long time.

glacial (12115)

Statistics are often a way to hide lies behind math. Statistics are just numbers, and we use math to work with them, but they still need to be collected and interpreted properly. Collection practices and the exact method of calculation have to be published with them; otherwise there is no mathematical legitimacy.

@glacial, I am going to have to learn more about Bayesian statistics. I know what Bayes’ Law is and I understand that Bayesian statistics requires making assumptions about what they call the priors.

@LostInParadise Yes, it’s a completely different (and some would say far more sensible) way of looking at data and hypothesis-making. Part of me wonders whether the kind of irrational fear @ARE_you_kidding_me expresses above might keep it from ever being used for political polls, though. Much of what works so well in Bayes is decried as bias by classicists.

glacial (12115)

I also think that you would agree more with Bayesian statistics.

The following is based on my memory of last school year:

“Frequentist” statistics and Bayesian statistics operate in different ways. Let’s say that you have a hypothesis H0 and you want to see if it is reasonable.

Frequentists will do this by attempting to reject the hypothesis. They use a test that rarely rejects H0 by accident. If they succeed in rejecting H0, then they can reasonably say that H0 is false, as the rejection was almost surely not accidental. If they fail to reject H0, then it means that the value of H0 was close enough to the true value to be indistinguishable using the test. Note that H0 is never proven true; it is either “not close enough” or “not known to be not close enough.” It is up to the statistician to determine how close is close enough.
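To make “rarely rejects H0 by accident” concrete, here is a small sketch that computes how often a fixed rejection rule fires. Everything specific in it is assumed for illustration: 10 yes/no answers, H0 saying the true rate is 50%, an alternative rate of 90%, and a rule that rejects when 1 or fewer, or 9 or more, say yes.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def rejection_probability(n, p, lower, upper):
    """P(X <= lower or X >= upper) for X ~ Binomial(n, p)."""
    return sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if k <= lower or k >= upper
    )

n = 10
# Chance of rejecting H0 (rate = 0.5) when H0 is actually true.
false_alarm = rejection_probability(n, 0.5, lower=1, upper=9)
# Chance of rejecting H0 when the true rate is really 0.9.
power = rejection_probability(n, 0.9, lower=1, upper=9)

print(f"rejects a true H0 with probability {false_alarm:.4f}")
print(f"rejects H0 when the rate is 0.9 with probability {power:.4f}")
```

The first number is what the statistician controls in advance. The second shows that a failure to reject can easily happen even when H0 is far from the truth, which is why H0 is never “proven.”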

Bayesian statistics uses a different approach:
Prior Beliefs + Data -> New Beliefs
Basically, your degree of belief in H0 after seeing the data should still reflect your degree of belief before the data was collected. If you were already incredibly confident that H0 is true, then it will take a lot of data to convince you otherwise. If you were on the fence, then not as much data is needed.

Personally, I think Bayesian statistics makes more sense, but it has some drawbacks. First of all, you have to decide what your prior beliefs are and assign them numbers. That is hard to do, and there are different ways of doing it. If different people have different prior beliefs, then they can receive identical data yet come to contradictory conclusions. Frequentist statistics are objective, Bayesian is not.
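A toy version of “Prior Beliefs + Data -> New Beliefs” shows how two people can see the same data and still disagree. The setup is hypothetical: H0 says the true rate is 50%, the only alternative considered says 90%, and the data is 9 “yes” answers out of 10. The function names are made up for this sketch.

```python
from math import comb

def likelihood(successes, trials, rate):
    """Binomial probability of the observed data if the true rate is `rate`."""
    return comb(trials, successes) * rate**successes * (1 - rate)**(trials - successes)

def posterior_h0(prior_h0, successes, trials, rate_h0=0.5, rate_h1=0.9):
    """Bayes' rule for two competing hypotheses (rates are assumed values)."""
    weight_h0 = prior_h0 * likelihood(successes, trials, rate_h0)
    weight_h1 = (1 - prior_h0) * likelihood(successes, trials, rate_h1)
    return weight_h0 / (weight_h0 + weight_h1)

# Same data, two different starting points.
on_the_fence = posterior_h0(prior_h0=0.50, successes=9, trials=10)
very_confident = posterior_h0(prior_h0=0.99, successes=9, trials=10)

print(f"started at 50% belief in H0, ended at {on_the_fence:.1%}")
print(f"started at 99% belief in H0, ended at {very_confident:.1%}")
```

The person on the fence ends up almost certain H0 is false, while the person who started out very confident still leans towards H0. Both applied Bayes’ rule correctly; only the priors differ.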

PhiNotPi (12643)

Statistics work any way the user wants them to. They can accurately predict or explain something, or be used to sway people.

Which one of these is true? 2 + 2 = 4 or 3 + 1 = 4 or 5 – 1 = 4

They all are, yet each says something completely different.

Polls can be skewed by the question as well. Example: “Do you approve of the offer” is completely different from “What is your approval level of this offer”.

I think that much of @YARNLADY’s complaint is oriented towards the polls/surveys, not the underlying math. “Nine out of ten doctors” is a statistic (number), but it is not statistics (a field of math).

Statistics is a method of using the gathered data to come to a reasonable conclusion. For example, imagine that you survey 10 doctors and 9 of them say “yes.” This is just the starting data, not a result. Using statistics, we can ask “what is the true percentage of all doctors that say yes?”

The answer, in this case, is “somewhere between 58.7% and 97.7%, with 95% confidence.” That’s statistics.
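The post does not say how that range was computed. One calculation that reproduces it is a Bayesian 95% interval starting from a flat prior, so the sketch below assumes that method; a classical interval such as Clopper-Pearson would give somewhat different endpoints. It uses only the standard library, and the helper names are invented for this example.

```python
from math import comb

def beta_cdf(x, a, b):
    """CDF of Beta(a, b) for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))

def beta_quantile(q, a, b, tol=1e-10):
    """Invert the CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def credible_interval(successes, trials, level=0.95):
    """Equal-tailed interval from a flat prior (assumed), i.e. the
    posterior Beta(successes + 1, failures + 1)."""
    a = successes + 1
    b = trials - successes + 1
    tail = (1 - level) / 2
    return beta_quantile(tail, a, b), beta_quantile(1 - tail, a, b)

low, high = credible_interval(successes=9, trials=10)
print(f"between {low:.1%} and {high:.1%}")
```

With 9 of 10 this comes out at roughly 58.7% and 97.7%. The width of the range is the real message: ten doctors are far too few to pin the true percentage down.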

Polls/surveys are a category of their own, and getting accurate and useful survey data is very hard. The general idea is that the result is only as reliable as the initial data.

PhiNotPi (12643)

Whenever coming face to face with statistics, it is always wise to remember what Mark Twain said about them.

rojo (21866)

@PhiNotPi Yes. Exactly. This question has nothing to do with politics or the way the numbers are used. It’s about the mathematics of statistics.

glacial (12115)

I’m revisiting this question about 7 months later, and I’ve learned some stuff since my answers above.

I would like to take some time to expound on one of my statements above: “Frequentist statistics are objective, Bayesian is not.”

Now, I actually think that both methods have their own ways of injecting “subjectivity” into the result.

With Bayesian statistics, you must choose a prior distribution, and the choice of prior can be controversial. People who choose different priors will have different results.

With Frequentist statistics, you can calculate the p-value pretty much objectively. You must decide, however, where to place the cutoff point between “reject” and “do not reject.” The choice of alpha = 0.05 is actually quite arbitrary.
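A short sketch of that point: the p-value below is fixed by the data, but the verdict flips depending on the cutoff. The numbers are hypothetical (15 “yes” answers out of 20, with H0 saying the true rate is 50%), and the two-sided p-value is computed by the “add up every outcome no more likely than the observed one” convention, which is one of several in use.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p_value(successes, trials, rate_h0=0.5):
    """Exact binomial p-value: total probability, under H0, of outcomes
    that are no more likely than the observed one."""
    observed = binom_pmf(successes, trials, rate_h0)
    cutoff = observed * (1 + 1e-9)  # small tolerance for float ties
    return sum(
        binom_pmf(k, trials, rate_h0)
        for k in range(trials + 1)
        if binom_pmf(k, trials, rate_h0) <= cutoff
    )

p_value = two_sided_p_value(successes=15, trials=20)
print(f"p-value = {p_value:.4f}")

# Same p-value, different conventional cutoffs, different verdicts.
for alpha in (0.05, 0.01):
    verdict = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {verdict}")
```

Nothing in the mathematics says which of the two cutoffs is the right one, so the final “reject” or “do not reject” still rests on a judgment call.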

PhiNotPi (12643)
