Why does the standard deviation change from confidence intervals to hypothesis tests? When considering two-sample data that involves a difference of proportions, both a confidence interval and a hypothesis test can be done.

Hugh Soto

Answered question

2022-09-13

The standard deviation used for a difference of proportions in creating a confidence interval is $\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$.
However, the standard deviation used for hypothesis tests is $\sqrt{\frac{p(1-p)}{n_1}+\frac{p(1-p)}{n_2}}$, where $p=\frac{x_1+x_2}{n_1+n_2}$, $x_1=p_1 n_1$, and $x_2=p_2 n_2$.
What I don't understand is why these are different. They're both the standard deviation of the same proportion, so why should they differ?

Answer & Explanation

Clarence Mills

Beginner | 2022-09-14 | Added 18 answers

Step 1
In hypothesis testing you are making an assumption (the null hypothesis) that $p_1 = p_2$. If they are truly equal, then we can call the common parameter $p$.
You need to use something as a value for $p$ when it shows up in the standard deviation expression. If you fully follow the assumption that $p_1$ and $p_2$ are equal, then neither sample proportion alone is your best guess for the value of $p$. The pooled proportion is better because it uses more individuals: if $p_1$ really does equal $p_2$, using all $n_1 + n_2$ individuals gives a better estimate of $p$, aka $p_1$, aka $p_2$ (by assumption). Since we are now using a sample statistic in place of a population parameter, we are working with a standard error rather than a standard deviation.
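The pooled calculation described above can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the function name `pooled_two_proportion_z_test` and the counts passed to it are made up, and the p-value relies on the usual normal approximation, which assumes reasonably large samples.

```python
from math import sqrt, erfc


def pooled_two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2 using the pooled standard error.

    x1, x2 are success counts; n1, n2 are sample sizes.
    Assumes the pooled proportion is strictly between 0 and 1.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both samples estimate the same p, so combine them.
    p_pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pooled * (1 - p_pooled) / n1
                     + p_pooled * (1 - p_pooled) / n2)
    z = (p1_hat - p2_hat) / se_pooled
    # Two-sided tail probability of a standard normal: P(|Z| >= |z|).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value, se_pooled


# Hypothetical counts, chosen only for illustration.
z, p_value, se = pooled_two_proportion_z_test(x1=45, n1=100, x2=30, n2=100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, pooled SE = {se:.4f}")
```

Note that the pooled proportion appears in both terms under the square root, which is exactly what the null hypothesis $p_1 = p_2$ licenses.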
Step 2
On the other hand, with a confidence interval for $p_1 - p_2$, we make no assumption that $p_1 = p_2$; if we did, we'd be done! $p_1 - p_2 = 0$ and that's that. So we use the best guess we have for each $p_i$ separately.
I'm assuming that you understand standard deviation, variance, and how variance is additive for independent samples, which is why the big messy square root arises in the first place.
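For readers who want the additivity step spelled out, here is a short sketch of where the square root comes from. It assumes the two samples are independent and that each sample proportion $\hat{p}_i$ is a binomial count divided by $n_i$; the hat notation is introduced here for the sample proportions.

```latex
\begin{aligned}
\operatorname{Var}(\hat{p}_i) &= \frac{p_i(1-p_i)}{n_i}, \qquad i = 1, 2 \\
\operatorname{Var}(\hat{p}_1 - \hat{p}_2)
  &= \operatorname{Var}(\hat{p}_1) + \operatorname{Var}(\hat{p}_2)
   = \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} \\
\operatorname{SD}(\hat{p}_1 - \hat{p}_2)
  &= \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}
\end{aligned}
```

Setting $p_1 = p_2 = p$ in the last line gives the pooled form used in the hypothesis test.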
moidu13x8

Beginner | 2022-09-15 | Added 2 answers

Step 1
Below, I will use $\hat{p}_i$ to indicate sample proportions and $p_i$ to indicate true values (population parameters). I will use 95% intervals for demonstration.
The test of differences in proportions starts with the null hypothesis $p_1 = p_2 = p$. Under this assumption, $\hat{p}_1 - \hat{p}_2$ is approximately normal with mean 0 and variance $\frac{p(1-p)}{n_1}+\frac{p(1-p)}{n_2}$. When this is true, the 95% probability interval (the interval within which, if the null hypothesis is true, the value $\hat{p}_1 - \hat{p}_2$ will fall 95% of the time) is approximately
$$\hat{p}_1 - \hat{p}_2 \in \left[ -1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}},\ 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}} \right]$$
where $\hat{p}$ is the pooled sample proportion computed from both samples combined.
Under the null hypothesis, the alternate formula (with $\hat{p}_1$, $\hat{p}_2$ rather than $\hat{p}$) is a less efficient estimator of the standard deviation of the sample proportion difference, since the entire data set is not used to estimate $p$; instead $p_1$ and $p_2$ are estimated separately.
On the other hand, the 95% confidence interval is the set of all potential true values of $p_1 - p_2$ for which a sample value at least as extreme as $\hat{p}_1 - \hat{p}_2$ would be generated from the same sampling procedure at least 5% of the time. In constructing this interval, one cannot assume $p_1 = p_2 = p$, since asking about the potential true values of $p_1 - p_2$ while making an assumption about that value is meaningless.
Step 2
Without the assumption of equivalence, our best estimate of the standard deviation of the difference is the alternate formula and the approximate 95% confidence interval is given by:
$$p_1 - p_2 \in \left[ (\hat{p}_1 - \hat{p}_2) - 1.96\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}},\ (\hat{p}_1 - \hat{p}_2) + 1.96\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \right]$$
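As a rough illustration of the interval above, the following Python sketch computes the unpooled standard error and the resulting approximate 95% interval. The function name `unpooled_diff_ci` and the counts are hypothetical, and 1.96 is the usual normal critical value for a 95% interval.

```python
from math import sqrt


def unpooled_diff_ci(x1, n1, x2, n2, z_crit=1.96):
    """Approximate confidence interval for p1 - p2.

    Uses the unpooled standard error: each proportion is estimated
    from its own sample, with no assumption that p1 == p2.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1
                       + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_crit * se_unpooled, diff + z_crit * se_unpooled, se_unpooled


# Hypothetical counts, chosen only for illustration.
lower, upper, se_unpooled = unpooled_diff_ci(x1=45, n1=100, x2=30, n2=100)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f}), SE = {se_unpooled:.4f}")
```

Running the same counts through the pooled formula would generally give a slightly different standard error, which is the discrepancy the original question asks about.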
