katkoota00ys

Answered question

2022-05-22

Inconsistency in two-sided hypothesis testing
Suppose you have two sets of data with known population variances and want to test the null hypothesis that the two means are equal, i.e. $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 > \mu_2$. There's a certain way I want to think about it, which is the following:
$$P(\mu_1 > \mu_2) = P(-(\mu_1 - \mu_2) < 0) = P\left(\frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sigma_{\delta\bar{x}}} < \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}\right) = P\left(z < \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}\right)$$
To me, this 'derivation' makes it perfectly clear what's actually going on. You're actually calculating the probability that H1 is true and not just blindly looking up some z-score.
However, now suppose that $H_1: \mu_1 \neq \mu_2$. The problem with this is that the method I just described doesn't seem to work. If I write
$$P(\mu_1 \neq \mu_2) = P(\mu_1 < \mu_2) + P(\mu_1 > \mu_2)$$
then all that happens is $P(\mu_1 \neq \mu_2) = 1$. I think I'm probably not interpreting the above equation correctly.

Answer & Explanation

Melina Glover

Beginner, 2022-05-23, added 11 answers

In your way of thinking, you decided to set
$$z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sigma_{\delta\bar{x}}},$$
and from this you conclude (correctly, I think) that $P(\mu_1 > \mu_2) = P\left(z < \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}\right)$.
The question is, if you cannot assign a probability distribution to $\mu_1 - \mu_2$, how do you compute $P\left(z < \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}\right)$? And of course if you can assign a probability distribution to $\mu_1 - \mu_2$, then you can use that distribution to compute $P(\mu_1 - \mu_2 > 0)$ directly.
Note that if $P(\mu_1 = \mu_2) > 0$, then
$$P\left(z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}\right) > 0,$$
because you have defined $z$ in such a way that it is simply $\frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}$ if the two populations have the same mean.
When you conclude that $P(\mu_1 \neq \mu_2) = P(\mu_1 < \mu_2) + P(\mu_1 > \mu_2) = 1$, you are assuming that $z$ has zero probability of equaling $\frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}$ exactly. This would be true if $z - \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}$ had a continuous distribution, but how do we know it does?
Edit: There are some quite reasonable motivations (from a practical point of view, if not a mathematical one) for attempting some kind of approach to coming up with a value for P ( μ 1 > μ 2 ) as explained in the question. The main motivation is that this is the way we seem to want to be able to think about statistics: just how much weight (i.e. likelihood) should I assign to the possibility that certain facts are true? Unfortunately it's often very difficult to make a convincing case for a particular value of such a likelihood. Instead, what frequentist statistics gives us is an apparently roundabout statement that if a certain fact were not true (that is, if that fact's "null hypothesis" were true instead), it would have been extremely unlikely for us to have made the observations we just made.
A more precise and succinct explanation is given in this answer to another question.
To test the hypothesis that $\mu_1 > \mu_2$ in the posted question, we can define the null hypothesis as $\mu_1 \le \mu_2$. Now, having obtained samples from the two populations, how likely is it that we would have gotten samples "like those" if the null hypothesis were true?
If $\bar{x}_1 \le \bar{x}_2$ the answer is "likely enough," so we only have an interesting statistical test in the case where $\bar{x}_1 > \bar{x}_2$. Assuming that's the kind of sample results we got, then among all possible ways the null hypothesis could be true, the one that gives us the best chance to obtain samples "like" the ones we did is $\mu_1 = \mu_2$. But if we assume that $\mu_1 = \mu_2$, then prior to taking the samples, $\frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}}$ was a random variable with a standard normal distribution (mean zero, variance 1).
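The claim that the standardized difference is standard normal when $\mu_1 = \mu_2$ can be checked with a small simulation. This is only a sketch: the population values, sample sizes, and the helper name `simulate_z` are made up for illustration, and it assumes normally distributed populations with known standard deviations, so that $\sigma_{\delta\bar{x}} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

```python
import random
import statistics


def simulate_z(mu, sigma1, sigma2, n1, n2, reps, seed=0):
    """Repeatedly sample two normal populations that share the mean `mu`
    and return the standardized differences of the sample means."""
    rng = random.Random(seed)
    # Standard deviation of (xbar1 - xbar2) when both variances are known.
    sd_diff = (sigma1 ** 2 / n1 + sigma2 ** 2 / n2) ** 0.5
    zs = []
    for _ in range(reps):
        xbar1 = sum(rng.gauss(mu, sigma1) for _ in range(n1)) / n1
        xbar2 = sum(rng.gauss(mu, sigma2) for _ in range(n2)) / n2
        zs.append((xbar1 - xbar2) / sd_diff)
    return zs


# Illustrative (hypothetical) settings: equal means, different known variances.
zs = simulate_z(mu=10.0, sigma1=2.0, sigma2=3.0, n1=15, n2=20, reps=5000)
z_mean = statistics.fmean(zs)
z_var = statistics.pvariance(zs)
print(z_mean, z_var)  # should land close to 0 and 1 respectively
```

With enough repetitions the printed mean and variance should sit near 0 and 1, which is what "standard normal under the null" predicts.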
Suppose that having taken our samples, we find that $\frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\delta\bar{x}}} = 2.38$. That's a relatively extreme value for a standard normal variable; 99 times out of 100 the value of a standard normal variable will be less than that. In fact, the probability is 0.99134 that a standard normal variable will have a value less than 2.38. (I know this because someone computed that probability and put it in a table, and I looked it up there.) There is therefore less than a 1% chance that we would have observed a sample mean $\bar{x}_1$ so much larger than $\bar{x}_2$ if the population mean $\mu_1$ were not actually at least a little bit larger than $\mu_2$. We therefore reject the null hypothesis.
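The table lookup can be reproduced from the error function, since $\Phi(x) = \tfrac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. The sketch below uses only Python's standard library; the function name `std_normal_cdf` is an illustrative choice, not something from the original answer.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


z_observed = 2.38                         # value used in the answer above
prob_below = std_normal_cdf(z_observed)   # P(Z < 2.38)
p_value = 1.0 - prob_below                # one-sided p-value under mu_1 = mu_2

print(f"P(Z < {z_observed}) = {prob_below:.5f}")
print(f"one-sided p-value   = {p_value:.5f}")
```

The one-sided p-value is the "less than 1% chance" mentioned above: the probability, computed under $\mu_1 = \mu_2$, of a standardized difference at least as large as the one observed.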
Using samples drawn from populations with continuous distributions, it appears impossible to test statistically whether the means of two populations are exactly the same, because even if they were the same, with probability 1 we would still observe different sample means. (See also this answer on that topic.)
There is another kind of statistics called Bayesian statistics that (as far as I understand it) does assign probabilities to the truth of statements that you might want to prove or disprove, but only by using observations of experiments to modify probability assignments that one was able to make before the experiment.
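As a rough illustration of that Bayesian idea, the sketch below assumes (this is not in the original answer) a normal prior on the difference $\delta = \mu_1 - \mu_2$ and a normally distributed observed difference $\bar{x}_1 - \bar{x}_2$ with known standard deviation $\sigma_{\delta\bar{x}}$. Under those assumptions the posterior for $\delta$ is again normal, and $P(\mu_1 > \mu_2 \mid \text{data})$ can be computed directly. The function and parameter names are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def posterior_prob_positive(diff_observed, sd_diff, prior_mean, prior_sd):
    """P(delta > 0 | data) for a normal prior on delta = mu_1 - mu_2 and a
    normal likelihood for the observed difference of sample means."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / sd_diff ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * diff_observed)
    return std_normal_cdf(post_mean / math.sqrt(post_var))


# Observed difference of 2.38 standard deviations, as in the answer's example.
vague = posterior_prob_positive(2.38, 1.0, prior_mean=0.0, prior_sd=1000.0)
skeptical = posterior_prob_positive(2.38, 1.0, prior_mean=0.0, prior_sd=0.5)
print(vague, skeptical)
```

With a very vague prior the result approaches $\Phi(2.38)$, which is the number the question's derivation produces; a prior concentrated near zero pulls the probability down. The answer depends on a prior chosen before the experiment, which is the point made in the paragraph above.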
