How do I know if a Binomial model is appropriate?

Baardegem3Gw


2022-11-24

I have a question about the number of weeks out of 5 in which an event occurs. I have a frequency table for a sample of 40, with x = 0, 1, 2, 3, 4, 5 and frequencies 2, 7, 11, 12, 6, 2.
I have worked out the unbiased estimate of the population mean, but I'm not sure whether a binomial model is what I need or not. I have to decide if a binomial model is appropriate.
I can see that the data are discrete, but they are not binary like "event happens" or "event does not happen". The distribution seems relatively symmetrical, almost normal. I'm not really sure how to work this out. Is a binomial model right or not?

Answer & Explanation

Henry Arellano

Beginner, 2022-11-25, Added 12 answers

Step 1
If this is your first chi-squared test, the clues in the comments may be a bit too sparse. Without working the problem for you, I offer the following more complete outline: (Use it along with whatever examples your text or class notes may have to offer.)
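It is appropriate to try a binomial model, and obviously $n = 5$. From the given data you can find the sample mean of the 40 observations; since a binomial mean is $np$, dividing the sample mean by 5 gives the estimate $\hat{p} = 0.495$.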
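A minimal sketch of this step in R, assuming the frequency table from the question. The variable names (`x`, `freq`, `n_obs`, `xbar`, `p_hat`) are illustrative and do not come from the original post:

```r
# Frequency table from the question: x = weeks (out of 5) with the event
x    <- 0:5
freq <- c(2, 7, 11, 12, 6, 2)

n_obs <- sum(freq)               # sample size
xbar  <- sum(x * freq) / n_obs   # sample mean from the grouped data
p_hat <- xbar / 5                # binomial mean is n * p with n = 5

xbar
p_hat
```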
By looking at the PDF of $\mathrm{Binom}(5, 0.495)$, you can find the expected counts $E_i$ (multiply the probabilities by 40). Your observed counts are $F = (2, 7, 11, 12, 6, 2)$.
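The expected counts could be obtained in R along these lines. This is only a sketch: `dbinom` is the base R binomial probability function, while the names `probs` and `expected` are made up for illustration:

```r
freq  <- c(2, 7, 11, 12, 6, 2)   # observed counts F from the question
p_hat <- 0.495                   # estimate of p used in the answer

# Binomial probabilities for x = 0, ..., 5 with n = 5
probs <- dbinom(0:5, size = 5, prob = p_hat)

# Expected counts E_i: probabilities scaled by the sample size (40)
expected <- sum(freq) * probs
round(expected, 2)
```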
Step 2
Next, you can find the chi-squared statistic $Q = \sum_{i=0}^{5} \frac{(F_i - E_i)^2}{E_i}$, which is approximately distributed as $\mathrm{Chisq}(\nu = 4)$. [Ordinarily, a chi-squared test with 6 categories would have $\nu = 6 - 1 = 5$, but you have used the data to estimate the parameter $p$, so you 'lose' a degree of freedom for that and $\nu = 4$.]
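As a hedged sketch, the statistic can be computed directly from its definition in R. The names `expected` and `Q` are illustrative. The built-in `chisq.test` is deliberately not used for the p-value here, because with 6 categories it would assume 5 degrees of freedom rather than the 4 that apply once $p$ has been estimated from the data:

```r
freq     <- c(2, 7, 11, 12, 6, 2)                        # observed counts F_i
expected <- sum(freq) * dbinom(0:5, size = 5, prob = 0.495)  # expected counts E_i

# Chi-squared goodness-of-fit statistic: sum of (F_i - E_i)^2 / E_i
Q <- sum((freq - expected)^2 / expected)
Q
```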
I got $Q = 1.1815$. The critical value for a chi-squared test with $\nu = 4$ at the 5% level is the 95th percentile $c = 9.4877$ of $\mathrm{Chisq}(\nu = 4)$. You can find this number in printed tables of the chi-squared distribution or using software (as with R below).
qchisq(.95, 4)
[1] 9.487729
This means that you would reject the null hypothesis that the data are consistent with $\mathrm{Binom}(n = 5, p = 0.495)$ only if $Q \ge c = 9.4877$. Here $Q = 1.1815$ is well below $c$, so the binomial model is not rejected.
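The comparison can also be expressed as a p-value. The sketch below assumes the value $Q = 1.1815$ reported in the answer and uses base R's `pchisq` for the upper tail; the names `crit`, `p_value`, and `reject` are illustrative:

```r
Q  <- 1.1815   # statistic reported in the answer
df <- 4        # 6 categories - 1 - 1 estimated parameter

crit    <- qchisq(0.95, df)                    # 5% critical value
p_value <- pchisq(Q, df, lower.tail = FALSE)   # P(chi-squared with df = 4 >= Q)
reject  <- Q >= crit                           # TRUE would mean rejecting the model

crit
p_value
reject
```

A p-value above 0.05 corresponds to `reject` being `FALSE`, which is the same decision as comparing `Q` with the critical value.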
There is one remaining difficulty. The chi-squared test is usually deemed to be accurate only if all expected counts exceed 5. Your first and last expected counts are too small. One cure for this is to combine 'categories' 0 and 1, and 'categories' 4 and 5. In each tail, combine categories by adding the two observed frequencies and adding the two expected frequencies.
You will now have four categories and $\nu = 4 - 1 - 1 = 2$ degrees of freedom. Re-compute $Q$ and find the new $c$ (as below). [According to my computations, you will still not reject $H_0$.]
qchisq(.95, 2)
[1] 5.991465
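One way to carry out the pooling and re-computation in R is sketched below. The grouping vector `grp` and the names `obs_c`, `exp_c`, `Q_c`, and `df_c` are assumptions made for illustration, not part of the original answer:

```r
freq     <- c(2, 7, 11, 12, 6, 2)
expected <- sum(freq) * dbinom(0:5, size = 5, prob = 0.495)

# Pool categories {0,1} and {4,5}; categories 2 and 3 stay as they are
grp   <- c(1, 1, 2, 3, 4, 4)
obs_c <- as.vector(tapply(freq, grp, sum))       # pooled observed counts
exp_c <- as.vector(tapply(expected, grp, sum))   # pooled expected counts

all(exp_c > 5)                                   # check the rule of thumb

Q_c  <- sum((obs_c - exp_c)^2 / exp_c)           # re-computed statistic
df_c <- length(obs_c) - 1 - 1                    # 4 categories, 1 estimated parameter

Q_c
Q_c >= qchisq(0.95, df_c)                        # TRUE would mean rejecting H0
```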
