Explain what chi-squared test of independence is. What

Janiyah Hays

Answered question

2022-04-05

Explain what the chi-squared test of independence is. What is the underlying assumption? What is the test statistic and what is its asymptotic distribution? How is the hypothesis formulated and what are the expected outcomes? What is the criterion for rejecting the null hypothesis? How is the p-value evaluated? Provide a hypothetical formulation of a test of independence.

Answer & Explanation

aznluck4u72x4

Beginner, 2022-04-06, added 16 answers

Explanation:
Suppose data are collected on two categorical variables from the same set of individuals. Each categorical variable has at least two levels, so there are several combinations of the levels of the two variables.
The frequency of occurrence of each combination of levels is recorded for the subjects in the study.
The chi-squared test of independence uses a structured hypothesis-testing procedure to determine whether the two variables have a statistically significant association or are independent of each other.
Assumptions:
The primary assumptions and conditions that must be satisfied in order to conduct a chi-squared test of independence are as follows:
1. The sample must be collected using random methods, to ensure that the observations are all independent of one another and there is no unnecessary bias.
2. The variables of interest, the association between which is to be tested, must be categorical, each with at least two categories.
3. The expected count for each cell representing a combination of the levels of the categorical variables of interest must be at least 5.
Test statistic and asymptotic distribution:
Suppose one of the categorical variables has r levels, and the other one has c levels. Then, there are a total of rc combinations of the levels of the two variables.
Suppose the categories of the first variable are recorded along the rows, so that there are r rows of observations, and the categories of the second variable are recorded along the columns, so that there are c columns of observations. As a result, the frequencies are stored in an r × c contingency table.
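To make the table layout concrete, here is a minimal Python sketch that tallies paired observations into an r × c table of frequencies. The function name `contingency_table` and the sample data are hypothetical, chosen only for illustration; the level lists fix the order of the rows and columns.

```python
from collections import Counter

def contingency_table(pairs, row_levels, col_levels):
    """Count how often each (row level, column level) pair occurs.

    Returns a list of r rows, each holding c observed frequencies.
    Combinations that never occur get a count of 0.
    """
    counts = Counter(pairs)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]

# Hypothetical data: one (group, answer) pair per individual.
pairs = [
    ("A", "yes"), ("A", "no"), ("B", "yes"),
    ("A", "yes"), ("B", "no"), ("B", "no"),
]
table = contingency_table(pairs, ["A", "B"], ["yes", "no"])
print(table)  # rows are groups A and B, columns are "yes" and "no"
```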
Consider the observation in the ith row and jth column of the table, that is, in the cell (i, j) for i = 1, 2,..., r, and j = 1, 2,..., c.
Denote the observed frequency in the cell (i, j) by Oij.
If the two variables are independent, the expected frequency of the cell (i, j), that is, the cell for the combination of the ith level of the first variable and the jth level of the second variable, is Eij = [(total of row i) ∙ (total of column j)] / (grand total).
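The expected-frequency formula can be sketched in a few lines of Python. The 2 × 3 table of observed counts below is hypothetical, and `expected_counts` is an illustrative name, not a library function. The last lines also check the expected-count condition listed among the assumptions (every expected count at least 5).

```python
def expected_counts(observed):
    """Expected frequency of each cell under independence:
    (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

# Hypothetical observed frequencies (2 rows, 3 columns).
observed = [[20, 30, 50],
            [30, 30, 40]]
expected = expected_counts(observed)

# Condition check: is any expected count below 5?
has_small_cell = any(e < 5 for row in expected for e in row)
print(expected, has_small_cell)
```

For this table both row totals are 100, the column totals are 50, 60 and 90, and the grand total is 200, so every row has expected counts 25, 30 and 45.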
Now, the formula for the chi square test statistic for the test of independence is:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where Oij is the observed frequency in cell (i, j);
Eij is the expected frequency in cell (i, j).
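As a rough illustration, the statistic can be computed directly from a table of observed counts. This is a plain-Python sketch on a hypothetical 2 × 3 table; `chi_squared_statistic` is an illustrative name, and the code assumes no row or column total is zero (otherwise an expected count would be 0 and the division would fail).

```python
def chi_squared_statistic(observed):
    """Sum of (O_ij - E_ij)^2 / E_ij over all cells, where
    E_ij = (row i total * column j total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / grand_total
            statistic += (o_ij - e_ij) ** 2 / e_ij
    return statistic

# Hypothetical observed frequencies (2 rows, 3 columns).
observed = [[20, 30, 50],
            [30, 30, 40]]
statistic = chi_squared_statistic(observed)
print(round(statistic, 4))  # 28/9, about 3.1111
```

Here the expected counts are 25, 30 and 45 in each row, so the four non-zero terms are 25/25, 25/45, 25/25 and 25/45, which add up to 28/9.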
The degrees of freedom for the test statistic would be, df = (r – 1) * (c – 1).
If the null hypothesis is true, that is, if the categories are truly independent, then the asymptotic distribution of the test statistic for the chi-squared test of independence would be a χ2 distribution with (r – 1) * (c – 1) degrees of freedom.
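Given this asymptotic distribution, the p-value is the upper-tail probability of the χ2 distribution with (r – 1) * (c – 1) degrees of freedom at the observed value of the statistic, and the null hypothesis of independence is rejected at significance level α when the p-value is below α. The sketch below evaluates that tail probability with only the Python standard library. `chi2_sf` is a hypothetical helper, not a library function, and the statistic value 3.11 and α = 0.05 are made-up inputs for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with
    integer df >= 1, using the closed forms for df = 1 and df = 2 and the
    recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)               # df = 2 case
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # df = 1 case
    while a < df / 2.0:
        q += math.exp(-h) * h ** a / math.gamma(a + 1.0)
        a += 1.0
    return q

# Hypothetical inputs: a 2 x 3 table and an illustrative statistic value.
r, c = 2, 3
df = (r - 1) * (c - 1)
statistic = 3.11
alpha = 0.05

p_value = chi2_sf(statistic, df)
reject_null = p_value < alpha
print(df, round(p_value, 3), reject_null)
```

With these inputs the p-value is well above 0.05, so independence would not be rejected. In practice a statistics library would normally be used for the tail probability; the hand-rolled version is shown only to keep the example self-contained.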
