Inference for Categorical Data (KA/SAP/U14InferenceFor)
Unit 14: Inference for Categorical Data (Chi-Square Tests)
Topics Covered:
- Chi-square goodness-of-fit tests
- Chi-square tests for relationships
1. Chi-square Goodness-of-Fit Tests
The chi-square goodness-of-fit test is used to determine whether a set of observed categorical data matches an expected distribution. It answers questions like: "Does a die appear to be fair?" or "Are the colors in a bag of candies distributed as claimed by the manufacturer?"
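- Null hypothesis (H0): The population distribution of the variable matches the claimed distribution, so any gap between observed and expected counts reflects sampling variability only.
- Alternative hypothesis (Ha): The population distribution differs from the claimed distribution in at least one category.
- Test statistic: \( \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \), where \(O_i\) is the observed count and \(E_i = n p_i\) is the expected count for category \(i\) (the sample size \(n\) times the proportion \(p_i\) claimed under H0).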
- Degrees of freedom: Number of categories minus 1.
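As a rough illustration of the goodness-of-fit calculation, here is a minimal Python sketch for the fair-die question. The roll counts are made up for the example, and the helper name `chi_square_gof` is hypothetical. It computes only the statistic and the degrees of freedom; the comparison value 11.07 is the standard 5% critical value of the chi-square distribution with 5 degrees of freedom.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test.

    observed       -- list of observed counts, one per category
    expected_props -- list of proportions claimed under H0 (should sum to 1)
    """
    n = sum(observed)
    # Expected count for each category is n times its claimed proportion.
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df


# Hypothetical data: 60 rolls of a die, counts for faces 1 through 6.
observed_rolls = [8, 12, 9, 11, 10, 10]
fair_die = [1 / 6] * 6

gof_stat, gof_df = chi_square_gof(observed_rolls, fair_die)
print(f"chi-square = {gof_stat:.3f}, df = {gof_df}")  # chi-square = 1.000, df = 5

# 11.07 is the 5% critical value for df = 5; a statistic below it
# gives no evidence against the die being fair at that level.
print("reject H0 at 5%:", gof_stat > 11.07)
```

With these made-up counts the statistic is far below the critical value, so the data are consistent with a fair die.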
2. Chi-square Tests for Relationships (Independence/Association)
The chi-square test for independence (or association) is used to determine if there is a relationship between two categorical variables in a population. For example: "Is there an association between gender and voting preference?"
- Null hypothesis (H0): The variables are independent (no association).
- Alternative hypothesis (Ha): The variables are associated (not independent).
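- Test statistic: \( \chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \), where \(O_{ij}\) is the observed count and \(E_{ij}\) is the expected count for cell \((i,j)\). Under H0, \( E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}} \).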
- Degrees of freedom: (number of rows - 1) × (number of columns - 1).
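The independence test can be sketched the same way. The example below is a minimal Python version that assumes each expected count is (row total × column total) / grand total, which is the usual choice under H0. The 2×2 table is invented for illustration, and `chi_square_independence` is a hypothetical helper name. The value 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table.

    table -- list of rows, each a list of observed counts
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical data: rows are two groups of voters,
# columns are preference for candidate X or candidate Y.
votes = [
    [30, 20],
    [20, 30],
]

ind_stat, ind_df = chi_square_independence(votes)
print(f"chi-square = {ind_stat:.3f}, df = {ind_df}")  # chi-square = 4.000, df = 1

# 3.841 is the 5% critical value for df = 1.
print("reject H0 at 5%:", ind_stat > 3.841)
```

Here every expected count is 25, each cell contributes 1 to the sum, and the statistic of 4 exceeds 3.841, so these invented counts would suggest an association at the 5% level.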
Visualizations
- Bar charts for observed vs. expected counts
- Interactive contingency tables
- Dynamic calculation of chi-square statistics