
**Introduction** **Chi-square and ANOVA Tests**

In this blog, we discuss two hypothesis-testing techniques: the Chi-square test and the ANOVA test. The treatment here is mainly theoretical.

The Chi-square test is a statistical procedure used by researchers to examine differences between categorical variables in the same population. Read more about the ANOVA Test (Analysis of Variance).

**P-value**

The p-value is used to decide whether to reject or fail to reject the null hypothesis. If the p-value is lower than the pre-determined significance level (i.e. alpha, or threshold value), we reject the null hypothesis. Alpha should always be set before an experiment to avoid bias.

For example, suppose the population data follow a normal distribution. A common choice of alpha is then approximately 0.05 (i.e. a 95% confidence level). This means that if our p-value is less than 0.05, we reject the null hypothesis; otherwise we fail to reject it.
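The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not a full testing procedure; the function name `decide` and the returned strings are assumptions made for this example.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject the null hypothesis when p < alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Alpha is fixed before the experiment; the p-value comes from the test.
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.03))              # p below the default alpha of 0.05
print(decide(0.20))              # p above the default alpha of 0.05
print(decide(0.03, alpha=0.01))  # a stricter alpha changes the decision
```

Note that the same p-value can lead to different decisions under different alpha values, which is why alpha should be chosen in advance.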

**Chi-Square**

The Chi-square test is a statistical method commonly used for testing a relationship between categorical variables. In statistics, there are two types of variables: numerical (quantitative) variables and non-numerical (categorical) variables. The null hypothesis of the Chi-square test is that no relationship exists between the categorical variables in the population, i.e. they are independent. The Chi-square test can also be used to determine whether observed frequencies are significantly different from expected frequencies.

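The Chi-square statistic is calculated as:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where,

O = observed score

E = expected score

A low value for chi-square means the observed data are close to the expected data, i.e. the two sets of data agree closely.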
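As a rough sketch, the statistic (the sum over categories of (O − E)² / E) can be computed directly in Python. The function name `chi_square_statistic` and the sample counts below are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed counts close to the expected counts give a small statistic.
close_fit = chi_square_statistic([48, 52], [50, 50])

# Observed counts far from the expected counts give a large statistic.
poor_fit = chi_square_statistic([20, 80], [50, 50])

print(close_fit, poor_fit)
```

The first call yields a value near zero, while the second is much larger, reflecting how far the observed counts sit from the expected ones.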

The hypotheses being tested for chi-square are:

**Null:** Variable A and Variable B are independent.

**Alternate:** Variable A and Variable B are not independent.

**Types of Chi-square**

There are two types of chi-square tests. Both use the chi-square statistic and distribution, but for different purposes.

**Chi-square goodness of fit test**

It determines whether sample data match the distribution expected for a population.
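A goodness of fit test can be sketched as follows, assuming exactly three categories. With three categories there are 2 degrees of freedom, and only in that case does the p-value have the simple closed form exp(−χ²/2) used here. The function name `goodness_of_fit_3` and the counts are hypothetical; for general cases a library routine such as `scipy.stats.chisquare` would be the usual choice.

```python
import math


def goodness_of_fit_3(observed, proportions):
    """Chi-square goodness of fit for exactly three categories (df = 2)."""
    if len(observed) != 3 or len(proportions) != 3:
        raise ValueError("this sketch only handles three categories")
    total = sum(observed)
    # Expected counts if the sample followed the population proportions.
    expected = [total * p for p in proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 2 degrees of
    # freedom; this closed form is NOT valid for other degrees of freedom.
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Made-up sample of 100 observations vs. assumed population shares.
stat, p_value = goodness_of_fit_3([50, 30, 20], [0.4, 0.4, 0.2])
print(f"chi-square = {stat:.3f}, p-value = {p_value:.4f}")
```

In this made-up example the p-value is above 0.05, so at that alpha we would fail to reject the hypothesis that the sample matches the population proportions.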

**Chi-square test for independence**

It compares two variables in a contingency table to check whether they are related. In a more general sense, it tests whether distributions of categorical variables differ from each other.
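For a 2x2 contingency table, the test for independence can be sketched like this. The expected count for each cell is (row total × column total) / grand total, and with 1 degree of freedom the p-value can be written with the complementary error function. The function name and the tables are illustrative assumptions, and no continuity correction is applied, so results may differ from `scipy.stats.chi2_contingency`, which applies Yates' correction to 2x2 tables by default.

```python
import math


def chi_square_independence_2x2(table):
    """Chi-square test for independence on a 2x2 table (df = 1)."""
    (a, b), (c, d) = table
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of
    # freedom; valid only for a 2x2 table.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Made-up counts where the rows have clearly different column patterns.
related_stat, related_p = chi_square_independence_2x2([[30, 10], [20, 40]])

# Made-up counts where both rows have the same proportions.
indep_stat, indep_p = chi_square_independence_2x2([[10, 20], [20, 40]])

print(related_stat, related_p)
print(indep_stat, indep_p)
```

The first table gives a large statistic and a very small p-value, while the second gives a statistic of zero because the observed counts equal the counts expected under independence.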

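A very small chi-square test statistic means that your observed data fit your expected data well. In other words, the observed and expected data agree; in a test for independence, this means there is no evidence of a relationship between the two variables.

A very large chi-square test statistic means that the observed data do not fit the expected data well. In other words, the observed and expected data disagree; in a test for independence, this is evidence that the two variables are related. Read more about Beginner’s Guide to Statistics in Machine Learning and SciPy.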