Hypothesis Testing
A hypothesis is a claim or assumption made about a problem statement. Hypothesis testing evaluates two mutually exclusive statements (the two hypotheses H0 and H1) about a population using a sample of data.
Eg: if x changes then y will change. If an experiment you conduct gives a certain result, you may need to repeat the same experiment on different samples to support the claim. If the repeated experiments don't give the same result, the first result was likely due to chance and is not useful.
There are 2 types of hypothesis:
- Null Hypothesis (H0): The previous value and the observed value from the claim are the same.
The null hypothesis treats everything as the same or equal. It states that there is no relationship between the two variables being studied (one variable does not affect the other), so any observed difference is due to chance and is not significant in terms of supporting the idea being investigated. Thus, the null hypothesis assumes that whatever you are trying to prove did not happen.
- Alternative Hypothesis (H1): Mathematically the opposite of the null hypothesis.
The statistical test to use depends on the type of variables involved:
- Chi-squared Test: A chi-squared test requires categorical variables, usually only two, but each may have any number of levels.
- Student's t-test: A t-test requires two variables; one must be categorical and have exactly two levels, and the other must be quantitative and be estimable by a mean. For example, the two groups could be Republicans and Democrats, and the quantitative variable could be age.
- Z Test: In a z-test, the sample is assumed to be normally distributed and the population standard deviation is known.
- ANOVA Test: Short for analysis of variance, it is used to compare multiple (three or more) samples with a single test. It is used when the categorical feature has more than two categories.
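To make the t-test bullet concrete, here is a minimal sketch of a pooled two-sample t statistic using only Python's standard library. The function name `pooled_t_statistic` and the age values are made up for illustration, and the sketch assumes both groups have equal variances. It computes only the statistic and degrees of freedom, not a p-value.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for two independent samples.

    Assumes equal population variances (classic Student's t-test).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance is the sample variance (divides by n - 1)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, dof


# Hypothetical ages for the two levels of the categorical variable
ages_group_a = [34, 45, 52, 61, 38, 47]
ages_group_b = [29, 41, 36, 55, 33, 40]

t_stat, dof = pooled_t_statistic(ages_group_a, ages_group_b)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom, which needs a t table or a statistics library.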
After the test results, you can use the level of significance to decide whether the null hypothesis is rejected or not rejected (often loosely called "accepted").
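Level of Significance (alpha): The probability of rejecting the null hypothesis when it is actually true (a Type I error) that you are willing to tolerate. It is commonly set to 5% or 1%. With alpha = 5%, the null hypothesis is rejected only if a result at least as extreme as the observed one would occur less than 5% of the time when the null hypothesis is true.
Level of Confidence (c): Shows confidence in the data. It will be 95% if alpha is 5%.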
alpha + c = 1
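The p-value is the probability of observing a result at least as extreme as the one in the sample, assuming the null hypothesis is true. If it is < 0.05 (the chosen alpha), we reject the null hypothesis.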
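The decision rule can be sketched in a few lines of Python. This is an illustration only: it assumes the test statistic follows a standard normal distribution (a z-test) and that the test is two-tailed. The helper names `two_tailed_p_value` and `reject_null` are invented for the example.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a |Z| at least as large as |z| under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def reject_null(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value falls below alpha."""
    return p_value < alpha


z = 2.5  # hypothetical test statistic
p = two_tailed_p_value(z)
print(f"p-value = {p:.4f}, reject H0: {reject_null(p)}")
```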
The example below explains the steps:
Blood glucose levels for obese patients have a mean of 100 with a standard deviation of 15. A researcher thinks that a diet high in raw cornstarch will have a positive or negative effect on blood glucose levels. A sample of 30 patients who have tried the raw cornstarch diet has a mean glucose level of 140. Test the hypothesis that the raw cornstarch had an effect.
Step 1: State the null hypothesis: H0: μ = 100
Step 2: State the alternate hypothesis: H1: μ ≠ 100
Step 3: State your alpha level. We’ll use 0.05 for this example. As this is a two-tailed test, split the alpha into two.
0.05/2=0.025
Step 4: Find the z-score associated with your alpha level. You're looking for the area in one tail only. The z-score for a cumulative area of 0.975 (1 - 0.025 = 0.975) is 1.96. As this is a two-tailed test, you would also be considering the left tail (z = -1.96).
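Instead of reading a z-table, the critical value can be computed directly. A minimal sketch using Python's standard library `statistics.NormalDist`, with variable names chosen for illustration:

```python
from statistics import NormalDist

alpha = 0.05
# Two-tailed test: put alpha / 2 in each tail, so look up the 0.975 quantile
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
print(f"Reject H0 if z < {-z_critical:.2f} or z > {z_critical:.2f}")
```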
Step 5: Find the test statistic using this formula:
z = (140 - 100) / (15/√30) ≈ 14.61.
Step 6: If the test statistic from Step 5 is less than -1.96 or greater than 1.96 (the critical values from Step 4), reject the null hypothesis. In this case, it is greater, so you can reject the null.
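The six steps can be reproduced with a short script. This is a sketch of a one-sample, two-tailed z-test using only the standard library. The function name `one_sample_z_test` is an assumption for illustration, and the inputs are the numbers from the example above.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Two-tailed one-sample z-test with a known population standard deviation.

    Returns (z, z_critical, reject_null).
    """
    std_err = pop_sd / math.sqrt(n)                    # standard error of the mean
    z = (sample_mean - pop_mean) / std_err             # Step 5: test statistic
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # Steps 3-4: critical value
    return z, z_critical, abs(z) > z_critical          # Step 6: decision


z, z_critical, reject = one_sample_z_test(
    sample_mean=140, pop_mean=100, pop_sd=15, n=30
)
print(f"z = {z:.2f}, critical = ±{z_critical:.2f}, reject H0: {reject}")
# prints: z = 14.61, critical = ±1.96, reject H0: True
```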
References:
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/