Analysis of Variance test
- Assumptions of the ANOVA F-test are as follows:
- Samples are selected in a random manner from treatment populations.
- All the treatment populations follow Gaussian (Normal) distribution.
- All the treatment populations have the same variance, that is, the variances are homogeneous.
- There is no correlation between the mean and variances of the samples.
- The primary effects are additive in nature, that is, block and treatment effects are additive.
- The residuals follow Gaussian distribution and are independent in nature.
- Null and alternate hypotheses for ANOVA are as follows:
Example: The effect of food type on soya protein level is examined for three separate foods.
Data for the test: Soya protein levels for 1122 samples. The response variable is Soya Protein Level (%), the factor is FOOD type, and the factor has 3 levels (vitamin A, B, and D).
Null hypothesis: H0: Food type has no significant effect on soya protein level, that is, the mean soya protein level is the same for all three food types.
Alternate hypothesis: HA: Food type has a significant effect on soya protein level, that is, at least one food type has a different mean soya protein level.
The Fisher F value is calculated as described in the following table (Meyers, Gamst & Guarino, 2016).
Figure 1: ANOVA Table
If the F value falls in the critical region, which corresponds to a p-value of less than 0.05, the null hypothesis is rejected. Otherwise, the null hypothesis fails to be rejected.
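The quantities that normally fill an ANOVA table (sums of squares, degrees of freedom, mean squares, and F) can be sketched in plain Python as below. This is a minimal illustration of a one-way layout only; the function name `one_way_anova` and the sample values are hypothetical and are not the 1122-sample data set from the example.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    k = len(groups)                          # number of treatment levels
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-treatment sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatment (error) sum of squares: values around their group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical protein levels (%) for three food types (illustrative values only)
food_a = [4.0, 5.0, 6.0]
food_b = [6.0, 7.0, 8.0]
food_d = [8.0, 9.0, 10.0]

table = one_way_anova([food_a, food_b, food_d])
print(table["F"], table["df_between"], table["df_within"])
```

The resulting F is then compared with the critical value of the F distribution on (df_between, df_within) degrees of freedom. If SciPy is available, `scipy.stats.f.sf(F, df_between, df_within)` gives the corresponding p-value.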
- The validity of the assumptions is then cross-checked.
- Independence of the samples is assessed from the design of the study. If repeated measures are present, the samples are potentially not independent.
- To check normality of the sample data, the Shapiro-Wilk test is performed and box plots are constructed.
- The variances of the samples are checked for equality to establish the assumption of variance homogeneity.
Sign test
The non-parametric Sign test is based on the sign that the test assigns to each observation according to whether it lies above or below the hypothesized central value. The test then compares the number of positive signs with the number of negative signs. The Sign test is generally used for data that do not follow a normal distribution.
- Assumptions of the Sign test are as follows:
- The compared groups should be from two different samples
- The number of observations for both the samples should be equal.
- The two samples should have paired data to be compared.
- The differences should be independent of each other.
Example: Psychology scores of 32 university students are compared between the spring and fall seasons. Is there any improvement in scores?
The median of the scores is found for both data sets.
- The null and alternate hypotheses are as follows:
Null hypothesis: H0: There is no difference in median scores for spring and fall seasons.
Alternate hypothesis: HA: The difference in median scores for the two seasons is not equal to zero.
The numbers of positive and negative signs of the differences in Psychology scores between the two seasons are counted. The significance value (p-value) is found by using the Binomial distribution. The null hypothesis is rejected if the p-value is less than 0.05; otherwise, the null hypothesis fails to be rejected.
The Shapiro-Wilk test and box plots of the two sets of data are used to check the nature of the data and the validity of the method.
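The sign counting and the Binomial p-value described above can be sketched as follows. This is a minimal two-sided version using only the standard library; the function name `sign_test` and the eight score pairs are hypothetical and stand in for the 32 students of the example.

```python
from math import comb

def sign_test(before, after):
    """Two-sided sign test for paired samples; returns (n_pos, n_neg, p_value)."""
    diffs = [a - b for a, b in zip(after, before)]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg                 # zero differences (ties) are dropped
    k = min(n_pos, n_neg)
    # Under H0 each sign is + or - with probability 0.5,
    # so the smaller count follows a Binomial(n, 0.5) distribution.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)      # double the tail for a two-sided test
    return n_pos, n_neg, p_value

# Hypothetical paired Psychology scores (illustrative values only)
spring = [60, 72, 55, 68, 80, 64, 70, 58]
fall = [65, 75, 54, 72, 85, 70, 74, 63]

n_pos, n_neg, p = sign_test(spring, fall)
print(n_pos, n_neg, p)
```

With these illustrative values there are 7 positive and 1 negative differences, and the two-sided p-value is 18/256, which is above 0.05, so the null hypothesis would fail to be rejected.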
Wilcoxon test
The Wilcoxon test is nonparametric and is used as an alternative to the two-sample t-test. If the population does not follow the normal distribution, the Wilcoxon rank sum test is very useful for comparing two independent samples. For comparing two samples with repeated measures, the Wilcoxon signed-rank test is used.
- Assumptions of the Wilcoxon test are as follows:
- The observations should be independent, that is, the samples (or the pairs, for paired data) are drawn independently in a random manner.
- Because the Wilcoxon signed-rank test compares a pre and a post situation, its two sets of measurements need to be dependent on each other.
- The dependent variable should ideally be measured on an ordinal or ratio scale.
- The variables under test should be continuous.
Example: Gambling playing intentions are compared through the gambling scores of a control group and an experimental group, where the experimental group is expected to be lured by near-win situations. This is a case for the Wilcoxon test.
The null and alternate hypotheses for the test are as follows:
Null hypothesis: H0: The control and experimental groups have no difference in gambling playing intentions based on average gambling scores.
Alternate hypothesis: HA: There is a significant difference in gambling playing intentions for both the groups.
The test statistic W is calculated as the sum of the signed ranks of the sample data. The z-score is the ratio of W to the standard deviation of W, and the p-value obtained from it decides the result. If the p-value is less than 0.05, the null hypothesis is rejected. Otherwise, the null hypothesis fails to be rejected (Bethea, 2018). In post hoc analysis using the Bonferroni test, the validity of the data is checked.
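The calculation of W and its z-score can be sketched as below for paired data. This is a minimal sketch under stated assumptions: it uses the normal approximation with the standard deviation of W taken as sqrt(n(n+1)(2n+1)/6), applies no tie or continuity correction, and drops zero differences. The function name `wilcoxon_signed_rank` and the scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_signed_rank(x, y):
    """Signed-rank sum W, z-score and two-sided p-value (normal approximation)."""
    diffs = [b - a for a, b in zip(x, y) if b != a]   # zero differences dropped
    n = len(diffs)
    # Rank the absolute differences; tied values share their average rank
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1    # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    # W: each rank carries the sign of its difference
    w = sum(r if d > 0 else -r for r, d in zip(ranks, diffs))
    sigma_w = sqrt(n * (n + 1) * (2 * n + 1) / 6)     # no tie correction
    z = w / sigma_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p_value

# Hypothetical pre and post gambling scores for ten participants
pre = [10, 12, 9, 14, 11, 13, 8, 15, 10, 12]
post = [9, 14, 12, 18, 16, 19, 15, 23, 19, 22]

w, z, p = wilcoxon_signed_rank(pre, post)
print(w, round(z, 3), round(p, 4))
```

For two independent groups, such as the control and experimental groups of the example, the rank sum form of the test would be used instead; the sketch above covers only the signed-rank calculation described in the text.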
References
Bethea, R.M., 2018. Statistical methods for engineers and scientists. Routledge.
Meyers, L.S., Gamst, G. and Guarino, A.J., 2016. Applied multivariate research: Design and interpretation. Sage Publications.