Introduction
Statistical analysis is important for effective decision making. Most organizations and businesses now analyze the data they generate and use the results to set future policy. In this report, we analyze the given data set and then test several claims, or hypotheses, about the variables it contains. We use both descriptive and inferential statistics, supported by graphical analysis where it aids understanding. Descriptive statistics summarize the nature and spread of each variable in the data set, while inferential statistics (hypothesis tests) let us check specific claims about the relationships between variables. The statistical analysis of the given data set follows.
Research Hypotheses
Every research study should state its research hypotheses at the outset. This study examines the following three questions.
Are the marital status of a person and the size of the home town independent of each other?
Are the size of the home town and the age category independent of each other?
Are the size of the home town and the level of education independent of each other?
Discussion and Statistical Analysis
From the frequency distribution for the size of the home town or city, 40 respondents come from towns with a population under 10,000; 190 from towns of 10,000 to 100,000; 246 from cities of 100,000 to 500,000; 396 from cities of 500,000 to 1 million; and 128 from cities of 1 million or more.
The data set contains 1000 respondents: 560 male and 440 female. Of these 1000 persons, 110 are unmarried and 890 are married. Most households consist of two or three people. The largest age group is 35 to 49, with a frequency of 440. The frequency is 20 for the age group 18 to 24, 320 for 25 to 34, 145 for 50 to 64, and 75 for 65 or more.
From the frequency distribution for the level of education, 18 persons have less than a high school education, 74 have a high school diploma, 275 have some college education, 548 have completed a college degree, and 85 have completed a postgraduate degree. The income distribution is roughly bell-shaped across the ordered income brackets, peaking in the middle bracket: 21 persons have an income under $25k, 163 have $25k to $49k, 393 have $50k to $74k, 332 have $75k to $125k, and 91 have $125k or more. More than half of the respondents agree with the statement that they are worried about global warming.
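The frequency distributions reported above can be checked and converted to percentages with a few lines of Python. This is only a minimal sketch: the counts are copied from the text, while the dictionary names and the two helper functions are illustrative choices and not part of the original analysis.

```python
# Counts of respondents as reported in the text above.
town_size = {
    "Under 10k": 40,
    "10k to 100k": 190,
    "100k to 500k": 246,
    "500k to 1 million": 396,
    "1 million or more": 128,
}
age_group = {
    "18 to 24": 20,
    "25 to 34": 320,
    "35 to 49": 440,
    "50 to 64": 145,
    "65 or more": 75,
}
education = {
    "Less than high school": 18,
    "High school diploma": 74,
    "Some college": 275,
    "College degree": 548,
    "Postgraduate degree": 85,
}
income = {
    "Under $25k": 21,
    "$25k to $49k": 163,
    "$50k to $74k": 393,
    "$75k to $125k": 332,
    "$125k or more": 91,
}


def relative_frequencies(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


def modal_category(counts):
    """Return the category with the highest frequency."""
    return max(counts, key=counts.get)


tables = {
    "Town size": town_size,
    "Age group": age_group,
    "Education": education,
    "Income": income,
}

for name, counts in tables.items():
    # Each table should account for all 1000 respondents.
    print(f"{name}: total = {sum(counts.values())}, mode = {modal_category(counts)}")
    for category, percent in relative_frequencies(counts).items():
        print(f"  {category}: {percent:.1f}%")
```

Printing the totals is a quick consistency check: each of the four tables should sum to the 1000 respondents in the data set.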
We first test whether the marital status of a person is independent of the size of the home town, using the chi-square test of independence for two categorical variables. The null hypothesis is that the two variables are independent. The p-value for this test is reported as 0.00 (to two decimal places), which is less than the significance level alpha = 0.05. We therefore reject the null hypothesis and conclude that marital status and size of the home town are not independent of each other.
We use the same chi-square test of independence to check whether the age category and the size of the home town are independent. The p-value is again reported as 0.00, a statistically significant result at alpha = 0.05. We reject the null hypothesis of independence; there is sufficient evidence that age category and size of the home town are associated.
Finally, the chi-square test for independence between the size of the home town and the level of education gives a p-value reported as 0.00, which is less than alpha = 0.05. We reject the null hypothesis and conclude that size of the home town and level of education are not independent for the given data.
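The chi-square test of independence used above can be illustrated with a short Python sketch. The report does not give the cross-tabulated cell counts, so the table below is HYPOTHETICAL: only its row totals (110 unmarried, 890 married) and column totals (40, 190, 246, 396, 128) match the reported frequencies, and the resulting statistic is not the one obtained in the original analysis. The function names are illustrative, and the p-value helper relies on a closed form that holds only when the degrees of freedom are even, as they are for a 2 x 5 table.

```python
import math

# HYPOTHETICAL cell counts (marital status by size of home town).
# Only the row and column totals agree with the frequencies in the report;
# the individual cells are made up for illustration.
observed = [
    [10, 30, 25, 30, 15],      # unmarried, row total 110
    [30, 160, 221, 366, 113],  # married, row total 890
]


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = statistic / 2
    term = 1.0
    total = 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


statistic, df = chi_square_independence(observed)
p_value = chi_square_p_value_even_df(statistic, df)
alpha = 0.05

print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis of independence.")
else:
    print("Do not reject the null hypothesis of independence.")
```

The decision rule is the same one applied in the text: a p-value below alpha = 0.05 leads to rejecting the null hypothesis of independence. With the real cross-tabulations in place of the hypothetical table, the same two functions would apply to any table whose degrees of freedom are even; other table shapes would need a general chi-square distribution function from a statistics library.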
Conclusions
At the 0.05 level of significance, we conclude that marital status and size of the home town are not independent of each other.
There is sufficient evidence that age category and size of the home town are not independent of each other.
There is sufficient evidence that size of the home town and level of education are not independent for the given data.