For this project, you must either find published, existing data or collect your own. Possible sources include almanacs, magazines, journal articles, textbooks, web resources, athletic teams, newspapers (including the sports pages), reference materials, campus organizations, professors with experimental data, and electronic data repositories. You may also collect your own data from fellow students, neighbours, or friends. The dataset you select must have at least 25 cases. It must also have at least two categorical variables and at least two quantitative variables. Choose or collect a dataset that interests you!
See the description below of the analysis that should be included. Use technology to automate calculations.
Write Your Report
Cut and paste all relevant computer output alongside your analysis. Be sure to include both the computer output and your discussion of that output in every case. As you discuss each analysis, be sure to interpret what you are finding in the context of your particular data situation. Include all of the following.
How did you find or collect your data? (If you found the data, give a clear reference. If you collected the data, describe clearly the data collection process you used.) What are the cases? What are the variables? What population do you believe the sample might generalize to? Is the sample data from an experiment or an observational study? Include a copy of the dataset.
• Analysis of One Quantitative Variable: For at least one of the quantitative variables, include summary statistics (mean, standard deviation, five-number summary) and at least one graphical display. Are there any outliers? Is the distribution symmetric, skewed, or some other shape?
• Analysis of One Categorical Variable: For at least one of the categorical variables, include a frequency table and a relative frequency table.
• Analysis of One Relationship between Two Categorical Variables: Conduct a chi-square test for association between two of your categorical variables. State the hypotheses of the test. Conduct the test, showing all details such as expected counts, the contribution of each cell to the chi-square statistic, the degrees of freedom used, and the p-value. State a clear conclusion in context. If the results are significant, which cells contribute the most to the chi-square statistic? For these cells, are the observed counts greater than or less than expected? Whether or not the results are significant, describe the relationship as if you were writing an article for your campus paper. If the results are significant, can we infer a causal relationship between the variables?
• Analysis of One Relationship between a Categorical Variable and a Quantitative Variable: Include a side-by-side histogram and describe it. Does there appear to be an association between the two variables? If so, describe it. Also, use some summary statistics to compare the groups.
• Analysis of One Relationship between Two Quantitative Variables: For at least one pair of quantitative variables, include a scatterplot and discuss it.
• Conclusion: Briefly summarize the most interesting features of your data.
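As an illustration of the one-quantitative-variable analysis above, the following Python sketch computes the mean, standard deviation, five-number summary, and a list of possible outliers. It is only a sketch under assumptions: the `hours` values are made up, the function name `summarize` is hypothetical, and the 1.5 × IQR rule for flagging outliers is one common convention that the assignment itself does not prescribe. Quartile definitions also differ between software packages, so your own tool may report slightly different quartiles.

```python
import statistics

def summarize(values):
    """Return mean, sample SD, five-number summary, and 1.5*IQR outliers."""
    data = sorted(values)
    # "inclusive" treats the data as including the min and max of the
    # population; other quartile conventions give slightly different results.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),  # sample standard deviation (n - 1)
        "five_number": (data[0], q1, median, q3, data[-1]),
        # Assumed rule: flag values beyond 1.5 * IQR from the quartiles
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }

# Hypothetical data: weekly study hours for 10 students (not real data)
hours = [2, 3, 3, 4, 4, 5, 5, 6, 7, 20]
summary = summarize(hours)
for name, value in summary.items():
    print(name, value)
```

In a report, output like this would sit next to a histogram or boxplot, followed by a sentence or two on shape and outliers in the context of the data.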
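The details requested for the chi-square test of association (expected counts, each cell's contribution, degrees of freedom, and p-value) can be sketched in Python as follows. This is a minimal illustration, not a required method: the two-way table `observed` is invented, and the helper names `chi2_sf` and `chi_square_test` are hypothetical. The p-value comes from the standard closed-form upper-tail formulas for a chi-square distribution with integer degrees of freedom; statistical software will normally report all of these values for you.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of correction terms
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    total = math.erfc(math.sqrt(x / 2))
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

def chi_square_test(observed):
    """Return expected counts, cell contributions, statistic, df, p-value."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = (row total * column total) / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Contribution of each cell = (observed - expected)^2 / expected
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    statistic = sum(sum(row) for row in contributions)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, contributions, statistic, df, chi2_sf(statistic, df)

# Hypothetical 2x2 table of counts (rows and columns are made-up categories)
observed = [[10, 20], [20, 10]]
expected, contributions, statistic, df, p_value = chi_square_test(observed)
print("expected:", expected)
print("contributions:", contributions)
print("chi-square:", round(statistic, 3), "df:", df, "p-value:", round(p_value, 4))
```

Comparing the entries of `contributions` shows which cells drive the statistic, and comparing `observed` with `expected` shows whether each of those counts is higher or lower than expected.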