The purpose of this assignment is to use the statistical techniques and Excel skills learned in this course to analyze a data set and create a report summarizing the results.

Several data sets on a variety of subjects are available on Blackboard for you to choose from. Alternatively, you may choose one of your own on a topic that interests you from other courses, research, work, or personal interest. (There is plenty of COVID-19 data readily available right now.) If you use a different data set, you must first make sure that you have permission to use the data, then check with me to make sure it is appropriate for the assignment. You should also be aware that you may not submit an assignment for credit to multiple courses, so if you choose to analyze data you have used in another course, you must make sure that this assignment is substantially different from what you did in the other course. Again, check with me before proceeding.

Once you have selected your data, you should begin planning how you are going to analyze it. Think of questions you might ask of your data, such as:
• Is there a relationship between variables?
• Is there a difference between groups?
• Are there patterns over space or time?

Some of the data sets contain several variables. You do not need to analyze all of them. Just choose the ones that you want to focus on.

One way to approach this assignment is to go through the course outline and consider how each topic we have covered can be applied in the context of your data. For example:
• What kind of data do you have (spatial or non-spatial, level of measurement)? Is it a sample or a census? How many values do you have to analyze, and what kinds of statistics are appropriate? If you have population data, could you create a random sample to analyze? Or multiple samples? Could you separate it into groups and test for differences?
• What kinds of tables and graphs would be best to visualize the data?
• What descriptive statistics will you calculate?
• Do you need to know the distribution of your data (normal or other)? If so, how will you determine that?
• Are there any outliers, and if so, how will you deal with them?
• Could you do a correlation/regression analysis?
• Would calculating Z scores and determining probabilities be useful?
• Are confidence intervals needed?
• What kind of hypothesis test is most appropriate (test for a mean, test for differences, chi-squared)?
• Do you have spatial data? Are there special techniques that could or should be applied? How will you visualize your data? Can you identify spatial distribution patterns or calculate geographic centres?

While you should consider which of the above are possible, you should only apply the techniques that are necessary. Not every technique is meaningful for every data set!

Your analysis must include:
• descriptive statistics,
• one or more graphs,
• at least one hypothesis test.

Beyond that, you should choose any other analyses that you think are appropriate. Do not attempt to cover all bases and do one of everything; part of the exercise is knowing which statistics are appropriate or necessary.

You could think of this project as though you are working in your future career, and your boss hands you a data set and asks you to figure out what the data show. You may conduct many analyses, but if they don’t show anything useful, do not include them in your final report. (For example, you wouldn’t need both a pie chart and a bar chart of the same data, nor do you need to show frequencies, relative frequencies, and percentage frequencies on a single graph; only one is needed to show the patterns.)

Please note, your goal is not to find some earth-shattering result or new insight. It is quite possible you will find no correlation, weak relationships, or “statistically insignificant” differences. Such findings are equally valid and important! You will be marked on how you conducted your analysis, not on how exciting the results may or may not be.
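
For orientation only, two of the required components named above (descriptive statistics and a hypothesis test) can be sketched outside Excel. The Python sketch below is an illustration, not part of the assignment, which is to be completed in Excel. The two groups of values are made up, and the function names `describe` and `welch_t` are hypothetical helpers introduced for this example. It computes basic descriptive statistics and the test statistic for a two-sample test for a difference in means that does not assume equal variances (Welch's t-test). In Excel, the comparable tools would be functions such as AVERAGE, MEDIAN, STDEV.S, and T.TEST.

```python
import math
import statistics

# Hypothetical data: two made-up groups of measurements,
# not taken from any of the course data sets.
group_a = [12.1, 13.4, 11.8, 14.2, 12.9, 13.1, 12.5, 13.8]
group_b = [14.0, 15.2, 13.9, 16.1, 14.8, 15.5, 14.3, 15.0]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for
    Welch's two-sample t-test, which does not assume equal variances."""
    var_a = statistics.variance(a) / len(a)  # squared standard error, group a
    var_b = statistics.variance(b) / len(b)  # squared standard error, group b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


if __name__ == "__main__":
    print("Group A:", describe(group_a))
    print("Group B:", describe(group_b))
    t_stat, dof = welch_t(group_a, group_b)
    print("Welch t statistic:", round(t_stat, 3), "df:", round(dof, 1))
```

The sketch stops at the test statistic and degrees of freedom. Turning these into a p-value requires the t distribution, which is not in the Python standard library. Whichever tool is used, the report should state the hypotheses, the test chosen, and the conclusion, whether or not the difference turns out to be significant.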
In terms of work required, the analysis part of this assignment should be similar to one or two of the longer assignments you have been doing all semester. It is helpful to explore multiple possible analyses to see if they reveal anything, but ultimately you will only need to keep the few most significant analyses. What those are depends on your data and what you find. The difference from the weekly assignments is that you will need to put work into figuring out how to analyze your data, and then prepare a report that summarizes your main findings.

The report will be in the form of a PowerPoint presentation. Although you will not be presenting it to the class, you should create it as if you were preparing for a professional presentation of your results (to your boss or co-workers, for example). Aim for approximately 15-20 slides, but this can vary. You can record narration in your presentation if you choose, but it is not necessary. Make sure you provide enough information so the person viewing your report can understand the main points, but do not overwhelm them with excess text or unnecessary details. Visuals such as tables and graphs, with very brief descriptions highlighting the main aspects of each, are very useful!

Your report must include:
• an introduction (a statement of what you set out to determine in your analysis, or the questions you asked)
• a short description of the data (what is it, who collected it, when, how, why…). You may not be able to determine all of these answers, and that is fine. Just be honest about what you know about the data.
• clear and concise communication of your main findings with tables, graphs, etc. Use point form to summarize the main results. You just need to communicate what analysis you did and what you found; there is no need for long paragraphs of text.
• a brief discussion of the limitations of your analysis. Were any assumptions violated? Were there known data errors or missing data?
• a conclusion

You must also submit your Excel workbook showing all your calculations and analyses. Anyone viewing your Excel file should be able to understand what you did, so your Excel workbook needs to be well organized, clearly labelled, and contain only the necessary work. You may wish to work with a ‘rough’ and a ‘good’ copy. Use multiple worksheets in the file to clearly identify and separate the components of your analysis.
• Name all worksheets with descriptive titles (e.g. Descriptive Stats, Regression Analysis, Hypothesis Tests).
• Use tables with clear headings for calculations. Avoid doing unidentified calculations in random locations.
• Graphs need to have titles, axis titles, labels, units, legends, etc.