This coursework will count for 10% of the total marks for the course, and half of the total assessed coursework marks for the course.
Learning outcomes assessed
Be able to perform open-ended exploratory analysis of data, and master the analytical presentation and critical evaluation of the results of statistical analyses.
Be able to demonstrate practical experience of using standard graph visualisation methods and evaluation of results.
Be able to critically assess and evaluate a visualisation.
Part 1. Diamonds dataset
In assignment 1, you investigated the diamonds dataset. Using appropriate visualisations of the diamonds dataset, how could you find out whether diamonds that weigh slightly more than one carat are significantly more valuable than diamonds that weigh just under one carat? Show your visualisations, with appropriate explanation, to support your answer.
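As a starting point only, the comparison can be sketched in Python as below. This is a minimal sketch, not a model answer: it assumes the diamonds data has already been loaded as (carat, price) pairs, the function names are hypothetical, and the window of 0.1 carat either side of one carat is an arbitrary choice that you should justify or vary. Diamonds of exactly one carat are placed in the "over" group here, which is also an assumption. Plotting uses matplotlib, which is imported only when the plotting function is called.

```python
from statistics import median


def split_around_one_carat(rows, width=0.1):
    """Split (carat, price) pairs into prices just under and just over one carat.

    `width` is a hypothetical window size (in carats), not part of the brief.
    Diamonds of exactly 1.0 carat are counted in the "over" group (an assumption).
    """
    under = [price for carat, price in rows if 1.0 - width <= carat < 1.0]
    over = [price for carat, price in rows if 1.0 <= carat <= 1.0 + width]
    return under, over


def price_summary_around_one_carat(rows, width=0.1):
    """Return the count and median price for each of the two groups."""
    under, over = split_around_one_carat(rows, width)
    return {
        "under": {"n": len(under), "median": median(under) if under else None},
        "over": {"n": len(over), "median": median(over) if over else None},
    }


def plot_prices_around_one_carat(rows, width=0.1):
    """Draw side-by-side box plots of price for the two groups."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    under, over = split_around_one_carat(rows, width)
    fig, ax = plt.subplots()
    ax.boxplot([under, over])
    ax.set_xticks([1, 2])
    ax.set_xticklabels(["Just under 1 carat", "1 carat or just over"])
    ax.set_ylabel("Price")
    ax.set_title("Diamond prices either side of one carat")
    return fig
```

A box plot is only one option. Price also depends on cut, colour and clarity, so a comparison of the two groups is more convincing if it is repeated within levels of those variables.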
Part 2
Download the dataset “athlete_events.csv” from the course Moodle page. This is a historical dataset on the modern Olympic Games, including all the Games from Athens 1896 to Rio 2016. Each row is an athlete-event. The ID column can be used to uniquely identify athletes, since some athletes have the same name. NOC stands for National Olympic Committee.
On the basis of data from the Olympic Games, create an informative graph that includes the following elements:
Visualise the relationship between height and weight for both men and women.
You can make use of the full dataset or you can make a selection (e.g., focussing on only one year/country/sport).
Explain in no more than 250 words (half a side of A4) what the graph is showing, and what unique insights it delivers. Also reflect on what it fails to show, or what you would have liked to include in the graph but were not able to.
Create appropriate axis-labels and titles. Appropriate breaks/limits/labels are encouraged.
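One possible way to prepare and plot the data is sketched below in Python. It is a rough sketch under stated assumptions, not a required approach: the column names Sex, Height and Weight, the use of "NA" for missing values, and the units (centimetres and kilograms) are assumptions about the file that you should check, and the function names are hypothetical. Because each row is an athlete-event, the sketch keeps one point per athlete by using the ID column.

```python
import csv
import io


def height_weight_by_sex(csv_file):
    """Return {sex: [(height, weight), ...]} with at most one point per athlete ID.

    Assumes columns named ID, Sex, Height and Weight, with missing values
    written as "NA" or left empty (assumptions about athlete_events.csv).
    """
    seen_ids = set()
    points = {}
    for row in csv.DictReader(csv_file):
        if row["ID"] in seen_ids:
            continue  # this athlete already contributed a point
        if row["Height"] in ("", "NA") or row["Weight"] in ("", "NA"):
            continue  # skip rows with a missing measurement
        seen_ids.add(row["ID"])
        pair = (float(row["Height"]), float(row["Weight"]))
        points.setdefault(row["Sex"], []).append(pair)
    return points


def plot_height_weight(points):
    """Scatter plot of weight against height, one colour per sex."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    for sex, pairs in sorted(points.items()):
        heights = [h for h, _ in pairs]
        weights = [w for _, w in pairs]
        ax.scatter(heights, weights, s=8, alpha=0.3, label=sex)
    ax.set_xlabel("Height (cm)")  # units are an assumption
    ax.set_ylabel("Weight (kg)")  # units are an assumption
    ax.set_title("Height and weight of Olympic athletes by sex")
    ax.legend(title="Sex")
    return fig


# Tiny made-up sample used only to illustrate the expected file layout.
SAMPLE = io.StringIO(
    "ID,Name,Sex,Height,Weight\n"
    "1,A,M,180,80\n"
    "1,A,M,180,80\n"
    "2,B,F,NA,55\n"
    "3,C,F,165,60\n"
    "2,B,F,170,55\n"
)
sample_points = height_weight_by_sex(SAMPLE)
```

With the real file, the same function can be given an open file handle, for example one returned by open("athlete_events.csv", newline="", encoding="utf-8"). With many thousands of athletes the points overlap heavily, so a low alpha, a selection of the data, or separate panels for men and women may make the relationship easier to read.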
Part 3
This is a conceptual question. Roughly one page of A4 please, definitely not more than 2 pages, but write as carefully and precisely as you can. (This is to give you practice in answering exam-style questions.) The aim of the question is to get you thinking about the difficulties of analysing observational data (the database described below is observational), and also some of the difficulties of carrying out experiments in practice.
There are ten coffee shops, each with its own manager. You – as the central owner – set a ‘menu’ of 30 different items that managers can sell: on each day, each manager chooses which ingredients to order and which items the staff in that shop will make. At any time, some of the staff in the shop are selling to customers, others are working the coffee machines, and others are making the food or clearing up.
Each manager selects which items from the ‘menu’ to sell in his or her shop on any particular day (the staff in the shop make the sandwiches, and they can only make a limited number of different sandwiches on each day).
The shops are in different types of location (stations, streets, shopping malls) and each manager selects the items that they think will sell best in their location, and on the particular day.
a) Summarising the total sales by server shows that some servers sell far more than others. Should you get rid of the servers who don’t sell much, or do you need more information? Explain carefully.
b) You suspect that some servers persuade customers to buy extra items, and so generate you more profits. Describe how you might investigate this from the data.
c) You believe that some managers are better than others, as some of the shops are much more profitable than others, and some managers change their menus more often than others. Should you sack your less profitable managers, or do you need more information? Explain carefully.
d) Could you devise an experiment to assess which managers are more competent? Describe how you might do this. Do you think the experiment is feasible?
e) Some types of sandwich have higher sales than others. Do you think it is reasonable to eliminate the sandwiches that sell least in total? Can you devise an experiment to find out whether different sandwiches sell at different rates? Would the results of this experiment directly tell you whether to stop selling some types of sandwich? (Hint: this requires careful thought!)
Part 4
Find an example of a visualisation on the web which is intended to persuade you of something, or which is intended to be part of an argument. (That is, don’t choose visualisations that are really works of art with no particular purpose.)
As before, please include the following in your submission:
An image of the visualisation;
A link to where you found the visualisation;
A short (less than half a side of A4) discussion of what the visualisation is intended to persuade people of, whether you think it succeeds, and why.
Marks will be given for:
Choice of substantial or unusual visualisation and correct description of the visualisation.