CS5250 Data Visualisation and Exploratory Analysis Project
Question:
This project will count for 10% of the total marks for the course CS5250. The project will offer you an opportunity to carry out an in-depth exploratory analysis of an interesting dataset, providing insights into the data that would otherwise be hidden or non-obvious.
Learning outcomes assessed
· Perform open-ended exploratory analysis of data, and master the analytical presentation and critical evaluation of the results of statistical analyses.
· Be able to demonstrate practical experience of using standard graph visualisation methods and evaluation of results.
· Be able to critically assess and evaluate a visualisation.
There are TWO parts in the project: 10 marks in total - 8 marks for Part 1 and 2 marks for Part 2. The project will be evaluated based on:
· Significance/Novelty. Is the analysis “real” and “interesting”, or just a “toy” analysis? How original, important and well defined are the questions posed? Is the analysis likely to be useful and/or have impact?
· Technical quality. Are the exploratory data analysis approach and methods appropriate and well described? Are sufficient details provided? Are the data visualisation methods creative and interesting? Is the interpretation (discussion and conclusion) well balanced and supported by the data?
· Organisation. Is the report well organised? Is the write-up clear and the language adequate? Are results presented in the most appropriate manner?
You should
· think about what story (or stories) you would like to tell;
· pose two significant questions that you would like to answer;
· assess the fitness of the data for answering your questions;
· produce at least 2 static visualisations for addressing your questions.
Create a presentation visualisation (or more than one if absolutely needed) that makes an argument based on the given dataset. That is, choose some data or information you are interested in (it need not be much), and construct a presentation visualisation that would fit on one PowerPoint slide and that makes an argument of your choosing.
Your task is to pre-process the dataset as necessary using R, design at least 2 static visualisations that you believe effectively communicate the data, and provide a short write-up (no more than 4 paragraphs) describing your design. While you must use the dataset given, note that you are free to filter, transform and augment the data as you see fit to highlight the elements that you think are most important in the dataset.
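The workflow described above (pre-process in R, save the result, produce two static visualisations) could be sketched roughly as follows. This is only an illustration under stated assumptions: the built-in mtcars dataset stands in for the course dataset, the chosen columns and the file names (clean_data.csv, plot1_scatter.png, plot2_boxplot.png) are hypothetical, and base R graphics are used only to avoid extra packages. Your own filtering, transformations and plot types should follow from the questions you pose.

```r
# ASSUMPTION: mtcars is a placeholder for the dataset provided for the project.
raw <- mtcars

# Pre-process: keep the columns of interest, treat cylinder count as a
# category, and drop any incomplete rows.
clean <- raw[, c("mpg", "wt", "cyl")]
clean$cyl <- factor(clean$cyl)
clean <- clean[complete.cases(clean), ]

# Save the pre-processed data as CSV (hypothetical file name; a temporary
# directory is used here so the sketch does not write into your project).
out_dir <- tempdir()
csv_path <- file.path(out_dir, "clean_data.csv")
write.csv(clean, csv_path, row.names = FALSE)

# Static visualisation 1: scatter plot of fuel economy against weight,
# coloured by number of cylinders.
scatter_path <- file.path(out_dir, "plot1_scatter.png")
png(scatter_path, width = 800, height = 600)
plot(clean$wt, clean$mpg,
     col = as.integer(clean$cyl), pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy against weight")
legend("topright", legend = levels(clean$cyl),
       col = seq_along(levels(clean$cyl)), pch = 19, title = "Cylinders")
dev.off()

# Static visualisation 2: distribution of fuel economy per cylinder group.
box_path <- file.path(out_dir, "plot2_boxplot.png")
png(box_path, width = 800, height = 600)
boxplot(mpg ~ cyl, data = clean,
        xlab = "Cylinders", ylab = "Miles per gallon",
        main = "Fuel economy by number of cylinders")
dev.off()
```

Writing each plot to a file with png() and dev.off() keeps the figures reproducible from the script, so the same code that you submit regenerates the images placed in the report.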
The argument should be in a paragraph separate from the slide. You should put your slide in your report.
Submission
Please submit the following 3 parts separately using the course Moodle submission link:
· Dataset after pre-processing in CSV format;
· R code used for the project;
· Project report in PDF format.