The major aims of this module are threefold:
You will be required to use specific tools to achieve this:
In addition, you will complete a set of quizzes that test your knowledge of the background reading, and you will evaluate your group members’ contribution to the final deliverable.
Caution: in case of ambiguity in either the problem statement or assessment criteria, the interpretation of the assignment tutor will apply. As such, if any statement is unclear, seek clarification rather than make assumptions, as your assumptions are likely to differ from the tutor’s interpretation.
Deliverables
Part A–Research Infrastructure and Process
(b) Project Plan and Trello board (due 2020-11-13)
I have several Trello accounts that I use for different purposes; you need to invite j.noll@herts.ac.uk, which is the one our marking scripts use to download your board.
in your Trello Backlog column, one card per task.
Caution: test this URL by having one of your group members who is not the board owner use it to access the board. If we can’t access the board, you get zero (0) credit.
(c) Git repository.
(a) Update the Trello board frequently (at least every week) with progress.
(b) Update the artifacts in your Git repository regularly (weekly at least).
The Trello board and Git repository will be checked randomly throughout the term. Marks will depend on regular, meaningful activity.
Part B–Research Question and Answer
IMPORTANT! You must use the exact names for files specified below. Assignments will be marked automatically by simple Unix scripts, so if you don’t name your files in the way the script expects, you will get zero (0) credit for that assignment.
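Because file names are matched exactly, it may help to run a quick self-check before each deadline. The sketch below is not the tutor's marking script, whose details are not given here; the function name check_names is hypothetical, and the file names are the ones specified in this part.

```shell
#!/bin/sh
# Self-check sketch (an assumption, NOT the actual marking script).
# Usage: check_names [repo_dir]
# Prints one line per missing item; returns non-zero if anything is missing.
check_names() {
    repo="${1:-.}"
    missing=0

    # R scripts that must be named exactly as specified in Part B.
    for f in visualization.R analysis.R; do
        if [ ! -f "$repo/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done

    # The research question file may use any one of three extensions.
    found=0
    for ext in txt md tex; do
        if [ -f "$repo/research_questions.$ext" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "missing: research_questions.txt, .md, or .tex"
        missing=1
    fi

    return "$missing"
}
```

Running check_names from the root of your repository prints nothing when all expected names are present. Names are compared exactly as typed, so check capitalisation as well as spelling.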
(a) Choose a dataset from www.kaggle.com. Be sure your choice allows you to ask an interesting question that can be answered via correlation analysis, comparison of means, or comparison of proportions.
(b) Commit and push your dataset (in CSV format) to GitHub.
(c) Formulate one research question that can be answered using correlation analysis, comparison of means, or comparison of proportions.
(d) Specify the null and alternative hypotheses for your research question.
(e) Write your research question, null hypothesis, and alternative hypothesis, using correctly-spelled, correctly-punctuated, grammatically correct English, in a plain text, Markdown, or LaTeX file (NO Microsoft Word!) called “research_questions.txt” (for plain text), “research_questions.md” (for Markdown), or “research_questions.tex” (for LaTeX).
(f) Commit and push your file to BitBucket by.
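As an illustration of item (d), the conventional two-sided hypotheses for the three permitted kinds of analysis can be written as below. The symbols are generic assumptions, not taken from any particular dataset: \rho is the population correlation between two variables, \mu_1 and \mu_2 are the population means of two groups, and p_1 and p_2 are the population proportions of two groups. Your own hypotheses must name the actual variables in your dataset.

```latex
% Correlation analysis
H_0 : \rho = 0 \qquad H_1 : \rho \neq 0

% Comparison of means
H_0 : \mu_1 = \mu_2 \qquad H_1 : \mu_1 \neq \mu_2

% Comparison of proportions
H_0 : p_1 = p_2 \qquad H_1 : p_1 \neq p_2
```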
(a) Create an R script called “visualization.R” that will load your dataset, create an appropriate visualization of your data, and output the result in a file called “visualization.pdf”.
(b) Commit and push your “visualization.R” file to GitHub by 23:59 on 2020-12-11. Do NOT commit “visualization.pdf”; your R script will create it for us.
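One way to avoid committing the generated PDF by accident is to list it in .gitignore. The following is a minimal sketch, not a required procedure; the function name ignore_generated, the remote name "origin", and the commit message are assumptions.

```shell
#!/bin/sh
# Sketch: keep the generated PDF out of the repository.
# Usage: ignore_generated [repo_dir]
# Adds visualization.pdf to .gitignore unless it is already listed.
ignore_generated() {
    repo="${1:-.}"
    # -x matches the whole line; -F treats the name as a literal string.
    if ! grep -qxF "visualization.pdf" "$repo/.gitignore" 2>/dev/null; then
        echo "visualization.pdf" >> "$repo/.gitignore"
    fi
}

# Typical sequence, run from the repository root
# (remote name and commit message are assumptions):
#   ignore_generated .
#   git add .gitignore visualization.R
#   git commit -m "Add visualization script"
#   git push origin HEAD
```

Calling the function more than once is harmless, because the entry is only appended when it is not already present.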
(a) Create an R script called “analysis.R” that computes appropriate statistics to test your hypotheses. Commit and push this file to GitHub by 23:59 on 2021-01-08.
(b) Write a report, in correctly-spelled, correctly-punctuated, grammatically correct English, using Markdown, LaTeX, or Microsoft Word (Word is OK for this deliverable), comprising the following sections:
Approaching expectations, but some mistakes or omissions.
Examples: Solutions identify the main concepts. Writing may use colloquialisms, but is understandable and mostly free of grammar, punctuation, and spelling errors. References (when needed) are not cited correctly. Diagrams have correct syntax, are readable, and identify the main concepts or interactions. Project plan is updated sporadically.
Marginal fail: Some correct performance, emerging understanding, but mastery not thorough and there are numerous mistakes or omissions. Examples: Solutions in general are missing important elements, and/or have errors. Diagrams have errors and omissions, but show some understanding of the core concepts. Writing lacks focus, uses colloquialisms, is repetitive, and/or contains numerous grammatical, spelling, and punctuation errors. References are missing or not cited correctly. Project plan is not maintained in a way that appears to be useful.