The purpose of this assignment is to foster your ability to critically evaluate organizational approaches to data collection, management, analysis and communication, and to apply the data analytic techniques introduced in the module in order to generate actionable insights within your organization. The report you submit should be 2,500 words in length. You may consult any sources (e.g. books, research papers, reports), but please make sure you provide proper references and adhere to the university’s academic integrity, authorship and plagiarism guidelines (you may find them here).
Identify a problem of your choice within your organization. It is important that the problem you choose would make for a good data analytics application, i.e. one that you can see would be tackled more successfully using data analytic techniques. Describe the problem as completely and precisely as possible, using your own experience and understanding and referring to the concepts introduced in class and/or the book. Make sure you explain why this is an important problem and what benefits the solution will bring to your organization. Identify your objective and the business decision you are facing. Then, describe the data that you will use to support your decision. Make sure you mention where the data comes from (i.e. from within the organization or from outside, whether it is proprietary or publicly available, etc.).
Describe the characteristics of the dataset, the variable types and what information you can get from the features.
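As an illustration of what describing a dataset can involve, the following minimal sketch summarises each variable's type, its number of missing values and one basic characteristic. The records, the column names (`tenure_months`, `plan`, `monthly_spend`, `churned`) and the helper `describe_dataset` are all hypothetical, invented for this example. Your own data will differ.

```python
from collections import Counter

# Hypothetical toy records; None marks a missing value.
records = [
    {"tenure_months": 12, "plan": "basic",   "monthly_spend": 29.9, "churned": False},
    {"tenure_months": 3,  "plan": "premium", "monthly_spend": None, "churned": True},
    {"tenure_months": 40, "plan": "basic",   "monthly_spend": 31.5, "churned": False},
    {"tenure_months": 7,  "plan": "basic",   "monthly_spend": 18.0, "churned": True},
]

def describe_dataset(rows):
    """Return, per column, the variable type, missing count and a simple summary."""
    summary = {}
    for col in rows[0]:
        values = [r[col] for r in rows if r[col] is not None]
        missing = len(rows) - len(values)
        # Booleans are treated as categorical, not numeric.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            kind = "numeric"
            detail = {"min": min(values), "max": max(values)}
        else:
            kind = "categorical"
            detail = {"most_common": Counter(values).most_common(1)[0][0]}
        summary[col] = {"type": kind, "missing": missing, **detail}
    return summary

summary = describe_dataset(records)
for col, info in summary.items():
    print(col, info)
```

A table of this kind, accompanied by a sentence on what each feature tells you about the problem, is one possible way to present the dataset in the report.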
In the context of the data mining process discussed in class, explain, step by step, how you would develop a data-driven decision-making solution to this problem. At each step, explain the tasks you have to perform and the techniques and models you would use. For every modelling task, identify your target variable and explain why it is appropriate for your objective and how its characteristics lead you to consider the particular model(s). Explain how you would perform model selection and evaluation and how you would try to address any potential problems you identify at this stage.
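As an illustration of the model selection and evaluation step, the following minimal sketch compares two candidate models by k-fold cross-validated accuracy and selects the better one. Everything in it is an assumption made for the example: the synthetic data, the single feature (monthly spend), the binary target (churned), and both candidate models. It uses only the Python standard library. A real report would substitute its own target variable, features, models and evaluation metric.

```python
import random
from statistics import mean

def make_toy_data(n=200, seed=0):
    """Generate hypothetical (spend, churned) pairs with 10% label noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        spend = rng.uniform(0, 100)
        churned = int(spend < 40)      # assumed relationship, for illustration only
        if rng.random() < 0.1:         # flip the label 10% of the time
            churned = 1 - churned
        rows.append((spend, churned))
    return rows

def fit_majority(train):
    """Baseline: always predict the most common class in the training data."""
    majority = int(mean(y for _, y in train) >= 0.5)
    return lambda x: majority

def fit_threshold(train):
    """Candidate: predict churn below the cut-off that maximises training accuracy."""
    cutoffs = sorted({x for x, _ in train})
    best = max(cutoffs, key=lambda t: sum(int(x < t) == y for x, y in train))
    return lambda x: int(x < best)

def cross_val_accuracy(fit, rows, k=5):
    """Average accuracy over k folds; each fold is held out once for testing."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = fit(train)
        scores.append(mean(int(model(x) == y) for x, y in test))
    return mean(scores)

rows = make_toy_data()
results = {
    "majority_baseline": cross_val_accuracy(fit_majority, rows),
    "threshold_rule": cross_val_accuracy(fit_threshold, rows),
}
best_model = max(results, key=results.get)
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
print("selected:", best_model)
```

Comparing every candidate against a trivial baseline on held-out data is one way to check that a model has learned something useful. Accuracy is used here only for simplicity, and a different metric may suit your business decision better.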
Finally, describe the deployment stage of your application, e.g. how you expect your solution to work and what you would do if it does not perform as expected.
Make sure that in your report you include answers to the following questions:
- What exactly is the business decision you want to support with this solution?
- Why did you select this as a good data-driven decision-making problem?
- How and where would you get the data?
- Explain precisely why and how you expect your solution will add
- Explain all of the steps towards the deployment of your
- What is your target variable?
- What type of data analytic task(s) do you need to perform?
- What are the features? Provide a list of at least 5 features that you think (a) you can get and (b) you think might be
- What exactly would be your training data?
- How will you evaluate model performance?
- Describe the deployment