CMI3507 Data Mining

Question:

Requesting a Late Submission

You are reminded to ‘back-up’ your work, as late submission requests will not be granted for lost work, including work lost due to hardware or software failure.

Late submission requests will only be approved if you can demonstrate genuine, unexpected circumstances along with independent supporting evidence (e.g. medical certificate) that may prevent you submitting an assessment on time.

Submit your request for Late Submission via the University website within 2 working days of the due date.

Late submission requests of up to a maximum of 10 working days, but typically 1-5 working days, will be considered provided that there is appropriate evidence which clearly indicates the reasons for the request.

You will have 5 working days after submitting a request to provide the evidence. Failure to submit evidence will result in the request being rejected and your work being marked as a late submission (see below).

If you are unable to submit work within the maximum late submission period of 10 days, contact the School’s Guidance Team, as you may need to submit a claim for Extenuating Circumstances (ECs).

Extenuating Circumstances (ECs)

An EC claim is appropriate in exceptional circumstances, when an extension is not sufficient due to the nature of the request, or when it concerns an examination or In-Class Test (ICT).

You can access on the Registry website; where you can also find out more about the process.

You will need to submit independent, verifiable evidence for your claim to be considered.

Once your EC claim has been reviewed you will get an EC outcome email from Registry. If you are unsure what it means or what you need to do next, please speak to the

An approved EC will extend the submission date to the next assessment period (e.g. the July resit period).

Late Submission (No ECs approved)

Late submission of up to 5 working days after the assessment submission deadline will result in your grade being capped at the pass mark.

Assignment Title

Data mining is a collection of tools, methods and statistical techniques for exploring and extracting meaningful information from large data sets. It is a rapidly growing field due to the increasing quantity of data gathered by organisations. This module looks at different data mining techniques and gives students the chance to use appropriate data-mining tools in order to evaluate the quality of the discovered knowledge.

2. Learning Outcomes:

This assessment consists of a contribution to an evaluative report (worth 90% of your total marks).

Learners will be able to justify and critically discuss the key concepts of data mining (including legal implications such as GDPR) and the breadth of areas of application.

The learner will be able to make appropriate modifications to large datasets to prepare the data for analysis and exploration; select appropriate data mining techniques to explore large data sets; and interpret and evaluate the results of the analysis to draw conclusions and make informed decisions.

Learning outcomes covered in this coursework are as follows:

1. Knowledge of the underlying principles of general data analytic modelling, the study of relationships in data, and visualization, together with potential knowledge of related domains such as statistics and machine learning.

2. Ability to select and apply appropriate data analytics techniques for problem solving.

3. Ability to explain and describe clearly all aspects of the reasoning.

4. Assessment Brief

Individual piece of work. Demonstrate comprehensive knowledge and critical understanding of the use of data analysis to create a solution to a given problem, which implies using data mining both for interpretation and for the conclusions that follow from it.

Evaluative report. You must choose one of the following tasks:

1) Student performance dataset (available on Brightspace and the UCI repository).

Q1. Study the dataset: find its size and the number of variables, and describe the variable types. Check whether any data are missing (if so, apply an appropriate cleaning technique). Perform a descriptive statistical analysis of the dataset: choose a range of variables that interest you and examine their frequencies and dependencies through bar plots, grouped bar plots, pie charts, etc. Draw conclusions.

Advanced: Perform a factor analysis. Comment on your findings.

Q2. Split the dataset into training and testing parts. Build a Random Forest regression model (using the randomForest R library) to predict the final-year grade (G3). Evaluate your model using the test dataset.

Plot an importance graph. Estimate accuracy. Comment on your results.
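The brief requires R and the randomForest library for this question. As a language-neutral illustration of the split-and-evaluate step only, here is a minimal Python sketch using the standard library. The grades are made-up toy values, the function names are hypothetical, and a mean predictor stands in for the Random Forest model.

```python
import math
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy final grades, NOT taken from the real dataset.
g3 = [10, 12, 8, 15, 11, 14, 9, 13, 16, 7]
train, test = train_test_split(g3, test_fraction=0.3)

# Stand-in "model": always predict the mean of the training grades.
baseline = sum(train) / len(train)
test_rmse = rmse(test, [baseline] * len(test))
print(f"train={len(train)} test={len(test)} RMSE={test_rmse:.2f}")
```

The error is computed only on rows the model never saw during fitting. A fitted regression model would be judged against the same held-out rows, and a baseline like this gives a reference point for whether it adds anything.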

Advanced: Divide the students into 3 categories based on the final grade: poor-achieving, average-achieving and well-achieving students. Build a Random Forest classification model. Evaluate your model using the test dataset. Print the confusion matrix. Draw conclusions.
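To make the categorisation and confusion-matrix steps concrete, the following Python sketch (standard library only, not the R code the brief asks for) bins grades into three labels and tabulates actual against predicted labels. The cut-offs of 10 and 15 are assumptions, since the brief does not specify them, and the predicted labels are invented for illustration.

```python
from collections import Counter


def grade_category(g3, low=10, high=15):
    """Map a numeric final grade to one of three labels.

    The cut-offs are assumptions; the brief does not specify them.
    """
    if g3 < low:
        return "poor"
    if g3 < high:
        return "average"
    return "well"


def confusion_matrix(actual, predicted, labels):
    """Return matrix[actual_label][predicted_label] = count."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


labels = ["poor", "average", "well"]
actual = [grade_category(g) for g in [6, 9, 11, 14, 16, 18]]
# Hypothetical classifier output, invented for this example.
predicted = ["poor", "average", "average", "average", "well", "average"]

matrix = confusion_matrix(actual, predicted, labels)
for row_label in labels:
    print(row_label, matrix[row_label])
print("accuracy:", accuracy(actual, predicted))
```

Rows are the true categories and columns the predicted ones, so off-diagonal counts show which categories the classifier confuses.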

Recommended reading: Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. Available from: doi: 10.1023/A:1010933404324.

2) Heart failure clinical records dataset (available on Brightspace and the UCI repository).

Q1. Study the dataset: find its size and the number of variables, and describe the variable types. Check whether any data are missing (if so, apply an appropriate cleaning technique). Perform a descriptive statistical analysis of the dataset: choose a range of variables that interest you and examine their frequencies and dependencies through bar plots, grouped bar plots, pie charts, etc. Draw conclusions.

Advanced: Perform a factor analysis. Comment on your findings.

Q2. Split the dataset into training and testing parts. Build a neural network (using the neuralnet R library; start with two hidden layers of sizes 5 and 3) to predict the risk of death (a binary died/alive outcome). Evaluate the model using the test dataset. Print the confusion matrix. Draw conclusions.

Advanced: Experiment with parameters of the neuralnet function. For example, use a different number/different size of hidden layers or different activation functions. Compare your results with the original model. Draw conclusions.

Recommended reading: Riedmiller, M. and Braun, H. (1993). A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm. Proceedings of the IEEE International Conference on Neural Networks, San Francisco, 28 March-1 April 1993, 586-591. Available from: doi: 10.1109/ICNN.1993.298623

3) On-line retail dataset (available on Brightspace and the UCI repository).

Q1. Study the dataset: find its size and the number of variables, and describe the variable types. Check whether any data are missing (if so, apply an appropriate cleaning technique). Perform a simple statistical analysis, for example: What countries do customers come from? What is the range of recorded times? What is the average spending? Is there any difference in spending between customers from different countries? Are there any preferences in meals? Draw conclusions.

Advanced: Perform a pattern mining analysis: either frequent patterns or association rules. Comment on the patterns.
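Association rules are usually judged by support and confidence. The sketch below computes both for a single rule in plain Python, as an illustration of the two measures rather than a full frequent-pattern miner. The baskets and product names are invented toy data.

```python
def support(baskets, itemset):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


# Hypothetical toy baskets, one set of products per transaction.
baskets = [
    {"mug", "teapot"},
    {"mug", "teapot", "tray"},
    {"mug"},
    {"tray"},
]

rule_support = support(baskets, {"mug", "teapot"})
rule_confidence = confidence(baskets, {"mug"}, {"teapot"})
print(f"mug -> teapot: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Support says how common the combination is overall; confidence says how often the consequent appears once the antecedent is already in the basket.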

Q2. Perform clustering on customers and customer baskets. Note that you need to reorganise the dataset so that each row of your data frame contains the products purchased by a single customer. You can include or exclude the information about countries. Comment on the choice of distance and the results.
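The reorganisation step and one possible distance can be sketched as follows, in plain Python rather than R. The transaction layout (customer id, product) and the product names are assumptions for illustration, and Jaccard distance is only one reasonable choice for set-valued baskets, not one the brief prescribes.

```python
from collections import defaultdict


def baskets_by_customer(transactions):
    """Group (customer_id, product) pairs into one product set per customer."""
    baskets = defaultdict(set)
    for customer_id, product in transactions:
        baskets[customer_id].add(product)
    return dict(baskets)


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical long-format transactions: one row per purchased product.
transactions = [(1, "mug"), (1, "teapot"), (2, "mug"), (2, "tray"), (3, "lamp")]

baskets = baskets_by_customer(transactions)
d12 = jaccard_distance(baskets[1], baskets[2])
d13 = jaccard_distance(baskets[1], baskets[3])
print(f"d(1,2)={d12:.3f} d(1,3)={d13:.3f}")
```

Jaccard distance ignores products that neither customer bought, which matters when the catalogue is large and most baskets are small. Euclidean distance on a 0/1 product matrix would instead count those shared absences as similarity.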

Advanced: Experiment with the number of clusters. Study the indexes which evaluate the clustering (such as the silhouette, the elbow method or the Dunn index). You may wish to look at the libraries NbClust or clValid.
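As a rough illustration of what the silhouette index measures, here is a small hand-rolled Python version. It is a sketch under simplifying assumptions (at least two clusters, singleton clusters scored as 0, toy one-dimensional points) and is not a substitute for the NbClust or clValid implementations.

```python
def mean_silhouette(points, labels, dist):
    """Average silhouette width; requires at least two clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def abs_diff(x, y):
    return abs(x - y)


# Hypothetical one-dimensional toy points with two obvious groups.
points = [1.0, 2.0, 10.0, 11.0]
good = mean_silhouette(points, [0, 0, 1, 1], abs_diff)
bad = mean_silhouette(points, [0, 1, 0, 1], abs_diff)
print(f"good={good:.3f} bad={bad:.3f}")
```

Values near 1 indicate compact, well-separated clusters; values near or below 0 indicate points that sit closer to another cluster than to their own. Comparing the mean across different numbers of clusters is one way to choose between them.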

Structure of the evaluation report:

1. Title with the student’s name, the name of the chosen dataset and the corresponding data mining method.

2. Introduction containing a short description of the chosen method.

3. Answers to the stated questions and conclusions.

4. A literature review, which should include a reference to the original method, its extensions and improvements (if applicable) and a few recent applications of the method. You must use APA 6th style for referencing.

5. Appendix, which must include the R commands you have used in your analysis.

All plots, figures and graphs must be numbered and have clear labels.

6. Grading Rubric

Portfolio. This is a written report on the requested analysis carried out on a provided data set (see a set of problems from section 2). In this part learning outcomes 3 and 4 are to be assessed, which include:

1) Presentational skills (Portfolio is well structured; text and diagrams are neat, legible and free from errors).

2) Knowledge of subject (understanding of data mining methodologies; application of data mining methodologies to real-life datasets; use of appropriate statistical software; interpretation of outputs; literature review).

3) Programming skills (R code is neat and free from error).
