The dataset (CreditData.csv) classifies customers as “approved” or “not approved”; this Yes/No label is the target class.
• The target class is in the 21st column and its name is “Approved”.
• Number of Attributes for Classification: 20 (7 numerical, 13 categorical).
• The task should be developed using R (and in RStudio).
Tasks:
1- Divide the data into two datasets:
• 80% as training data
• 20% as test data
Note: Use this link to learn how to divide one dataset into training and test data: https://rpubs.com/ID_Tech/S1
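As a rough illustration of the 80/20 split, here is a minimal sketch in base R. The helper name `split_train_test`, the seed value, and the small stand-in data frame are assumptions made for this example; they are not part of the assignment or of the linked tutorial.

```r
# Hypothetical helper: randomly split a data frame into training and test rows.
split_train_test <- function(data, train_fraction = 0.8, seed = 42) {
  set.seed(seed)  # fixed seed so the split is reproducible
  n <- nrow(data)
  train_idx <- sample(seq_len(n), size = floor(train_fraction * n))
  list(
    train = data[train_idx, , drop = FALSE],
    test  = data[-train_idx, , drop = FALSE]
  )
}

# Stand-in data (NOT the real dataset), used only to show the call.
demo <- data.frame(
  id = 1:100,
  Approved = factor(rep(c("Yes", "No"), times = 50))
)
parts <- split_train_test(demo)
print(nrow(parts$train))
print(nrow(parts$test))

# With the assignment file (assumed to be in the working directory):
# credit <- read.csv("CreditData.csv", stringsAsFactors = TRUE)
# parts  <- split_train_test(credit)
```

Sampling row indices once and using the negative index for the test set guarantees that no row appears in both sets.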
2- Build a classification model based on the training data to predict if a new customer is approved or not.
• You can use Regression or Decision Tree (or both to learn more!).
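One possible way to fit such a model is sketched below, assuming logistic regression via `glm()` as the "Regression" option and the `rpart` package as the decision tree option. The predictor column names (`Duration`, `CreditAmount`, `Status`) and the generated values are stand-ins invented for this sketch; only the target name `Approved` comes from the assignment.

```r
# Stand-in training data (NOT the real dataset); column names other than
# "Approved" are assumptions made for illustration.
set.seed(1)
n <- 200
train <- data.frame(
  Duration     = sample(6:72, n, replace = TRUE),
  CreditAmount = round(runif(n, 500, 15000)),
  Status       = factor(sample(c("A11", "A12", "A13", "A14"), n, replace = TRUE))
)
# Artificial label loosely tied to Duration so the model has a signal to learn.
p_yes <- plogis(2 - 0.06 * train$Duration)
train$Approved <- factor(ifelse(runif(n) < p_yes, "Yes", "No"),
                         levels = c("No", "Yes"))

# Logistic regression: with a factor response, glm() models the probability
# of the second level ("Yes").
fit <- glm(Approved ~ ., data = train, family = binomial)
print(coef(fit))  # intercept and weight of each predictor

# Predicted probability of "Yes", turned into a class with a 0.5 cutoff.
probs <- predict(fit, newdata = train, type = "response")
pred  <- ifelse(probs > 0.5, "Yes", "No")

# Decision tree alternative, run only if the rpart package is installed.
if (requireNamespace("rpart", quietly = TRUE)) {
  tree <- rpart::rpart(Approved ~ ., data = train, method = "class")
  # To draw the tree: plot(tree); text(tree)
}
```

In the actual task, `train` would be the 80% training portion of CreditData.csv, and predictions for reporting would be made on the test portion rather than on the training rows.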
3- Test the model on the test data.
4- Explain the model that you built, create the confusion matrix, and report its accuracy, precision, and recall.
• If you use a decision tree, draw the tree.
• If you use regression, report the parameters and weight values.
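The confusion matrix and the three metrics can be computed in base R along the following lines. This is a sketch: the function name `classification_metrics` and the two short label vectors are made up for the example, and "Yes" is assumed to be the positive class.

```r
# Hypothetical helper: confusion matrix plus accuracy, precision, and recall.
classification_metrics <- function(actual, predicted, positive = "Yes") {
  actual    <- as.character(actual)
  predicted <- as.character(predicted)
  tp <- sum(predicted == positive & actual == positive)  # true positives
  fp <- sum(predicted == positive & actual != positive)  # false positives
  fn <- sum(predicted != positive & actual == positive)  # false negatives
  tn <- sum(predicted != positive & actual != positive)  # true negatives
  list(
    confusion = table(Predicted = predicted, Actual = actual),
    accuracy  = (tp + tn) / length(actual),
    precision = tp / (tp + fp),  # NaN if nothing was predicted positive
    recall    = tp / (tp + fn)   # NaN if there are no actual positives
  )
}

# Made-up labels, only to show the call.
actual    <- c("Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No")
predicted <- c("Yes", "No",  "No",  "Yes", "Yes", "No", "No", "No")
m <- classification_metrics(actual, predicted)
print(m$confusion)
print(c(accuracy = m$accuracy, precision = m$precision, recall = m$recall))
```

In the assignment, `actual` would be the `Approved` column of the test data and `predicted` the classes the model assigns to those same rows.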
Deliverables:
1- Source code (copy the R source code into a .txt file and upload the .txt file to D2L)
• Note: D2L may not let you upload a file with a .R extension
2- The answer to question 4 as a PDF file.
Dataset Description: Here is the attribute description for the dataset:
Attribute 1: (qualitative)
Status of existing checking account
• A11: balance = $0
• A12: balance ≤ $200K
• A13: balance > $200K
• A14: no checking account
Attribute 2: (numerical)
Duration of bank membership in months
Attribute 3: (qualitative)
Credit history
• A30: no credits taken/all credits paid back duly
• A31: all credits at this bank paid back duly
• A32: existing credits paid back duly till now
• A33: delay in paying off in the past
• A34: critical account/other credits existing (not at this bank)
Attribute 4: (qualitative)
Purpose of applying for a loan
• A40: car (new)
• A41: car (used)
• A42: furniture/equipment
• A43: radio/television
• A44: domestic appliances
• A45: repairs
• A46: education
• A47: vacation
Attribute 5: (numerical)
Credit amount
Attribute 6: (qualitative)
Savings account/bonds
• A61: value < $10K
• A62: $10K ≤ value < $50K
• A63: $50K ≤ value < $100K
• A64: value ≥ $100K
• A65: unknown/no savings account
Attribute 7: (qualitative)
Present employment since
• A71: unemployed
• A72: employment period < 1 year
• A73: 1 ≤ employment period < 4 years
• A74: 4 ≤ employment period < 7 years
• A75: employment period ≥ 7 years
Attribute 8: (numerical)
Installment rate in percentage of disposable income
Attribute 9: (qualitative)
Personal status and sex
• A91: male and married/divorced/separated
• A92: female and married/divorced/separated
• A93: male and single
• A94: female and single
Attribute 10: (qualitative)
Other debtors / guarantors
• A101: none
• A102: co-applicant
• A103: guarantor