MKTG4110 Marketing Analytics
Task:
Question 1. Cluster Analysis and Segmentation: calculation
Suppose that we want to segment consumers into groups using survey data on three variables: Product Knowledge Score, Willingness to Pay, and Tendency to Dropout. All three variables are measured on a scale from 1 to 10. When calculating the Euclidean distance, please use the original data (no standardization needed).

Consumer  Product Knowledge Score  Willingness to Pay  Tendency to Dropout
1         3                        10                  2
2         9                        8                   9
3         6                        4                   5
4         4                        6                   7
5         8                        1                   1
a. Please calculate the Euclidean distance between consumers 3 and 5. Please write your answer as an R comment.
b. Suppose that after running a k-means cluster analysis, we find that consumers 1 and 5 belong to cluster one and consumers 2, 3 and 4 belong to cluster two. What is the centroid of each cluster? Please write your answer as an R comment.
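As an illustration of the two calculations in Question 1, the sketch below enters the table above as a matrix, computes the Euclidean distance as the square root of the summed squared differences, and computes each centroid as the column means of the cluster's members. The object and column names (`consumers`, `knowledge`, `wtp`, `dropout`, `euclid`) are made up for this sketch and are not part of the assignment.

```r
# Survey data from the table above (original scale, no standardization).
# Column names are illustrative shorthand for the three survey variables.
consumers <- rbind(
  c(3, 10, 2),
  c(9,  8, 9),
  c(6,  4, 5),
  c(4,  6, 7),
  c(8,  1, 1)
)
dimnames(consumers) <- list(1:5, c("knowledge", "wtp", "dropout"))

# Euclidean distance between consumers i and j:
# square root of the sum of squared differences across the three variables.
euclid <- function(i, j) sqrt(sum((consumers[i, ] - consumers[j, ])^2))
d35 <- euclid(3, 5)   # sqrt((6-8)^2 + (4-1)^2 + (5-1)^2) = sqrt(29), about 5.385

# Cross-check with base R's dist(), which returns all pairwise distances.
d_all <- as.matrix(dist(consumers, method = "euclidean"))

# A centroid is the vector of variable means over the members of a cluster.
centroid1 <- colMeans(consumers[c(1, 5), ])     # (5.5, 5.5, 1.5)
centroid2 <- colMeans(consumers[c(2, 3, 4), ])  # (19/3, 6, 7)
```

The same numbers can then be copied into the R comments that the question asks for.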
Question 2. Cluster Analysis and Segmentation: R application
As a wholesale distributor, we have put together a data set on our clients' spending across various product categories. We plan to segment those customers into groups using the k-means clustering method (please refer to our in-class example for the k-means clustering procedure). For the definition of each variable in the data set, go to: Please download the file named ‘hw2 q2 data.csv’ from Canvas and import it into RStudio.
a. Use the k-means method to cluster the customers (each line of data belongs to one customer) into groups with k = 2 to 10. Please use the following variables for clustering: Fresh, Milk, Grocery, Frozen, Detergent_Paper and Delicatessen (columns 3 to 8; use the original variables for clustering, no standardization needed). Set the seed to 12345.
b. Plot the ratio of the within-cluster sum of squares to the between-cluster sum of squares against the number of clusters (k). Use the elbow criterion to suggest the best number of clusters. For the suggested k, add the clustering results (the cluster each customer belongs to) back to the data set as a new variable. Then create boxplots of each variable by segment.
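The workflow in Question 2 could be sketched roughly as follows. This is not the in-class procedure, only one possible outline using base R's `kmeans()`. Because the homework file is not available here, the sketch runs on randomly generated stand-in data; the placeholder columns `col1` and `col2`, the choice `k_best <- 3`, and resetting the seed before every fit are all assumptions made for illustration.

```r
set.seed(12345)

# Stand-in data (hypothetical). With the real file, use instead:
#   clients <- read.csv("hw2 q2 data.csv")
vars <- c("Fresh", "Milk", "Grocery", "Frozen", "Detergent_Paper", "Delicatessen")
n <- 200
spend <- as.data.frame(
  matrix(rexp(n * 6, rate = 1 / 5000), ncol = 6, dimnames = list(NULL, vars))
)
clients <- data.frame(col1 = seq_len(n), col2 = 1L, spend)  # col1, col2: placeholders

X <- clients[, 3:8]  # clustering variables, original scale

# Part a: fit k-means for k = 2..10 and store the within/between ratio.
ks <- 2:10
ratio <- sapply(ks, function(k) {
  set.seed(12345)  # assumption: reset the seed before each fit
  fit <- kmeans(X, centers = k)
  fit$tot.withinss / fit$betweenss
})

# Part b: plot the ratio against k and look for the elbow.
plot(ks, ratio, type = "b",
     xlab = "Number of clusters (k)",
     ylab = "Within-cluster SS / Between-cluster SS")

# Hypothetical choice; read the actual value off the elbow in your own plot.
k_best <- 3
set.seed(12345)
clients$segment <- factor(kmeans(X, centers = k_best)$cluster)

# One boxplot per clustering variable, split by segment.
op <- par(mfrow = c(2, 3))
for (v in vars) {
  boxplot(clients[[v]] ~ clients$segment, main = v, xlab = "Segment", ylab = v)
}
par(op)
```

`tot.withinss` and `betweenss` are components of the object returned by `kmeans()`, so no manual sum-of-squares calculation is needed.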
Question 3. Conjoint Analysis
As an Internet service provider, we are very interested in our consumers’ preferences and their tradeoffs among different product features. Please download the file named ‘hw2 q3 data.csv’ from Canvas and import it into RStudio. Our products have three attributes: price, speed and cap. Below is the table of the three attributes and their respective levels.

Price  Speed      Cap
$49    25 Mbps    Yes
$99    100 Mbps   No
       1000 Mbps
a. Please use the data you have imported to conduct a conjoint analysis to calculate the part-worths for attribute levels.
b. Calculate the relative importance of the three attributes using the part-worths from part a. What is the total utility of the profile {$49, 100 Mbps, No cap}? What about the profile {$99, 1000 Mbps, with cap}?
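One common way to carry out the steps in Question 3 is a dummy-coded linear regression of ratings on the attribute levels, sketched below. The layout of the homework file is not shown here, so the sketch assumes one row per rated profile with columns `rating`, `price`, `speed` and `cap`; these names and the simulated ratings are hypothetical. It also assumes that total utility includes the regression intercept, a convention that may differ from the one used in class.

```r
set.seed(1)

# Hypothetical stand-in for the homework file: all 2 x 3 x 2 profiles,
# each with a simulated rating. Replace with read.csv() on the real data.
profiles <- expand.grid(
  price = factor(c("$49", "$99"), levels = c("$49", "$99")),
  speed = factor(c("25 Mbps", "100 Mbps", "1000 Mbps"),
                 levels = c("25 Mbps", "100 Mbps", "1000 Mbps")),
  cap   = factor(c("Yes", "No"), levels = c("Yes", "No"))
)
profiles$rating <- with(profiles,
  5 - 2 * (price == "$99") + 1 * (speed == "100 Mbps") +
    2.5 * (speed == "1000 Mbps") + 1 * (cap == "No")) +
  rnorm(nrow(profiles), sd = 0.3)

# Part a: with dummy coding, each coefficient is the part-worth of a level
# relative to the attribute's reference level, whose part-worth is 0.
fit <- lm(rating ~ price + speed + cap, data = profiles)
b <- coef(fit)
pw <- list(
  price = c("$49" = 0, "$99" = unname(b["price$99"])),
  speed = c("25 Mbps" = 0,
            "100 Mbps" = unname(b["speed100 Mbps"]),
            "1000 Mbps" = unname(b["speed1000 Mbps"])),
  cap   = c("Yes" = 0, "No" = unname(b["capNo"]))
)

# Part b: relative importance = range of an attribute's part-worths
# divided by the sum of the ranges over all attributes.
ranges <- sapply(pw, function(x) diff(range(x)))
importance <- ranges / sum(ranges)

# Total utility of a profile = intercept + part-worth of each chosen level.
utility <- function(price, speed, cap) {
  unname(b["(Intercept)"] + pw$price[price] + pw$speed[speed] + pw$cap[cap])
}
u1 <- utility("$49", "100 Mbps", "No")    # {$49, 100 Mbps, No cap}
u2 <- utility("$99", "1000 Mbps", "Yes")  # {$99, 1000 Mbps, with cap}
```

Changing the reference level shifts the individual part-worths and the intercept but leaves the ranges, and therefore the relative importances, unchanged.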