## Overview of the Marketing Data Set

This post discusses the descriptive statistics and visualization of the data set for a general audience.

Data visualization helps a general audience understand the significance of collected data by presenting it in the form of visual aids. It does this by bringing out the patterns, trends and correlations in the data, which exposes what the collected data actually contains. Data visualization tools make the data easier to recognise and interpret.

For this blog, the data set under consideration is the marketing data set of Tully’s coffee. It records the profit earned from the sale of the products, the net sales revenue collected from the products, the total sales, and the cost of goods for producing the products sold. It also covers the marketing cost the company spends on advertising its products and the value of ready goods held in inventory at different locations. In addition, it contains the projected (budget) values of sales, profit, margin and cost of goods. The data is divided by area code and product ID. Markets are classified by size and grouped by location into four regions: East, West, Central and South. The data covers 20 states of the United States of America. Products are classified as Caffeinated or Decaffeinated and by product type.

The following table gives an overview of the data set by listing its columns with a brief description of each.

| Table | Column | Description | Data type |
|---|---|---|---|
| FactTable | Profit | Profit generated (in \$,000) = Margin - Total Expenses | Discrete Numerical |
| FactTable | Margin | Net sales revenue (in \$,000) = Sales - Cost of Goods | Discrete Numerical |
| FactTable | Sales | Operating revenue (in \$,000) earned from selling coffee/tea products | Discrete Numerical |
| FactTable | COGS | Cost of Goods (in \$,000), the direct costs attributable to the production of the goods sold by the company | Discrete Numerical |
| FactTable | Total Expenses | Total costs (in \$,000) associated with managing and operating the business (COGS not included) | Discrete Numerical |
| FactTable | Marketing | Marketing costs (in \$,000) | Discrete Numerical |
| FactTable | Inventory | Total value of products and goods (in \$,000) that are ready for sale | Discrete Numerical |
| FactTable | Budget Profit | Projected profit (in \$,000) | Discrete Numerical |
| FactTable | Budget COGS | Projected cost of goods (in \$,000) | Discrete Numerical |
| FactTable | Budget Margin | Projected margin (in \$,000) | Discrete Numerical |
| FactTable | Budget Sales | Projected sales (in \$,000) | Discrete Numerical |
| FactTable | Area Code | Codes assigned to areas for the reference of the company | Discrete ID Number |
| FactTable | Product ID | Codes assigned to different products for the reference of the company | Discrete ID Number |
| FactTable | Date | Dates for the years 2015 and 2016, the first day of each month | Date type |
| Location | Area Code | Codes assigned to areas for the reference of the company | Discrete ID Number |
| Location | State | States of the United States of America (20 selected) | Location |
| Location | Market | Market region (East, West, Central, South) | Discrete location |
| Location | Market Size | Market size as a factor of population and demand (Major vs. Small) | Discrete Data |
| Product | Product Type | Type of coffee or tea sold (Coffee, Espresso, Tea, Herbal Tea) | Discrete Type |
| Product | Product | Product sub-category | Discrete Type |
| Product | Product Id | Codes assigned to different products for the reference of the company | Discrete ID Number |
| Product | Type | Caffeinated vs. Decaffeinated | Discrete Type |
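The two formulas in the table (Margin = Sales - Cost of Goods, Profit = Margin - Total Expenses) can be checked with a few lines of code. The following is a minimal sketch in plain Python; the row values are invented for illustration and are not taken from the data set, and the function name `derive_margin_and_profit` is hypothetical.

```python
# Hypothetical rows, values in $,000. The numbers are invented for illustration.
rows = [
    {"Sales": 219, "COGS": 89, "Total Expenses": 36},
    {"Sales": 140, "COGS": 83, "Total Expenses": 61},
]

def derive_margin_and_profit(row):
    """Return a copy of row with Margin and Profit computed from the table's formulas."""
    margin = row["Sales"] - row["COGS"]        # Margin = Sales - Cost of Goods
    profit = margin - row["Total Expenses"]    # Profit = Margin - Total Expenses
    return {**row, "Margin": margin, "Profit": profit}

derived = [derive_margin_and_profit(r) for r in rows]
for r in derived:
    print(r)
```

The second row shows how a product can have a positive margin and still end up with a negative profit once total expenses are subtracted.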

Based on the data set, the following plots were selected for analysis in the data visualization tool:

1. Total Expenses Plotted Against Market And State
2. Profit Plotted Against Market And State
3. Sales Plotted Against Market And State
4. Profit Plotted Against Products
5. Total Expenses Plotted Against Products
6. Budget Cost Of Goods Plotted Against Products
7. Budget Margin Plotted Against Products
8. Budget Profit Plotted Against Products
9. Budget Sales Plotted Against Products
10. Budget Cost Of Goods Plotted Against Locations
11. Budget Margin Of Goods Plotted Against Locations
12. Budget Profit Of Goods Plotted Against Locations
13. Budget Sales Of Goods Plotted Against Locations

These are the graphs that were chosen to be plotted in the software. The factors chosen for the graphs help in understanding the revenue collection of the company by comparing the cost behind the production of the goods with the final amount of money collected for each product or location.
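Each of the plots listed above is, in effect, a measure summed over one or more grouping columns. The sketch below shows that aggregation in plain Python, assuming records shaped like the columns in the table; the sample records and their figures are invented, and `total_by` is a hypothetical helper, not part of any visualization tool.

```python
from collections import defaultdict

# Hypothetical sample records. Figures are invented, in $,000.
records = [
    {"Market": "West", "State": "California", "Sales": 300, "Profit": 120, "Total Expenses": 80},
    {"Market": "West", "State": "California", "Sales": 250, "Profit": 90, "Total Expenses": 70},
    {"Market": "East", "State": "New Hampshire", "Sales": 60, "Profit": 10, "Total Expenses": 25},
    {"Market": "South", "State": "New Mexico", "Sales": 75, "Profit": 5, "Total Expenses": 30},
]

def total_by(records, keys, measure):
    """Sum `measure` over all records that share the same values for `keys`."""
    totals = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += rec[measure]
    return dict(totals)

# Totals behind a "Sales Plotted Against Market And State" style chart.
sales_by_market_state = total_by(records, ("Market", "State"), "Sales")
# Totals per market only, for a market-level overview.
expenses_by_market = total_by(records, ("Market",), "Total Expenses")

print(sales_by_market_state)
print(expenses_by_market)
```

Changing the `keys` tuple to a product column and the `measure` to one of the budget columns would give the totals behind the product and location plots in the same way.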

When a data set is used to develop a report, its ethical and legal issues need to be considered so that the data stays safe for the organization from which it was collected. The law that protects the data against hacking and theft is the Data Protection Act 1998. The ethical issues considered in developing this report are as follows:

1. Consent was obtained from the participants before the data was collected. The consent informed them of the extent to which their data would be shared with or viewed by others. Participants confirmed that they were fully willing for their data to be shared and used for the analysis in this report.
2. Any data that participants declined to share during collection was excluded from the analysis.
3. If the data is made available on the internet, it is the responsibility of the analyst to make it as anonymous as possible, so that participants can trust those who are keeping the data safe.
4. Personal data must be collected in accordance with the Data Protection Act 1998. Even if personal data was gathered during the collection procedures, it must be removed during the analysis of the data set.
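Points 3 and 4 above can be sketched in code: personal fields are dropped before analysis and any remaining identifier is replaced with a salted hash. This is only an illustration under stated assumptions. The field names `name`, `email` and `participant_id` are hypothetical, since the source does not list which personal fields were collected, and salted hashing is a form of pseudonymisation rather than a guarantee of full anonymity.

```python
import hashlib

# Hypothetical field names. The source does not say which personal fields exist.
PERSONAL_FIELDS = {"name", "email"}

def anonymise(record, salt, id_field="participant_id"):
    """Drop personal fields and replace the identifier with a salted SHA-256 digest."""
    cleaned = {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}
    if id_field in cleaned:
        raw = (salt + str(cleaned[id_field])).encode("utf-8")
        # Keep a short prefix of the digest as a stable pseudonym.
        cleaned[id_field] = hashlib.sha256(raw).hexdigest()[:12]
    return cleaned

# Invented sample record for illustration only.
sample = {"participant_id": 17, "name": "Jane Doe", "email": "jane@example.com", "State": "Ohio"}
anon = anonymise(sample, salt="replace-with-a-secret")
print(anon)
```

The salt should be kept private and out of the published data; otherwise the original identifiers could be recovered by hashing candidate values.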

## Selected Columns for Data Analysis

The first dashboard consists of three graphs: Total Expenses Plotted against Market and State, Profit Plotted against Market and State, and Sales Plotted against Market and State. They are plotted as a column chart, a line chart and a Gantt chart respectively. The colors used share the same hue but differ in composition, which relates the three graphs to each other while still keeping them visually distinct.

The first graph, Total Expenses Plotted against Market and State, shows the total expenses the company incurs in producing the different products and selling them in each market region and state. Overall, the West market accounts for the highest expenses and the South market for the lowest. At the state level, New Hampshire has the lowest expenses and California the highest over the two-year period.

The second graph, Profit Plotted against Market and State, shows the total profit the company collects from each market and state. The highest profit comes from California, which suggests that the company's products are used more in California than in any other state in the data set. The lowest profit comes from New Mexico, at around the 1000 mark. All other states contribute a generous amount of profit.

The third graph, Sales Plotted against Market and State, shows the total sales the company records in each state. The highest sales are recorded in California, which is consistent with the high profit the company collects there. The lowest sales are recorded in New Hampshire, although there is still room for growth there in the future. Overall, the South market has the lowest sales of the four markets.

## Dashboard 1: Total Expenses, Profit, and Sales Plotted against Market and State

The second dashboard consists of two graphs: Profit Plotted Against Products and Total Expenses Plotted Against Products. The first is a green bar chart and the second is a red heat map. A heat map is read through the saturation of colour in its boxes. The boxes are sized and stacked according to their values and coloured in a single hue whose saturation varies with the value, so the darkest boxes are the ones with the highest values.

The first graph shows the profit collected from the sale of each product across the 20 states. The company can analyse this data to understand its strong points in the market. The highest profit earner is Colombian coffee. The result for green tea comes as a surprise: although it can be considered the healthiest of all the drinks, it has the lowest profit over the two-year period, so low that the value is negative.

The second graph shows the total expenses the company puts into producing each product. The darkest box belongs to Colombian coffee and the lightest to regular espresso. Taken together, the two graphs suggest that the company is making the right move in putting extra effort into Colombian coffee and making it the best product in its market. For green tea, however, the expenses are high while the profit is below zero.
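The way a single-hue heat map turns values into colour saturation can be sketched as follows. This is a minimal illustration using min-max scaling, which is an assumption; the visualization tool used for the dashboards may scale colours differently. The expense totals are invented, and `saturation_scale` and `to_hex` are hypothetical helper names.

```python
import colorsys

def saturation_scale(values):
    """Min-max normalise values to [0, 1]; the largest value maps to 1.0."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    if span == 0:
        # All values equal: give every box the same full saturation.
        return {k: 1.0 for k in values}
    return {k: (v - lo) / span for k, v in values.items()}

def to_hex(saturation, hue=0.0):
    """Convert a saturation level to a hex colour of a single hue (0.0 = red)."""
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

# Hypothetical expense totals per product, in $,000.
expenses = {"Colombian": 900, "Green Tea": 600, "Regular Espresso": 300}
scale = saturation_scale(expenses)
colours = {product: to_hex(s) for product, s in scale.items()}
print(colours)
```

Because the scaling is relative to the smallest and largest values, it also works when some values are negative, as with the green tea profit discussed above.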

The third dashboard consists of four charts: Budget Cost of Goods Plotted against Products, Budget Margin Plotted against Products, Budget Profit Plotted against Products and Budget Sales Plotted against Products. The charts were selected to show the company's forecasts for each product, that is, the sales, costs, margin and profit it expects from the products it makes. They are drawn as a column chart, an area chart, a heat map and a Gantt chart. As before, the heat map uses a single hue with varying saturation, and the darkest boxes hold the highest values.

The first chart shows that Colombian coffee has the highest budgeted cost of goods, making it the company's top priority for future production spending, while regular espresso contributes the least. In the budget margin chart, Colombian coffee again has the highest value and regular espresso the lowest. Colombian coffee also leads the budget profit chart, but the lowest value here is green tea, which matches the actual profit figures, where green tea was also the lowest. The fourth chart, for projected sales, returns to the earlier pattern, with Colombian coffee at the top and regular espresso at the bottom. This consistent pattern suggests that the company has a settled plan: to follow the product trend shown in these charts and make the best possible profit from the market conditions.
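The comparison made across the four budget charts amounts to finding the highest and lowest product for each measure. A minimal sketch in plain Python follows. The budget figures are invented and merely arranged to mirror the pattern described in the text; they are not values from the data set, and `extremes` is a hypothetical helper.

```python
# Hypothetical budget totals per product, in $,000. The figures are invented.
budget = {
    "Colombian": {"Budget COGS": 500, "Budget Margin": 700, "Budget Profit": 400, "Budget Sales": 1200},
    "Green Tea": {"Budget COGS": 200, "Budget Margin": 250, "Budget Profit": 20, "Budget Sales": 450},
    "Regular Espresso": {"Budget COGS": 150, "Budget Margin": 180, "Budget Profit": 90, "Budget Sales": 330},
}

def extremes(budget, measure):
    """Return (highest product, lowest product) for one budget measure."""
    ranked = sorted(budget, key=lambda product: budget[product][measure])
    return ranked[-1], ranked[0]

measures = ("Budget COGS", "Budget Margin", "Budget Profit", "Budget Sales")
summary = {m: extremes(budget, m) for m in measures}
for m, (highest, lowest) in summary.items():
    print(f"{m}: highest = {highest}, lowest = {lowest}")
```

Laying the extremes out per measure makes it easy to spot where one chart departs from the others, as the budget profit chart does here.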

The fourth dashboard consists of the graphs Budget Cost of Goods Plotted against Locations, Budget Margin of Goods Plotted against Locations, Budget Profit of Goods Plotted against Locations and Budget Sales of Goods Plotted against Locations. This dashboard plots the data on a geographic map, which makes the visualization interactive. Because all the states in the data set are in the United States of America, the map shows only that country. The colouring follows the same approach as the heat map: a single hue is used for the map sections, and the saturation of the colour depends on the value for each location.

The first chart shows that California has the highest budget for the cost of goods. One possible reason is that, as a coastal state, the goods required for production there demand a large investment. The states in the middle of the country have a medium hue, suggesting that they have a lower budgeted cost of goods. In the second chart, California again has the highest budgeted margin, and the lowest is in New Mexico. The third chart, a heat map, shows that the budgeted profit is highest in Illinois, closely followed by California, and again lowest in New Mexico. The last chart is a geographic chart showing the projected sales by location, with the darkest colour in California and the lightest in New Mexico. The similar pattern in the budget predictions across the states reflects the consistency with which the business is run.

