Get Instant Help From 5000+ Experts For

Writing: Get your essay and assignment written from scratch by PhD expert

Rewriting: Paraphrase or rewrite your friend's essay with similar meaning at reduced cost

## Population vs sample and standard deviation calculation

a.Is above a population or a sample? Explain the difference.

b.Calculate the standard deviation of the weekly attendance. Show your workings. (Hint – remember to use the correct formula based upon your answer in (a).)

c.Calculate the Inter Quartile Range (IQR) of the chocolate bars sold. When is the IQR more useful than the standard deviation? (Give an example based upon number of chocolate bars sold.)

d.Calculate the correlation coefficient. Using the problem we started with, interpret the correlation coefficient. (Hint – you are the supermarket manager. What does the correlation coefficient tell you? What would you do based upon this information?)

a.Calculate AND interpret the Regression Equation. You are welcome to use Excel to check your calculations, but you must first do them by hand. Show your workings.

(Hint 1 - As manager, which variable do you think is the one that affects the other variable? In other words, which one is independent, and which variable’s value is dependent on the other variable? The independent variable is always x.

Hint 2 – When you interpret the equation, give specific examples. What happens when Holmes are closed? What happens when 10 extra students show up?)

b.Calculate AND interpret the Coefficient of Determination.

a.What is the probability that a randomly chosen player will be from Holmes OR receiving Grassroots training?

b.What is the probability that a randomly selected player will be External AND be in scientific training?

c.Given that a player is from Holmes, what is the probability that he is in scientific training?

d.Is training independent from recruitment? Show your calculations and then explain in your own words what it means.

A.The company would like to know the probably that a consumer comes from segment A if it is known that this consumer prefers Product X over Product Y and Product Z.

B.Overall, what is the probability that a random consumer’s first preference is product X?

You manage a luxury department store in a busy shopping centre. You have extremely high foot traffic (people coming through your doors), but you are worried about the low rate of conversion into sales. That is, most people only seem to look, and few actually buy anything.

A.During a 1 minute period you counted 8 people entering the store. What is the probability that only 2 or less of those 8 people will buy anything? (Hint: You have to do this by hand, showing your workings. Use the formula on slide 11 of lecture 6. But you can always check your calculations with Excel to make sure they are correct.)

B.(Task A is worth the full 2 marks. But you can earn a bonus point for doing Task B.)

On average you have 4 people entering your store every minute during the quiet 10-11am slot. You need at least 6 staff members to help that many customers but usually have 7 staff on roster during that time slot. The 7th staff member rang to let you know he will be 2 minutes late. What is the probability 9 people will enter the store in the next 2 minutes? (Hint 1: It is a Poisson distribution. Hint 2: What is the average number of customers entering every 2 minutes? Remember to show all your workings.)

There is an apartment up for auction this Saturday, and you decide to attend the auction.

A.Assuming a normal distribution, what is the probability that apartment will sell for over \$2 million?

B.What is the probability that the apartment will sell for over \$1 million but less than \$1.1 million?

A.Since the apartments on Surfers Paradise are a mix of cheap older and more expensive new apartments, you know the distribution is NOT normal. Can you still use a Z-distribution to test your assistant’s research findings against yours? Why, or why not?

B.You have over 2 000 investors in your fund. You and your assistant phone 45 of them to ask if they are willing to invest more than \$1 million (each) to the proposed new fund. Only 11 say that they would, but you need at least 30% of your investors to participate to make the fund profitable. Based on your sample of 45 investors, what is the probability that 30% of the investors would be willing to commit \$1 million or more to the fund?

Population vs sample and standard deviation calculation

Data were collected on the number of passengers at each train station in Melbourne. The numbers for the weekday peak time, 7am to 9:29am, are given below.

1. Construct a frequency distribution using 10 classes, stating the Frequency, Relative Frequency, Cumulative Relative Frequency and Class Midpoint

Required frequency distribution with frequency, relative frequency, cumulative relative frequency and class midpoint is given as below:

From descriptive statistics, we have

Maximum = 7729

Minimum = 169

Range = 7729 – 169 = 7560

Class width = 7560/10 = 756

 Lower Boundary Upper Boundary Mid-point Frequency Cumulative frequency Relative Frequency Cumulative relative frequency 169 925 547 35 35 0.583333333 0.583333333 925 1681 1303 18 53 0.3 0.883333333 1681 2437 2059 3 56 0.05 0.933333333 2437 3193 2815 3 59 0.05 0.983333333 3193 3949 3571 0 59 0 0.983333333 3949 4705 4327 0 59 0 0.983333333 4705 5461 5083 0 59 0 0.983333333 5461 6217 5839 0 59 0 0.983333333 6217 6973 6595 0 59 0 0.983333333 6973 7729 7351 1 60 0.016666667 1 Total 60 1

(All calculations are carried out by using excel)

1. Using (a), construct a histogram. (You can draw it neatly by hand or use Excel)

Part b

Required histogram by using excel is given as below:

1. Based upon the raw data (NOT the Frequency Distribution), what is the mean, median and mode? (Hint – first sort your data. This is usually much easier using Excel.)

Part c

After sorting the data, the median of given sample data is observed as 715. Mode for this data is given as 401. The value for the mean number of passengers is given as 1034 approximately. Descriptive statistics by using excel are provided below:

 Number of Passengers Mean 1033.433333 Standard Error 141.1105456 Median 715 Mode 401 Standard Deviation 1093.037586 Sample Variance 1194731.165 Kurtosis 23.78093092 Skewness 4.21026038 Range 7560 Minimum 169 Maximum 7729 Sum 62006 Count 60

Question 2 of 8

HINT: We cover this in Lecture 2(Measures of Variability and Association)

You are the manager of the supermarket on the ground floor below Holmes. You are wondering if there is a relation between the number of students attending class at Holmes each day, and the amount of chocolate bars sold. That is, do you sell more chocolate bars when there are a lot of Holmes students around, and less when Holmes is quiet? If there is a relationship, you might want to keep less chocolate bars in stock when Holmes is closed over the upcoming holiday. With the help of the campus manager, you have compiled the following list covering 7 weeks:

Weekly attendance Number of chocolate bars sold

472 6 916

413 5 884

503 7 223

612 8 158

399 6 014

538 7 209

455 6 214

1. Is above a population or a sample? Explain the difference.

This is a sample because the data for weekly attendance and number of chocolate bars sold is only given for the 7 weeks. Population data represent complete data of the variables under study. We draw a sample from population by using appropriate random sampling method or any other method. Population is complete enumeration while sample is a subset of population.

1. Calculate the standard deviation of the weekly attendance. Show your workings. (Hint – remember to use the correct formula based upon your answer in (a).)

Here, we have to find the standard deviation of the weekly attendance. Formula for standard deviation is given as below:

SD = sqrt[∑(X – Xbar)^2/(n – 1)]

 No. X (X - mean) (X - mean)^2 1 472 -12.5714 158.040098 2 413 -71.5714 5122.465298 3 503 18.4286 339.613298 4 612 127.4286 16238.0481 5 399 -85.5714 7322.464498 6 538 53.4286 2854.615298 7 455 -29.5714 874.467698 Total 3392 32909.71429 Mean 484.5714

Var = ∑(X – Xbar)^2/(n – 1)

Var = 32909.71429/(7 – 1)

Var = 5484.952381

SD = sqrt(5484.952381)

Standard Deviation = 74.06046436

Calculate the Inter Quartile Range (IQR) of the chocolate bars sold. When is the IQR more useful than the standard deviation? (Give an example based upon number of chocolate bars sold.)

## Calculating Inter Quartile Range and interpretation

From given data, first we have to find the first quartile and third quartile for finding inter quartile range. Interquartile range is useful when there is an outlier exists within the data. Suppose, at one particular day if the number of chocolate bars sold is more due to function nearby store will create an outlier for the data.

The quartiles for the given data for chocolate bars sold are given as below:

 Minimum 5884 First Quartile (Q1) 6014 Median or Second Quartile (Q2) 6916 Third Quartile (Q3) 7223 Maximum 8158

Inter-quartile range = Q3 – Q1 = 7223 - 6014

Inter-quartile range = 1209

Calculate the correlation coefficient. Using the problem we started with, interpret the correlation coefficient. (Hint – you are the supermarket manager. What does the correlation coefficient tell you? What would you do based upon this information?)

The correlation coefficient between the given two variables weekly attendance and number of chocolate bars sold is given as 0.967993. This means there is a strong positive linear relationship or association exists between the two variables weekly attendance and number of chocolate bars sold.

Question 3 of 8

HINT: We cover this in Lecture 3(Linear Regression)

(We are using the same data set we used in Question 2)

You are the manager of the supermarket on the ground floor below Holmes. You are wondering if there is a relation between the number of students attending class at Holmes each day, and the amount of chocolate bars sold. That is, do you sell more chocolate bars when there are a lot of Holmes students around, and less when Holmes is quiet? If there is a relationship, you might want to keep less chocolate bars in stock when Holmes is closed over the upcoming holiday. With the help of the campus manager, you have compiled the following list covering 7 weeks:

472 6 916

413 5 884

503 7 223

612 8 158

399 6 014

538 7 209

455 6 214

1. Calculate AND interpret the Regression Equation. You are welcome to use Excel to check your calculations, but you must first do them by hand. Show your workings.

(Hint 1 - As manager, which variable do you think is the one that affects the other variable? In other words, which one is independent, and which variable’s value is dependent on the other variable? The independent variable is always x.

Hint 2 – When you interpret the equation, give specific examples. What happens when Holmes are closed? What happens when 10 extra students show up?)

For the given regression model, we assume the dependent variable as the number of chocolate bars sold and the independent variable as the weekly attendance because the number of chocolate bars sold are depends upon the weekly attendance of the students at Holmes.

## Calculating correlation coefficient and interpreting the results

Now, we have to find out the regression equation for the prediction of dependent variable or response variable number of chocolate bars sold based on the independent variable weekly attendance. Required regression model is given as below:

 Regression Statistics Multiple R 0.967992639 R Square 0.93700975 Adjusted R Square 0.9244117 Standard Error 224.5951736 Observations 7 ANOVA df SS MS F Significance F Regression 1 3751816.754 3751816.8 74.3773635 0.000346012 Residual 5 252214.9601 50442.992 Total 6 4004031.714 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept 1628.688985 605.9000187 2.6880491 0.04339987 71.1734028 3186.204566 Weekly attendance 10.67723382 1.23805051 8.6242312 0.00034601 7.494723665 13.85974397

From above regression model, the correlation coefficient between the two variables weekly attendance and number of chocolate bars sold is given as 0.9680, which means there is a strong positive linear relationship or association exists between the two variables weekly attendance and number of chocolate bars sold. The coefficient of determination or the value of R square is given as 0.9370, which means about 93.70% of the variation in the dependent variable number of chocolate bars sold is explained by independent variable weekly attendance. The p-value for this regression model is given as 0.000346 which is less than the level of significance or alpha value 0.05, so we reject the null hypothesis that given regression model is not statistically significant. This means given regression model is statistically significant. Required regression model is given as below:

Number of chocolate bars sold = 1628.688985 + 10.67723382*Weekly attendance

1. Calculate AND interpret the Coefficient of Determination

The coefficient of determination or the value of R square is given as 0.9370, which means about 93.70% of the variation in the dependent variable number of chocolate bars sold is explained by independent variable weekly attendance.

Question 4 of 8

HINT: We cover this in Lecture 4 (Probability)

You are the manager of the Holmes Hounds Big Bash League cricket team. Some of your players are recruited in-house (that is, from the Holmes students) and some are bribed to come over from other teams. You have 2 coaches. One believes in scientific training in computerised gyms, and the other in “grassroots” training such as practising at the local park with the neighbourhood kids or swimming and surfing at Main Beach for 2 hours in the mornings for fitness. The table below was compiled:

 Scientific training Grassroots training Total Recruited from Holmes students 35 92 127 External recruitment 54 12 66 Total 89 104 193

1. What is the probability that a randomly chosen player will be from Holmes OR receiving Grassroots training?

Here, we have to find P(Holmes or Grassroots)

P(Holmes or Grassroots) = P(Holmes) + P(Grassroots) – P(Holmes and Grassroots)

P(Holmes or Grassroots) = (127/193) + (104/193) - (92/193)

P(Holmes or Grassroots) = 0.72020725

1. What is the probability that a randomly selected player will be External AND be in scientific training?

P(External and Scientific) = 54/193 = 0.27979275

Required probability = 0.27979275

1. Given that a player is from Holmes, what is the probability that he is in scientific training?

Required probability = 35/127 = 0.27559055

1. Is training independent from recruitment? Show your calculations and then explain in your own words what it means.

We know that A and B are independent if P(A and B) = P(A)*P(B)

P(Holmes and Grassroots) = (92/193) = 0.47668394

P(Holmes) = (127/193) = 0.65803109

P(Grassroots) = (104/193) = 0.5388601

## Calculating and interpreting regression equation and coefficient of determination

P(Holmes)* P(Grassroots) = 0.65803109*0.5388601 = 0.3545867

P(Holmes and Grassroots) ≠ P(Holmes)* P(Grassroots)

So, training is not independent from recruitment.

Question 5 of 8

HINT: We cover this in Lecture 5 (Bayes’ Rule)

A company is considering launching one of 3 new products: product X, Product Y or Product Z, for its existing market. Prior market research suggest that this market is made up of 4 consumer segments: segment A, representing 55% of consumers, is primarily interested in the functionality of products; segment B, representing 30% of consumers, is extremely price sensitive; and segment C representing 10% of consumers is primarily interested in the appearance and style of products. The final 5% of the customers (segment D) are fashion conscious and only buy products endorsed by celebrities.

To be more certain about which product to launch and how it will be received by each segment, market research is conducted. It reveals the following new information.

• The probability that a person from segment A prefers Product X is 20%
• The probability that a person from segment B prefers product X is 35%
• The probability that a person from segment C prefers Product X is 60%
• The probability that a person from segment C prefers Product X is 90%

1. The company would like to know the probably that a consumer comes from segment A if it is known that this consumer prefers Product X over Product Y and Product Z.

We are given

P(A) = 0.55

P(B) = 0.30

P(C) = 0.10

P(D) = 0.05

• The probability that a person from segment A prefers Product X is 20%
• The probability that a person from segment B prefers product X is 35%
• The probability that a person from segment C prefers Product X is 60%
• The probability that a person from segment C prefers Product X is 90%

Required probability = 0.55/(0.55+0.30) = 0.647058824

1. Overall, what is the probability that a random consumer’s first preference is product X?

Required probability = 0.55/0.647058824 = 0.85

Question 6 of 8

HINT: We cover this in Lecture 6

You manage a luxury department store in a busy shopping centre. You have extremely high foot traffic (people coming through your doors), but you are worried about the low rate of conversion into sales. That is, most people only seem to look, and few actually buy anything.

You determine that only 1 in 10 customers make a purchase. (Hint: The probability that the customer will buy is 1/10.)

1. During a 1 minute period you counted 8 people entering the store. What is the probability that only 2 or less of those 8 people will buy anything? (Hint: You have to do this by hand, showing your workings. Use the formula on slide 11 of lecture 6. But you can always check your calculations with Excel to make sure they are correct.)

We are given

Sample size = n = 8 and p = 1/10 = 0.1

We have to find P(X≤2)

P(X≤2) = P(X=0) + P(X=1) + P(X=2)

P(X=x) = nCx*p^x*(1 – p)^(n – x)

P(X=0) = 8C0*0.1^0*(1 – 0.1)^(8 – 0)

P(X=0) = 0.43046721

P(X=1) = 8C1*0.1^1*(1 – 0.1)^(8 – 1)

P(X=1) = 0.38263752

P(X=2) = 8C2*0.1^2*(1 – 0.1)^(8 – 2)

P(X=2) = 0.14880348

P(X≤2) = P(X=0) + P(X=1) + P(X=2)

P(X≤2) = 0.43046721 + 0.38263752 + 0.14880348

P(X≤2) = 0.96190821

Required Probability = 0.96190821

1. (Task A is worth the full 2 marks. But you can earn a bonus point for doing Task B.)

On average you have 4 people entering your store every minute during the quiet 10-11am slot. You need at least 6 staff members to help that many customers but usually have 7 staff on roster during that time slot. The 7th staff member rang to let you know he will be 2 minutes late. What is the probability 9 people will enter the store in the next 2 minutes? (Hint 1: It is a Poisson distribution. Hint 2: What is the average number of customers entering every 2 minutes? Remember to show all your workings.)

## Calculating probability and probability distribution

Solution:

Average number of customers per minute = 4

Average number of customers per 2 minute = 2*4 = 8

We have λ = 8

We have to find P(X=9)

P(X=x) = λ^x*exp(-λ) / x!

P(X=9) = 8^9*exp(-8)/fact(9)

P(X=9) = 0.124076917

Required probability = 0.124076917

Question 7 of 8

HINT: We cover this in Lecture 7

You are an investment manager for a hedge fund. There are currently a lot of rumours going around about the “hot” property market on the Gold Coast, and some of your investors want you to set up a fund specialising in Surfers Paradise apartments.

You do some research and discover that the average Surfers Paradise apartment currently sells for \$1.1 million. But there are huge price differences between newer apartments and the older ones left over from the 1980’s boom. This means prices can vary a lot from apartment to apartment. Based on sales over the last 12 months, you calculate the standard deviation to be \$385 000.

There is an apartment up for auction this Saturday, and you decide to attend the auction.

1. Assuming a normal distribution, what is the probability that apartment will sell for over \$2 million?

We are given

Mean = 1.1 million

SD = 385000 = 0.385 million

We have to find P(X>2)

P(X>2) = 1 – P(X<2)

Z = (X – mean) / SD

Z = (2 – 1.1) / 0.385

Z = 2.337662338

P(Z<2.337662338) = 0.990297614

P(X<2) = 0.990297614

P(X>2) = 1 – P(X<2)

P(X>2) = 1 – 0.990297614

P(X>2) = 0.009702386

Required probability = 0.009702386

1. What is the probability that the apartment will sell for over \$1 million but less than \$1.1 million?

Solution:

Here, we have to find P(1<X<1.1)

P(1<X<1.1) = P(X<1.1) – P(X<1)

We are given

Mean = 1.1 million

SD = 385000 = 0.385 million

First we have to find P(X<1.1)

Z = (1.1 – 1.1) / 0.385

Z = 0

P(Z<0) = 0.50

P(X<1.1) = 0.50

Now, we have to find P(X<1)

Z = (1 – 1.1) / 0.385

Z = -0.25974026

P(Z< -0.25974026) = 0.397532068

P(X<1) = 0.397532068

P(1<X<1.1) = P(X<1.1) – P(X<1)

P(1<X<1.1) = 0.50 - 0.397532068

P(1<X<1.1) = 0.102467932

Required probability = 0.102467932

Question 8 of 8

HINT: We cover this in Lecture 8

You are an investment manager for a hedge fund. There are currently a lot of rumours going around about the “hot” property market on the Gold Coast, and some of your investors want you to set up a fund specialising in Surfers Paradise apartments.

Last Saturday you attended an auction to get “a feel” for the local real estate market. You decide it might be worth further investigating. You ask one of your interns to take a quick sample of 50 properties that have been sold during the last few months. Your previous research indicated an average price of \$1.1 million but the average price of your assistant’s sample was only \$950 000.

However, the standard deviation for her research was the same as yours at \$385 000.

1. Since the apartments on Surfers Paradise are a mix of cheap older and more expensive new apartments, you know the distribution is NOT normal. Can you still use a Z-distribution to test your assistant’s research findings against yours? Why, or why not?

Yes, still we can use a Z-distribution to test assistant’s research findings against previous findings, because a sample size selected by assistant is 50 and this sample size is adequate for using normal distribution (n>30) and also we know that the sampling distribution of any sample statistic follows an approximately normal distribution although given sample follows or not follows a normal distribution.

You have over 2 000 investors in your fund. You and your assistant phone 45 of them to ask if they are willing to invest more than \$1 million (each) to the proposed new fund. Only 11 say that they would, but you need at least 30% of your investors to participate to make the fund profitable. Based on your sample of 45 investors, what is the probability that 30% of the investors would be willing to commit \$1 million or more to the fund?

Solution:

We are given

Sample size = N = 45

Number of successes = X = 11

Estimate for proportion = p = X/N = 11/45 = 0.244444444

Total number of investors = n = 2000

30% of 2000 investors = 600 investors

Here, we have to use normal approximation to binomial distribution.

We have to find P(X>600)

Mean = n*p = 2000*0.244444444 = 488.888888

q = 1 – p = 1 - 0.244444444 = 0.755555556

SD = sqrt(n*p*q) = sqrt(2000*0.244444444*0.755555556)

SD = 19.21933182

P(X>600) = 1 – P(X<600)

Z = (X – mean) / SD

Z = (600 - 488.888888) / 19.21933182

Z = 5.781216175

P(Z<5.781216175) = 0.999999996

P(X<600) = 0.999999996

P(X>600) = 1 – P(X<600)

P(X>600) = 1 – 0.999999996

P(X>600) = 0.000000001

Required probability = = 0.000000001

Cite This Work

My Assignment Help. (2020). Statistics And Probability Exercises For Supermarket Data. Retrieved from https://myassignmenthelp.com/free-samples/ha1011-applied-quantitative-methods/cumulative-relative-frequency.html.

"Statistics And Probability Exercises For Supermarket Data." My Assignment Help, 2020, https://myassignmenthelp.com/free-samples/ha1011-applied-quantitative-methods/cumulative-relative-frequency.html.

My Assignment Help (2020) Statistics And Probability Exercises For Supermarket Data [Online]. Available from: https://myassignmenthelp.com/free-samples/ha1011-applied-quantitative-methods/cumulative-relative-frequency.html
[Accessed 23 June 2024].

My Assignment Help. 'Statistics And Probability Exercises For Supermarket Data' (My Assignment Help, 2020) <https://myassignmenthelp.com/free-samples/ha1011-applied-quantitative-methods/cumulative-relative-frequency.html> accessed 23 June 2024.

My Assignment Help. Statistics And Probability Exercises For Supermarket Data [Internet]. My Assignment Help. 2020 [cited 23 June 2024]. Available from: https://myassignmenthelp.com/free-samples/ha1011-applied-quantitative-methods/cumulative-relative-frequency.html.

Get instant help from 5000+ experts for

Writing: Get your essay and assignment written from scratch by PhD expert

Rewriting: Paraphrase or rewrite your friend's essay with similar meaning at reduced cost