StatCrunch Activity - Using CrackerBarrel.txt dataset for understanding various concepts

In this activity you will use the CrackerBarrel.txt data file and StatCrunch to illustrate your understanding of a variety of concepts.

You will likely find this activity more challenging than the MyLab homework for several reasons.  First off, you are working with an actual data set.  You will need to spend a bit of time familiarizing yourself with the variables included in the set.  Often students rely on the help aids or prompting included in algorithmically generated questions.  None of those hints are present in the data set.  Additionally, writing a sentence from scratch is more difficult, and more authentic, than selecting the correct word from a dropdown menu.  Ask if you have questions.  Complete every part of every question to the best of your ability.  

Provide an explanation of your work or thought process to arrive at your solution.  Whenever you use StatCrunch, include screen captures of your output to support your answers.  StatCrunch output alone is not sufficient – you must interpret the results.

Cracker Barrel has 6500 restaurants, each located in close proximity to an interstate highway.  The restaurant’s business strategy is to serve its core customer base: people traveling on the interstate highway system who are looking for a quality dining experience.  Customers generally enjoy the restaurant chain’s menu, atmosphere, and consistency from restaurant to restaurant.  The research division at corporate headquarters selected a random sample of 150 restaurants and computed the annual cost of gasoline at these 150 restaurants by randomly selecting gas stations near each restaurant.  Also reported are the Restaurant Number, Annual Revenue, Geographic Region, Average Cost of Gasoline, Miles from the Interstate, Square Footage, and Annual Increase in Revenue.

1. Bar charts and histograms look similar but have some important differences.

  1. Describe how to decide if a bar chart or histogram is appropriate to display data for a particular variable.

| Variable | Display type | Justification |
| --- | --- | --- |
| Annual Revenue | | |
| Geographic Region | | |
| Average Cost of Gasoline | | |
| Miles from the Interstate | | |
| Square Footage | | |
| Annual Increase in Revenue | | |
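
As a rough illustration of the distinction this question asks about, the sketch below treats non-numeric values as categories (counted per label, as in a bar chart) and numeric values as measurements (counted per bin, as in a histogram). It is not StatCrunch output: the data values and the function names `suggest_display`, `bar_chart_counts`, and `histogram_counts` are hypothetical, and a numerically coded label such as an ID number would still need human judgment.

```python
from collections import Counter

# Hypothetical values for illustration only; not taken from CrackerBarrel.txt.
regions = ["South", "Midwest", "South", "Northeast", "Midwest", "South"]
square_feet = [9800, 10450, 11200, 9950, 10700, 12100]


def suggest_display(values):
    """Suggest 'histogram' for numeric data and 'bar chart' for everything else."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "histogram" if is_numeric else "bar chart"


def bar_chart_counts(values):
    """Count observations per category; these are the bar heights of a bar chart."""
    return dict(Counter(values))


def histogram_counts(values, start, width):
    """Count observations per bin [lower, lower + width), keyed by the bin's lower edge."""
    counts = Counter(start + width * int((v - start) // width) for v in values)
    return dict(sorted(counts.items()))


print(suggest_display(regions), bar_chart_counts(regions))
print(suggest_display(square_feet), histogram_counts(square_feet, start=0, width=500))
```

The bins in `histogram_counts` sit on a number line with a fixed width, while the categories in `bar_chart_counts` have no inherent order or spacing.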

  1. For each variable in the data set, determine whether a bar chart or histogram is most appropriate to display the data.  Be sure to justify your decision. 
  1. Based on your answer to part b, use StatCrunch to display the data for Geographic Region.
  1. Based on your answer to part b, use StatCrunch to display the data for Annual Revenue.
  1. Create the following dotplots for the variable Square Feet:
    1. The default dotplot for all 150 Cracker Barrel restaurants in the sample.
    2. Edit the default dotplot by adding the option to group by Geographic Region.
    3. Based on your dotplot in part b, which geographic region appears to have the largest range?
  1. StatCrunch provides a variety of options to customize a histogram.
    1. Explain the difference between a frequency histogram and a relative frequency histogram.
    2. Create a frequency histogram for the variable Square Feet.  Start the histogram at 0 and use a bin width of 500 sq. ft.
    3. Create a relative frequency histogram for the variable Square Feet, grouping the data by Geographic Region.  Show the histogram for the Mid-Atlantic and South regions.  Start the histograms at 0 and use a bin width of 500 sq. ft.
    4. Based on the graphs you created in part c, do you expect the Mid-Atlantic or South region to have a larger standard deviation for the Square Feet variable?  Explain your answer.
  1. Find the summary statistics for the variable Square Feet.  
    1. To the nearest integer, what is the mean?
    2. To the nearest integer, what is the median?
    3. Are the mean and median close together or far apart?  What does this mean about the distribution?
    4. To the nearest integer, what is the standard deviation?
    5. To the nearest integer, what is the number of square feet that corresponds to a z-score of -1.75?
    6. Potential outliers are those observations that lie more than 1.5 interquartile ranges below the first quartile or above the third quartile (see page 119 in your text).  Use this rule to determine if there are any potential outliers for the variable Square Feet.  Give the Restaurant Number of any potential outliers.
  1. Restaurant #142 is in the Northeast region and has 12404 square feet.  Restaurant #126 is in the South region and has 12391 square feet.  Restaurant #61 is in the Southeast region and has 12390 square feet.  Which of these three restaurants is the largest when compared to other restaurants in the same geographic region?  Explain your approach and show any calculations you use.
  1. The Empirical Rule applies to the Square Feet distribution because it is unimodal and roughly symmetric.  
    1. According to the Empirical Rule how many of the 150 Cracker Barrel restaurants should we expect to have square footage within one standard deviation of the mean?
    2. Using interval notation, write the number of square feet within one standard deviation of the mean.  
    3. Use StatCrunch to sort the data for Square Feet.  Then count the actual number of restaurants falling within one standard deviation of the mean.
    4. Are your answers in parts a and c the same?  Should they be? Explain.
  1. A histogram is useful to determine the shape of a distribution.
    1. Create a histogram for the variable Average Cost of Gasoline using a bin width of 0.20.
    2. Is the distribution symmetric or skewed to the left or right?
    3. Mean, median, and mode are all measures of center.  Similarly, there are multiple measures of variation.  Which measures of center and variation are most appropriate to use for the variable Average Cost of Gasoline?
    4. Calculate the most appropriate measures of center and variation for the variable Average Cost of Gasoline, and give their values.
  1. Corporate headquarters defines high-revenue restaurants as those with annual revenue in excess of $21,000,000.  Create a frequency table of high-revenue restaurants by Geographic Region.  Include both frequency and percent of total in the table.  Hint: Build the inequality “Annual Revenue > 21,000,000” in the “Where” feature of the Frequency Table menu.
  1. Recall the restaurant’s business strategy is to serve its core customer base: people traveling on the interstate highway system who are looking for a quality dining experience. In preparation for future expansion, corporate headquarters wants to examine the association between Annual Revenue and other variables.  Consider Average Cost of Gasoline, Miles from the Interstate, or Square Feet as possible independent variables to use in a simple linear regression analysis.
    1. Create a scatterplot using Average Cost of Gasoline as the independent variable and Annual Revenue as the dependent variable.  Find r, the correlation coefficient.  Describe the trend, strength, and shape of the relationship.
    2. Create a scatterplot using Miles from the Interstate as the independent variable and Annual Revenue as the dependent variable.  Find r, the correlation coefficient.  Describe the trend, strength, and shape of the relationship.
    3. Create a scatterplot using Square Feet as the independent variable and Annual Revenue as the dependent variable.  Find r, the correlation coefficient.  Describe the trend, strength, and shape of the relationship.  
    4. Select the model with the highest correlation coefficient from parts a-c to answer this question.  Give the linear regression model.  State the slope, including units.  State and interpret the y-intercept.  Finally, show how to use the model by estimating the Annual Revenue for some value of the independent variable.
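
The regression questions above can be checked by hand with the standard formulas for the correlation coefficient and the least-squares line. The sketch below is only an illustration: the `x` and `y` values are made up (they are not from CrackerBarrel.txt), and the function names `correlation_and_line` and `predict` are hypothetical. StatCrunch reports the same quantities through its own menus.

```python
import math

# Hypothetical (x, y) pairs for illustration only; not values from CrackerBarrel.txt.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]


def correlation_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y-hat = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx                      # units: units of y per one unit of x
    intercept = mean_y - slope * mean_x    # predicted y when x equals 0
    return r, slope, intercept


def predict(x0, slope, intercept):
    """Estimate y for a given x using the fitted line."""
    return intercept + slope * x0


r, slope, intercept = correlation_and_line(x, y)
print(f"r = {r:.4f}, slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"predicted y at x = 3.5: {predict(3.5, slope, intercept):.4f}")
```

The slope carries the units of the dependent variable per unit of the independent variable, and the y-intercept is the predicted value when the independent variable is 0, which may or may not be meaningful for the variable chosen.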

| Item | Your answer |
| --- | --- |
| What variables are in the model? | |
| Linear regression model | |
| Slope (with units) | |
| y-intercept with interpretation | |
| Use model to predict Annual Revenue | |

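
The Square Feet questions earlier in the activity rely on a few formulas: the z-score, the 1.5 interquartile range rule for potential outliers, and the Empirical Rule's expectation of about 68% of observations within one standard deviation of the mean. A minimal sketch of those calculations follows. It assumes a made-up sample `sq_ft` (not the CrackerBarrel.txt data), uses the sample standard deviation, and uses hypothetical helper names `z_score` and `value_at_z`. Quartile conventions differ between tools, so fences computed in StatCrunch may differ slightly from the ones here.

```python
import statistics

# Hypothetical square-footage values for illustration; not the CrackerBarrel.txt data.
sq_ft = [9200, 9500, 9800, 10000, 10100, 10200, 10400, 10600, 11000, 14500]

mean = statistics.mean(sq_ft)
sd = statistics.stdev(sq_ft)  # sample standard deviation (n - 1 in the denominator)


def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def value_at_z(z, mean, sd):
    """Invert the z-score formula: the data value that corresponds to a given z."""
    return mean + z * sd


# 1.5 * IQR rule; statistics.quantiles uses its default "exclusive" quartile method.
q1, _median, q3 = statistics.quantiles(sq_ft, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in sq_ft if v < lower_fence or v > upper_fence]

# Empirical Rule check: expected versus actual count within one standard deviation.
expected_within = 0.68 * len(sq_ft)
within_one_sd = [v for v in sq_ft if mean - sd <= v <= mean + sd]

print(f"mean = {mean:.0f}, sd = {sd:.0f}")
print(f"value at z = -1.75: {value_at_z(-1.75, mean, sd):.0f}")
print(f"fences = ({lower_fence}, {upper_fence}), potential outliers = {outliers}")
print(f"expected within 1 sd = {expected_within:.1f}, actual = {len(within_one_sd)}")
```

Calling `z_score` with a region's own mean and standard deviation, rather than the overall ones, puts values from different regions on a common scale.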