# STA 2023: Statistics

• Course Code: STA2023
• University: University of Florida
• Country: United States

## Question:

1. Before you begin your analysis, you are required to take a random sample of size 100 from the 171 cases in the data file. Use the file Data and Generator.xls to do this: open the file and go to the first worksheet, labelled Macro. Click Generate and enter the number of rows of data required in the random sample. If your sample contains repeated cases, your final sample size may be less than 100; this is not a problem. Answers to the questions below are to be based on this random sample of up to 100 cases. Keep a safe copy of your sample, since you cannot use the Random Sample Generator to reproduce the first sample.
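The remark about repeated cases suggests that the generator draws with replacement and that duplicates are then dropped. A minimal Python sketch of that idea follows; it is not the spreadsheet macro itself. The ID range 1 to 171, the function name `draw_sample`, and the seed are assumptions made for illustration.

```python
import random

def draw_sample(population_ids, n_draws, seed=None):
    """Draw n_draws IDs with replacement, then drop repeats and sort ascending."""
    rng = random.Random(seed)
    draws = [rng.choice(population_ids) for _ in range(n_draws)]
    return sorted(set(draws))

# Assumption: the 171 cases are numbered 1..171 in the data file.
ids = list(range(1, 172))
sample_ids = draw_sample(ids, 100, seed=2023)
print(len(sample_ids), "unique cases in the sample")
```

Unlike the spreadsheet generator, a fixed seed makes this sketch repeatable, so the same call returns the same sample each time.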

2. Data List: Provide a printout of the data in your sample, with ID numbers in ascending order.

3. Introduction and Variable List: Provide a summary table listing each variable, with information under the following column headings: Variable Name, Variable Description (see the information above) and Variable Type (qualitative or quantitative). If the variable is qualitative, state whether it is nominal or ordinal; if it is quantitative, state whether it is discrete or continuous.

4. Descriptive statistics: Historically, the company has believed that the average salary of males is greater than the average salary of females. Determine the mean, median, and standard deviation of the current salary data for males and for females. Explain the results in the context of the company’s belief. (10 marks)
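As a rough illustration of the calculation asked for here, the following Python sketch groups current salaries by gender and reports the mean, median, and sample standard deviation. The records, the field names `gender` and `current_salary`, and the helper `summarise` are hypothetical; real values come from your own random sample.

```python
import statistics

# Hypothetical records, not taken from the data file.
sample = [
    {"gender": "Male", "current_salary": 52000},
    {"gender": "Male", "current_salary": 61000},
    {"gender": "Male", "current_salary": 47000},
    {"gender": "Male", "current_salary": 58000},
    {"gender": "Female", "current_salary": 45000},
    {"gender": "Female", "current_salary": 49000},
    {"gender": "Female", "current_salary": 53000},
    {"gender": "Female", "current_salary": 41000},
]

def summarise(values):
    """Mean, median and sample standard deviation (n - 1 denominator)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

# Group the salaries by gender, then summarise each group.
by_gender = {}
for row in sample:
    by_gender.setdefault(row["gender"], []).append(row["current_salary"])

summary = {gender: summarise(values) for gender, values in by_gender.items()}
for gender, stats in summary.items():
    print(gender, stats)
```

Comparing the mean with the median in each group gives a first hint of skewness, which matters when judging whether the mean is a fair summary of a typical salary.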

5. Presentation of data: Draw histograms and boxplots of current salary for both males and females. Explain the graphs. (10 marks)
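The numbers behind a boxplot and a histogram can be checked by hand. The sketch below, using hypothetical salaries for one gender group, computes a five-number summary, flags points beyond the common 1.5 × IQR fences, and counts values per histogram bin. The quartile convention (`method="inclusive"`), the bin width, and the function names are assumptions; other tools may use a different quartile rule and give slightly different box edges.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile rule."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def histogram_counts(values, bin_width, start):
    """Count values per bin [lower, lower + bin_width); empty bins are omitted."""
    counts = {}
    for v in values:
        k = int((v - start) // bin_width)
        lower = start + k * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical current salaries for one gender group.
salaries = [41000, 45000, 47000, 49000, 52000, 53000, 58000, 61000, 95000]

lowest, q1, median, q3, highest = five_number_summary(salaries)
iqr = q3 - q1
# Points beyond 1.5 * IQR from the box are usually drawn as outliers.
outliers = [v for v in salaries if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]
bins = histogram_counts(salaries, bin_width=10000, start=40000)

print("five-number summary:", (lowest, q1, median, q3, highest))
print("outliers:", outliers)
print("bin counts:", bins)
```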

PART 2:

6. Hypothesis testing: Use the above information to carry out a t-test to see whether the average current salary for males is significantly greater than the average current salary for females.

Explain the result in the light of the question.
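The question does not say which form of two-sample t-test to use. As one possibility, the sketch below computes the test statistic and degrees of freedom for a one-sided Welch test (unequal variances), with hypotheses H0: mean(male) = mean(female) against H1: mean(male) > mean(female). The choice of Welch's test, the function name `welch_t`, and the salary figures are all assumptions. The standard library has no t distribution, so the p-value or critical value would come from a t table or a spreadsheet.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, df) for testing mean(a) > mean(b) without assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    se2 = va / na + vb / nb  # squared standard error of the difference in means
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical current salaries, not taken from the data file.
male = [52000, 61000, 47000, 58000]
female = [45000, 49000, 53000, 41000]

t_stat, df = welch_t(male, female)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

A large positive t relative to the one-tailed critical value at the chosen significance level would support the claim that the male average is higher; a small or negative t would not.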

7. Comparing gender and positions:

Prepare a Pivot table that shows the average current salary for males and females according to their Position within the company. Think carefully about the layout of the rows and columns of your table. Place gender in the rows, position in the columns, and current salary in the values box, then summarise the values as an average. Explain the results.
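The pivot layout described here (gender in rows, position in columns, average of current salary in the cells) can be reproduced with a small grouping routine, which is useful for checking the spreadsheet output. The rows, the position labels "Clerk" and "Manager", and the function name `pivot_mean` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (gender, position, current salary) rows.
rows = [
    ("Male", "Manager", 80000),
    ("Male", "Manager", 90000),
    ("Male", "Clerk", 40000),
    ("Female", "Manager", 82000),
    ("Female", "Clerk", 38000),
    ("Female", "Clerk", 42000),
]

def pivot_mean(rows):
    """Average salary for every (gender, position) cell that has data."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [running sum, count]
    for gender, position, salary in rows:
        cell = totals[(gender, position)]
        cell[0] += salary
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

table = pivot_mean(rows)
genders = sorted({g for g, _, _ in rows})
positions = sorted({p for _, p, _ in rows})

# Print with gender in rows and position in columns.
print("Gender".ljust(8) + "".join(p.rjust(10) for p in positions))
for g in genders:
    cells = (table.get((g, p)) for p in positions)
    print(g.ljust(8) + "".join(("-" if c is None else f"{c:.0f}").rjust(10) for c in cells))
```

Reading across a row compares positions within one gender; reading down a column compares genders within one position, which is the fairer comparison when the genders are spread unevenly across positions.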

8. Comparing Departments: The head of Human Resources wants to compare the structure of the four departments by gender within the Company.

a. Draw a Pivot table with gender in the rows, department in the columns, and the count of gender in the values.
b. Determine the joint probability for gender and department. Explain the result.
c. Determine the marginal probabilities and explain the result.
d. Determine the conditional probability by column and explain the result.
e. Determine the conditional probability by row and explain the result.
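Parts (b) to (e) all follow from the table of counts in part (a). The sketch below shows the arithmetic on a hypothetical table with only two departments ("Sales" and "IT") to keep it short; the assignment data has four. With gender in rows and department in columns, conditioning by column gives P(gender | department) and conditioning by row gives P(department | gender).

```python
from collections import defaultdict

# Hypothetical pivot-table counts: (gender, department) -> number of employees.
counts = {
    ("Male", "Sales"): 3, ("Male", "IT"): 2,
    ("Female", "Sales"): 1, ("Female", "IT"): 4,
}

n = sum(counts.values())
row_total = defaultdict(int)  # totals per gender
col_total = defaultdict(int)  # totals per department
for (gender, dept), c in counts.items():
    row_total[gender] += c
    col_total[dept] += c

# (b) Joint probability: cell count divided by the grand total.
joint = {cell: c / n for cell, c in counts.items()}
# (c) Marginal probabilities: row or column total divided by the grand total.
marginal_gender = {g: c / n for g, c in row_total.items()}
marginal_dept = {d: c / n for d, c in col_total.items()}
# (d) Conditional by column: P(gender | department).
cond_by_column = {(g, d): c / col_total[d] for (g, d), c in counts.items()}
# (e) Conditional by row: P(department | gender).
cond_by_row = {(g, d): c / row_total[g] for (g, d), c in counts.items()}

print("P(Male and Sales) =", joint[("Male", "Sales")])
print("P(Male | Sales)   =", cond_by_column[("Male", "Sales")])
print("P(Sales | Male)   =", cond_by_row[("Male", "Sales")])
```

The joint probabilities sum to 1 over the whole table, each set of marginals sums to 1, and the conditional probabilities sum to 1 within each column (by column) or each row (by row), which makes a quick self-check.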

PART 3:

9. Average salary increase per year: There have been complaints in this company that female employees have been given lower salary increases than their male colleagues.

Examine the issue: Have the salary increases for females been lower?

a). Create a new column called Length Empl, which shows the length of time a person has been employed in the company as of the end of December 2014. Assume that if an employee started in 2008 then they have been employed for 6 years; if they started in 1997 then they have been employed for 17 years, and so on. Create a second new column called Avg Incr, which shows the average increase in salary per year for each employee since the year they started work for the company.

State the formulae you have used to produce these new columns. Print out the ID column and the new data – 3 columns. This should be in ascending order of the ID numbers.
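One plausible reading of the two formulae is Length Empl = 2014 − start year and Avg Incr = (Current Salary − Start Salary) / Length Empl. The sketch below applies that reading to hypothetical employees and prints the three columns in ascending ID order. The field names, the sample records, and the handling of employees with zero years of service are assumptions, not part of the assignment.

```python
REFERENCE_YEAR = 2014  # "as of the end of December 2014"

def length_employed(start_year):
    """Years employed, e.g. a 2008 start gives 6 and a 1997 start gives 17."""
    return REFERENCE_YEAR - start_year

def average_increase(start_salary, current_salary, start_year):
    """Average salary increase per year of employment.

    Assumption: returns None when the employee has zero years of service,
    since the average is undefined in that case.
    """
    years = length_employed(start_year)
    if years <= 0:
        return None
    return (current_salary - start_salary) / years

# Hypothetical employees: (ID, start year, start salary, current salary).
employees = [
    (12, 2008, 30000, 48000),
    (3, 1997, 25000, 59000),
    (7, 2014, 40000, 40000),
]

# Print ID, Length Empl and Avg Incr in ascending order of ID.
for emp_id, start_year, start_salary, current_salary in sorted(employees):
    length = length_employed(start_year)
    incr = average_increase(start_salary, current_salary, start_year)
    print(emp_id, length, incr)
```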

b). Use Avg Incr to create a histogram showing the distribution of the average salary increase per year. The purpose of this histogram is to get some idea of the distribution of the data. Comment upon the shape of this distribution: Is it symmetric? Is it skewed? If so, which way? (5 marks)

c). Now provide side-by-side boxplots for Avg Incr, split on gender. Again, comment on what you see. Most importantly, provide your answer to the question: “Have the salary increases for females been lower than those for males?” (10 marks)

d). Run a regression analysis with Start Salary and Current Salary for males and females separately. Check carefully which variable is the dependent variable and which is the independent variable. Again, comment on what you see. Most importantly, provide your answer to the question:

“Have the salary increases for females been lower than those for males?” Comment on whether the regression analyses support or oppose the conclusion in (c).
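A simple linear regression fitted separately for each gender can be sketched as follows. The sketch assumes Current Salary is the dependent variable and Start Salary the independent variable, which is one reasonable reading since the starting salary comes first in time; verify this choice for yourself. The salary figures and the function name `fit_line` are hypothetical.

```python
import statistics

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r_squared)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical salaries; x = Start Salary, y = Current Salary (assumed roles).
male_start = [30000, 35000, 40000, 45000]
male_current = [50000, 57000, 66000, 73000]
female_start = [30000, 35000, 40000, 45000]
female_current = [45000, 51000, 58000, 64000]

slope_m, intercept_m, r2_m = fit_line(male_start, male_current)
slope_f, intercept_f, r2_f = fit_line(female_start, female_current)
print(f"male:   slope={slope_m:.2f} intercept={intercept_m:.0f} R^2={r2_m:.3f}")
print(f"female: slope={slope_f:.2f} intercept={intercept_f:.0f} R^2={r2_f:.3f}")
```

The slope estimates how much current salary rises per extra unit of starting salary, so comparing the two slopes (and intercepts) is one way to set the regression results against the boxplot conclusion in (c). Note that the regression ignores length of employment, so the two analyses need not agree.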

10. Draw a suitable conclusion based on your data analysis.

