This is a further analysis of the gender pay gap in the Australian population. According to a recent report by KPMG Consulting, gender discrimination continues to be the single largest factor contributing to the gender pay gap (KPMG, 2019). To estimate the extent of discrimination in the job market, where women with the same labour market characteristics as their male counterparts receive different wages, you will estimate a set of linear regression models.
Since this is an additional analysis of the gender pay gap, the content of the Introduction section of your report may overlap with that of the Group Assignment. However, you are encouraged to develop or source new background material. You will use the same dataset as in Assignment 2. The data are drawn from the 2017 Household, Income and Labour Dynamics in Australia (HILDA) survey. The sample used for analysis comprises 824 full-time Australian workers aged 20 to 74. The dataset contains the following information:
1. Before estimating the regression equation, conduct a preliminary analysis of the relationship between workers’ earnings and 1) gender; 2) educational attainment; 3) skill level; and 4) experience. Use tables and/or appropriate graphs for the categorical variables (male, degree, skill) and the continuous variable (experience). Interpret your findings by answering the following questions: How much more or less does a male worker earn compared to a female worker? How much more or less does a degree holder earn compared to a non-degree holder? How much more or less does a highly skilled worker earn compared to a worker who is not highly skilled? What kind of relationship do you observe between workers’ earnings and experience?
2. Use a simple linear regression to estimate the relationship between workers’ earnings (Y) and gender (X) (Model A). You may use the Excel Data Analysis ToolPak. Based on the Excel regression output, first write down the estimated regression equation and interpret the slope coefficient. Then carry out the relevant two-tailed hypothesis test on the slope coefficient using the critical value approach at the 5% significance level, showing the step-by-step workings/diagram in your report. Interpret your hypothesis test results.
3. Now use a multiple regression model to explore the relationship between workers’ earnings (Y) and gender (X1), educational attainment (X2), skill level (X3) and work experience (X4) (Model B). You may use the Excel Data Analysis ToolPak for this. Based on the Excel regression output, first write down the estimated regression equation and interpret the slope coefficients. Carry out the relevant two-tailed hypothesis test for each individual slope coefficient using the p-value approach at the 5% significance level. Then carry out an overall significance test using the p-value approach. Carefully interpret your hypothesis test results.
4. Interpret the R-squared in Model A and the adjusted R-squared in Model B. Which is the better model, and why?
5. Compare the coefficient of gender in Model A and Model B. Explain carefully why the results are different, relating your discussion to gender discrimination.
6. Predict the earnings of a male worker who has a university degree, is highly skilled and has 10 years of work experience. Next, predict the earnings of a female worker with the same characteristics.
7. If you could request additional data to study the factors that influence workers’ earnings, what extra variables would you request? Discuss two such variables, explaining why you chose them and how each of your proposed variables could be measured in the regression model.
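
If you wish to cross-check your Excel output, the regression steps in Questions 2 to 6 can be sketched in Python as shown below. This is a minimal sketch under stated assumptions, not the required method. The data are synthetic and generated only so that the code runs: they are not the HILDA sample, and the coefficients used to generate them are arbitrary. The column name `earnings`, the helper `ols`, and the critical value of about 1.963 (read from a t-table for 822 degrees of freedom) are assumptions made for illustration. The p-values required in Question 3 are omitted because they need a t-distribution routine such as the one in SciPy.

```python
import numpy as np

def ols(y, X):
    """Fit y = b0 + X b by ordinary least squares and return summary statistics."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])      # prepend the intercept column
    k = Xc.shape[1] - 1                        # number of slope coefficients
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    sse = resid @ resid                        # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)          # total sum of squares
    df = n - k - 1                             # residual degrees of freedom
    se = np.sqrt(np.diag(np.linalg.inv(Xc.T @ Xc)) * sse / df)
    r2 = 1 - sse / sst
    return {
        "beta": beta, "se": se, "t": beta / se, "df": df,
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / df,
        "f": (r2 / k) / ((1 - r2) / df),       # overall significance F statistic
    }

# Synthetic stand-in for the dataset (ASSUMPTION: made-up values, not HILDA).
rng = np.random.default_rng(0)
n = 824
male = rng.integers(0, 2, n).astype(float)
degree = rng.integers(0, 2, n).astype(float)
skill = (rng.random(n) < 0.3 + 0.2 * male).astype(float)  # correlated with male
experience = rng.uniform(0, 40, n)
earnings = (900 + 120 * male + 300 * degree + 250 * skill
            + 8 * experience + rng.normal(0, 200, n))

# Model A: earnings on gender only, tested with the critical value approach.
model_a = ols(earnings, male)
T_CRIT = 1.963                                 # approximate t(0.025, df=822)
reject_a = abs(model_a["t"][1]) > T_CRIT       # True means reject H0: slope = 0

# Model B: earnings on gender, degree, skill and experience.
X_b = np.column_stack([male, degree, skill, experience])
model_b = ols(earnings, X_b)

# Question 6: degree holder, highly skilled, 10 years of experience.
x_male = np.array([1, 1, 1, 1, 10])            # [intercept, male, degree, skill, exp]
x_female = np.array([1, 0, 1, 1, 10])
pred_male = x_male @ model_b["beta"]
pred_female = x_female @ model_b["beta"]

print("Model A slope on male:", model_a["beta"][1], "reject H0:", reject_a)
print("Model B coefficients:", model_b["beta"])
print("R-squared (A):", model_a["r2"], "adjusted R-squared (B):", model_b["adj_r2"])
print("Predicted earnings, male vs female:", pred_male, pred_female)
```

In Model A the slope on `male` equals the raw difference in mean earnings between men and women. In Model B it is the gap with degree, skill and experience held fixed, which is why the two predictions in Question 6 differ by exactly the Model B gender coefficient.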