Analysis of Educational Attainment and Gender

Frequency distribution of top-coded number of siblings variable

The first two questions use educ and sibs1 (both quantitative variables). educ ranges from 0 to 20 and captures the respondent’s educational attainment in years. We recoded sibs into sibs1, in which 15 to 34 siblings are top-coded as 15 siblings. Use sibs1 for the rest of this assignment.

1. Generate a frequency table showing the distribution of the top-coded number of siblings variable (sibs1). Briefly describe the distribution. Interpret in words the relative frequency of having 2 siblings, expressed as a ratio. (2-3 sentences) [5 pts]
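As a rough illustration of what top-coding and a relative-frequency table involve, here is a minimal Python sketch. The sibling counts are invented for the example and are not the assignment data; the helper name frequency_table is hypothetical. In Stata, the tab command mentioned later in this assignment produces the equivalent table.

```python
from collections import Counter

# Hypothetical raw sibling counts (made-up values, NOT the assignment data).
sibs = [0, 1, 2, 2, 3, 2, 5, 16, 1, 34]

# Top-code: any value of 15 or more becomes 15, mirroring the sibs -> sibs1 recode.
sibs1 = [min(s, 15) for s in sibs]


def frequency_table(values):
    """Return rows of (value, count, relative frequency), sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(v, c, c / n) for v, c in sorted(counts.items())]


table = frequency_table(sibs1)
for value, count, rel in table:
    print(f"{value:>5} {count:>5} {rel:>8.3f}")
```

In this toy list the relative frequency for 2 siblings is 0.3, which can be read as "3 out of every 10 respondents have two siblings."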

2. Generate a histogram showing the distribution of respondent's educational attainment in number of years (educ). Paste your output. Briefly describe. (2-3 sentences) [5 pts]

3. Generate the mean, standard deviation, median, and range for respondent education, mother’s education, and father’s education, by gender. Place results in a table and include the table in your response (you can copy/paste output directly from Stata or other software programs). Briefly describe key patterns and whether or not there are any noticeable differences by gender. (~3 sentences) [8 pts]
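The grouped summary statistics requested above can be sketched in a few lines of Python. The records below are invented, the helper name summarize is hypothetical, and range is computed here as maximum minus minimum (an assumption; your software may report the minimum and maximum separately).

```python
import statistics

# Hypothetical (sex, educ) records; values are invented for illustration only.
records = [("male", 12), ("male", 16), ("male", 14),
           ("female", 12), ("female", 18), ("female", 13), ("female", 13)]


def summarize(values):
    """Mean, sample SD, median, and range (max - min) for one group."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),      # sample SD (n - 1 denominator)
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }


# Group the education values by sex, then summarize each group.
by_sex = {}
for sex, educ in records:
    by_sex.setdefault(sex, []).append(educ)

summary = {sex: summarize(vals) for sex, vals in by_sex.items()}
for sex, stats in summary.items():
    print(sex, stats)
```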

4. Perform an independent samples t-test to answer the question: do women have, on average, a statistically significantly different level of education than men? Discuss key results. (2-3 sentences) [5 pts]
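For intuition about what an independent samples t-test computes, the following is a minimal sketch of the equal-variance (pooled) version, using made-up samples. The function name pooled_t is hypothetical, and the critical value 2.306 is the two-sided 95% value from a t table for 8 degrees of freedom, which applies only to this toy sample size.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Equal-variance two-sample t statistic; returns (t, df, mean diff, SE)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / se, n_a + n_b - 2, mean_diff, se


# Hypothetical years of education (invented values, NOT the assignment data).
women = [12, 14, 16, 13, 15]
men = [11, 13, 12, 14, 10]

t_stat, df, diff, se = pooled_t(women, men)

# 95% confidence interval for the mean difference.
T_CRIT_DF8 = 2.306  # two-sided 95% critical value for df = 8, from a t table
ci_low, ci_high = diff - T_CRIT_DF8 * se, diff + T_CRIT_DF8 * se
print(t_stat, df, (ci_low, ci_high))
```

In this toy example the interval includes zero, so the difference would not be significant at the .05 level. The same interval logic applies to the income comparison in the next question.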

5. Perform another t-test to examine whether women have, on average, statistically significantly less income than men. Report the significance level. Briefly interpret the 95% confidence interval. (2-3 sentences) [5 pts]

If you wish, prior to answering the next two questions, you can use the tabulate command to inspect the categories and relative frequencies for fehire [tab fehire] and eqwlth [tab eqwlth]. (Do not include in your responses.)

6. Analyze the relationship between gender (sex) and views about female hiring and promotion in the workplace (fehire). Report the chi-square test statistic. Report key results and interpret the significance level (possible levels are “not significant”, p<.05, p<.01, or p<.001). (2-3 sentences) [6 pts]

7. Now, analyze the relationship between gender and views on the government’s role in reducing income inequality, as measured by the eqwlth variable (7 categories ranging from 1 to 7; 1=govt should reduce income differences; 7=no govt action). The eqwlth variable captures the level of agreement with the following statement: Government should reduce income differences.

How many males would we expect to respond with “no govt action” under the condition of statistical independence? [2 pts]
Compare the previous answer to what we actually observe for males responding “no govt action.” What does this suggest? (1-2 sentences) [4 pts]
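The idea of an expected count under statistical independence (row total times column total, divided by the grand total) can be sketched as follows. The counts are invented, and the table collapses the response into three hypothetical categories for brevity, whereas eqwlth has seven. The function name expected_counts is also hypothetical.

```python
# Hypothetical observed counts (invented values, NOT the assignment data).
observed = {
    "male":   {"reduce": 30, "middle": 40, "no action": 30},
    "female": {"reduce": 50, "middle": 40, "no action": 10},
}


def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = {r: sum(cols.values()) for r, cols in observed.items()}
    col_totals = {}
    for cols in observed.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand = sum(row_totals.values())
    return {r: {c: row_totals[r] * col_totals[c] / grand for c in col_totals}
            for r in observed}


expected = expected_counts(observed)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = sum((observed[r][c] - expected[r][c]) ** 2 / expected[r][c]
                 for r in observed for c in observed[r])
print(expected["male"]["no action"], chi_square)
```

Here 20 males would be expected to answer "no action" but 30 are observed, so in this toy table males are overrepresented in that category relative to independence.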

8. Let's analyze bivariate relationships between sibship (sibs1), respondent's education (educ), and parents' education (maeduc and paeduc). Generate a correlation matrix. Paste your output. Briefly interpret. Should we be concerned about potential multicollinearity in a regression model that includes both father's and mother's education? Why would we, or why would we not, want to test for that? (3-4 sentences) [8 pts]
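As a hedged sketch of what a correlation coefficient measures and how it connects to multicollinearity, the snippet below computes Pearson's r for two invented variables and the variance inflation factor that would apply in a model containing only those two predictors (VIF = 1 / (1 - r^2)). The data and function names are hypothetical; with more than two predictors the VIF depends on the R-squared from regressing one predictor on all the others.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """VIF for either predictor in a model with exactly two predictors."""
    return 1 / (1 - r ** 2)


# Hypothetical parental education in years (invented, NOT the assignment data).
maeduc = [8, 12, 12, 16, 12]
paeduc = [8, 12, 14, 16, 10]

r = pearson_r(maeduc, paeduc)
vif = vif_two_predictors(r)
print(round(r, 3), round(vif, 2))
```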

9. Next, perform a bivariate regression of education on number of siblings (x1=sibs1). Paste the output. (Each part can be answered in 1 sentence.) [10 pts]

Substantively interpret the intercept (Constant).
Substantively interpret the unstandardized beta for the effect of sibs1 on educational attainment.
How was the t-value for the sibs1 beta calculated?
Substantively interpret the t-value.
Based on the regression, how many additional siblings would it take to reduce one's education by exactly one year?
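The quantities referred to in the parts above (intercept, slope, standard error, and t-value) can be traced through a small hand-rolled bivariate OLS. This is a sketch on invented data, not the assignment's output, and the function name simple_ols is hypothetical.

```python
import math


def simple_ols(x, y):
    """Bivariate OLS; returns (intercept, slope, SE of slope, t-value of slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - 2)  # residual variance
    se_slope = math.sqrt(mse / sxx)
    # The t-value is the coefficient divided by its standard error.
    return intercept, slope, se_slope, slope / se_slope


# Hypothetical data (invented values, NOT the assignment data).
sibs1 = [0, 1, 2, 3, 4]
educ = [16, 15, 15, 13, 13]

intercept, slope, se_slope, t_value = simple_ols(sibs1, educ)

# Number of additional siblings associated with one fewer year of education.
siblings_per_year = 1 / abs(slope)
print(intercept, slope, se_slope, t_value, siblings_per_year)
```

With these toy numbers the slope is -0.8, so it would take 1 / 0.8 = 1.25 additional siblings to lower predicted education by one year.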

10. Let’s analyze how number of siblings and mother's education are related to educational attainment. This model simply adds mother’s education (x2=maeduc) to our previous model (in Question 9). Paste the output. [5 pts]

Substantively interpret the unstandardized partial coefficients (including statistical significance).

11. Perform a multiple linear regression of respondent’s income in constant dollars (realrinc) on education and gender. Paste the output. [8 pts]

Substantively interpret the unstandardized partial coefficients and statistical significance. (2-3 sentences)
How does the interpretation from this analysis differ substantively from the t-test we performed in Question 5? (1-2 sentences)

12. Now, we'll run the same model that we ran in Question 10, except that we’ll treat mother's education as a categorical variable rather than one measured in number of years of education. macoldeg is a binary variable coded 1 if the respondent's mother received a BA degree or higher and 0 otherwise. [5 pts]

Run the regression and paste your output.
Substantively interpret the partial coefficient for mother's degree (including statistical significance). (1 sentence)

13. Now, let's use a 4-category measure for mother's education; this one categorizes by highest degree.

What is the reference category for mother's education? [2 pts]
Report and interpret the partial coefficients for mother's education. (1-2 sentences) [3 pts]

14. Finally, let’s add father's education (x3=paeduc) to the model we ran earlier in Question 10, where we treated mother’s educational attainment as a quantitative variable (maeduc).

In your own words, describe how and why the coefficient for mother's education changed from our earlier model (Question 10). Answer in plain language as if explaining to someone with no statistical background. (1-2 sentences) [4 pts]
Write the prediction equation. [2 pts]
What would be the predicted education for a respondent with an average number of siblings, average mother’s education, and average father’s education? [2 pts]
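One algebraic property of OLS with an intercept may be useful when thinking about predictions at the means. The symbols below (a for the intercept, b_1 to b_3 for the partial coefficients of sibs1, maeduc, and paeduc) are generic notation assumed for illustration, not values from the assignment output, and the means are those of the cases actually used in the estimation.

```latex
\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3,
\qquad
a = \bar{y} - b_1 \bar{x}_1 - b_2 \bar{x}_2 - b_3 \bar{x}_3
\;\Longrightarrow\;
\hat{y}\,\big|_{x_1 = \bar{x}_1,\; x_2 = \bar{x}_2,\; x_3 = \bar{x}_3} = \bar{y}
```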

15. Find the standardized partial coefficients for the previous model. (Each part: 1-2 sentences)

Which is greater: The magnitude of the effect of number of siblings, mother’s education, or father’s education on respondent education? Why? [3 pts]
What is the predicted educational attainment for someone who is 1 SD above the mean in number of siblings, 2 SDs below the mean in mother’s education, and average in father’s education? In your answer, write out the prediction equation. [3 pts]
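A standardized prediction equation works in z-score units: the predicted z-score of the outcome is the sum of each standardized coefficient times the predictor's z-score, and it can be converted back to years using the outcome's mean and SD. The sketch below uses invented standardized coefficients and an invented mean and SD for educ; the function name predict_from_standardized is hypothetical.

```python
def predict_from_standardized(betas, z_scores, y_mean, y_sd):
    """Return (predicted z-score of y, prediction in y's original units)."""
    z_y = sum(betas[name] * z_scores[name] for name in betas)
    return z_y, y_mean + z_y * y_sd


# Hypothetical standardized coefficients (invented, NOT the assignment output).
betas = {"sibs1": -0.10, "maeduc": 0.25, "paeduc": 0.25}

# 1 SD above the mean in siblings, 2 SDs below in mother's education,
# and exactly average father's education.
z_scores = {"sibs1": 1, "maeduc": -2, "paeduc": 0}

# y_mean and y_sd are assumed values for educ, chosen only for illustration.
z_y, educ_hat = predict_from_standardized(betas, z_scores, y_mean=14.0, y_sd=3.0)
print(z_y, educ_hat)
```

With these made-up numbers the prediction is 0.6 SDs below the mean, or 14.0 - 0.6 * 3.0 = 12.2 years.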

16. Let’s say that we are analyzing the relationship between X and Y. Even though the relationship between the two variables is linear, we notice that the predicted Y is higher for secondary vs elementary grade students from high-SES families, but lower for secondary vs elementary students from lower-SES families. How is this possible? If we were modeling this relationship in OLS regression, what would we need to include? (2-3 sentences) [5 pts]
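One way to see how the effect of one variable can reverse direction across levels of another is a model containing a product (interaction) term. The sketch below uses invented coefficients and two hypothetical 0/1 indicators (secondary, high_ses); it illustrates the mechanism under those assumptions and is not an estimate from any data.

```python
def predict(secondary, high_ses, b0, b_sec, b_ses, b_inter):
    """Linear prediction with a secondary x high-SES product term."""
    return (b0 + b_sec * secondary + b_ses * high_ses
            + b_inter * secondary * high_ses)


# Hypothetical coefficients, chosen only to illustrate a sign reversal.
coefs = dict(b0=50, b_sec=-4, b_ses=5, b_inter=10)

# Secondary minus elementary difference within each SES group.
gap_low_ses = predict(1, 0, **coefs) - predict(0, 0, **coefs)
gap_high_ses = predict(1, 1, **coefs) - predict(0, 1, **coefs)
print(gap_low_ses, gap_high_ses)
```

The gap for the low-SES group equals b_sec alone, while the gap for the high-SES group equals b_sec + b_inter, so the two can have opposite signs even though each relationship is linear.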
