Learning Objective: Practice writing correctly about multiple regressions.

The Assignment: Run two regressions and interpret the findings. Both regressions should have the same dependent variable. The first regression should include your key independent variable and no controls. The second regression should include the same key independent variable plus at least three control variables.

Obtain a data extract from either the Panel Study of Income Dynamics (PSID) or the General Social Survey (GSS); see the end of this document for instructions on obtaining an extract.

Here is a list of the variables you will need and the allowed types:

1 dependent variable (Y): continuous
1 key independent variable (X): continuous or binary
3 control variables (C1-C3): continuous or binary (at least one of each)

For this assignment, your analysis sample must include at least 500 observations. Drop observations that have missing values for any of your analysis variables (dependent, independent, or control variables).

Use theoretical reasoning to choose your models and your sample. You do not need to do any reading or research about theories in your topic area; just use reasoning. Think about why, theoretically, the key independent variable you chose could influence your dependent variable. Make sure to pick an independent variable that you expect to have an effect on the dependent variable. Avoid reverse causality concerns to the best of your ability. For example, if you want to study the relationship between education and income, don’t let education be the dependent variable: people generally choose their education first and then earn an income that reflects it, not the other way around.
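The sample rules stated earlier (drop observations with missing values, keep at least 500) might be sketched in Stata as follows. This is only an illustration: the variable names wage, educ, age, female, and married are hypothetical stand-ins for Y, X, and C1-C3, and the missing-value codes in the comment are assumptions, so check the codebook for your own extract.

```stata
* Hypothetical variable names: wage (Y), educ (X), age, female, married (C1-C3).
* Replace them with the variables in your own PSID or GSS extract.

* Survey extracts sometimes store "don't know" or "refused" as numeric codes
* rather than as Stata missing values. If your codebook says so, recode them
* first, for example (the codes 98 and 99 are assumptions):
*     mvdecode educ, mv(98 99)

* Drop observations with a missing value on any analysis variable.
drop if missing(wage, educ, age, female, married)

* Confirm that at least 500 observations remain; the do-file stops here if not.
count
assert r(N) >= 500
```

Putting the assert in the .do file means the .log file documents that the 500-observation requirement was checked.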
Before choosing which control variables to include in your analysis, think about what confounding factors could affect the relationship between your key independent variable and your dependent variable. For example, if you are interested in the relationship between education and income, relevant confounding factors include age and characteristics of individuals that are known to be subject to discrimination (gender, race/ethnicity).

Finally, restrict your sample to the most relevant observations, keeping in mind that you need at least 500 observations. For example, if your dependent variable is wage income, restrict your sample to working-age adults.

You will be testing 2 hypotheses in this report. In each case, the null hypothesis is no relationship.

Hypothesis 1: The key independent variable is significantly related to the dependent variable in an unadjusted model (no controls).
Hypothesis 2: The key independent variable is significantly related to the dependent variable after adjusting for at least 3 control variables.

You will turn in an individual report that includes the following components:

Table 1: Descriptive statistics. This table should include all of the dependent, independent, and control variables in your regressions. For each variable, show the mean and standard deviation for the sample. For all means and standard deviations, show 2 digits after the decimal point (even if they are 0s). A template for what the table should look like is shown at the end of these instructions. All variables and samples in the table should be labeled clearly; the table should speak for itself. In addition, document how you transformed each variable in an appendix or with footnotes or table notes.

Table 2: Regression results. This table should present the results of two regressions. For each regression, include all coefficients and corresponding standard errors, the adjusted R², and the sample size. All coefficients, standard errors, and adjusted R² values should show 3 digits after the decimal point (even if all digits are 0s). A template for what the table should look like is shown at the end of these instructions. All variables should be labeled clearly. Significance should be indicated with asterisks, and a note should indicate what the asterisks mean.

1-2 paragraphs describing the data and sample you selected to study, and the variables listed in Table 1. BRIEFLY explain your hypotheses. Indicate the significance level you use to evaluate your results.

1-2 paragraphs interpreting the regression results shown in Table 2. Interpret the estimated slopes on your key independent variable and state whether they are consistent with your hypotheses. Be sure to show us that you know how to interpret the estimated coefficients for at least one (statistically significant) continuous variable and at least one (statistically significant) binary variable.[1] Describe each regression in terms of its adjusted R² as well.

Finally, include your Stata .log file as an appendix to your report. Make sure you provide lots of comments in your .do file so it is easy to follow what you are showing in your .log file.

[1] If you have no continuous or binary variable coefficients that are significant, you can say (just to show us that you know how to do it): “If this estimate had been statistically significant, it would have meant that…”
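As a rough illustration of the workflow this assignment describes, a commented .do file might look like the sketch below. It is not a required template. The variable names (wage, educ, age, female, married) and the log file name are hypothetical, and the sketch assumes the extract is already loaded and the analysis sample is already built.

```stata
* Hypothetical variable names: wage (Y), educ (X), age, female, married (C1-C3).
* Assumes the data are loaded and the analysis sample is already constructed.

* Open a plain-text log; the file name is only an example.
log using "regression_report.log", text replace

* Table 1: means and standard deviations for every analysis variable.
summarize wage educ age female married

* Model 1: unadjusted regression (key independent variable only).
regress wage educ
estimates store m1

* Model 2: adjusted regression (key independent variable plus three controls).
regress wage educ age female married
estimates store m2

* Coefficients and standard errors side by side, 3 decimals,
* with adjusted R-squared and sample size for each model.
estimates table m1 m2, b(%9.3f) se(%9.3f) stats(r2_a N)

log close
```

In this sketch, the Model 2 coefficient on educ (continuous) would be read as the expected change in wage for a one-unit increase in educ, holding the controls constant. The coefficient on female (binary) would be read as the expected difference in wage between the group coded 1 and the group coded 0, holding the other variables constant. The estimates table command cannot show asterisks and standard errors together, so the significance asterisks for Table 2 would need to be added from the p-values in the regress output.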