Assignment #1: Prepare Descriptive Statistics Data Analysis Plan
Before conducting any statistical analyses, researchers develop a plan for how they will analyze their data to answer their research questions. The purpose of this assignment is to give you experience developing a descriptive statistics analysis plan. Note: This first assignment is a plan only; no statistics will be calculated or graphs created. The second assignment will involve carrying out the plan after you receive feedback from your instructor.
Step #1: Review the provided STAT200 data set file. (Note: This data set will be used for all three of this term’s written assignments).
The provided data set is a subsample of 30 data points from the US Department of Labor’s Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures (https://www.bls.gov/cex/). Detailed information on the sample and variables is included with the data set file; please review this information carefully to familiarize yourself with the data. (Note: This information will be used in Assignment #2 to describe the data set.)
Step #2: Develop descriptive statistics data analysis plan.
Task 1: Develop scenario. Imagine that you are the head of a household and have to determine a household budget plan based on the data available in the data set. For instance, you are a 35-year-old single parent with a high school diploma and one child. The "scenario" you describe only explains why you might be motivated to do this analysis. It does NOT have to be correct and/or true. As an example:
Task 2: Select variables for analysis that match the scenario developed in Task 1. The data set provides information on household consumption; there are socioeconomic variables and expenditure variables. The socioeconomic variable names start with “SE-” and the expenditure variable names start with “USD”; all expenditures are in US dollars. All students must use income as one variable. Select two additional socioeconomic variables (one qualitative and one quantitative) and two expenditures for your analysis that match the scenario you developed for Task 1.
For instance, using the example scenario of a 35-year-old single parent with a high school diploma and one child, you could select “income,” “education,” and “number of children” as socioeconomic variables and then pick two household expenditure items to show the distribution of costs and compare that with your income. For this assignment, though, select only variables that are included in our section’s data set (income, education, and number of children may or may not be in our data set; these are just example variables included to aid understanding).
When selecting variables, think about the following three questions:
1. Why am I choosing these variables?
2. What interests me about these variables?
3. What do I think will be the outcome?
Answer these questions in the section on the template labeled: “Reason(s) for Selecting the Variables and Expected Outcome(s):”.
Task 3: Determine appropriate measures of central tendency and dispersion for the selected variables. For each quantitative variable, select at least one measure of central tendency and at least one measure of dispersion (please see the table below for a list of measures). For the qualitative variable, select one measure of central tendency. When determining the measures of central tendency and dispersion, think about what is appropriate given the level of measurement and type of variable. We recommend referring to the text and the information posted in our LEO classroom to help with this task. (Note: You will use this information to provide a rationale for your choice of measures.)
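This assignment requires no calculations, but a short sketch may help show how the type of variable limits which measures make sense. The Python below is illustrative only: the values, variable names, and helper functions are hypothetical and are not taken from the course data set. It assumes a quantitative variable can be summarized by a mean or median plus a range or sample standard deviation, while a nominal qualitative variable is summarized by its mode.

```python
import statistics

# Hypothetical values for illustration only; NOT from the course data set.
income = [42000, 55000, 61000, 38000, 97000, 55000]  # quantitative
education = ["High School", "Bachelor's", "High School",
             "Associate's", "High School"]            # qualitative (categories)


def summarize_quantitative(values):
    """Return common central tendency and dispersion measures for numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sample_std_dev": statistics.stdev(values),
    }


def summarize_qualitative(values):
    """Return the mode, the usual central tendency measure for categories."""
    return {"mode": statistics.mode(values)}


if __name__ == "__main__":
    print(summarize_quantitative(income))
    print(summarize_qualitative(education))
```

In this made-up sample, one large income pulls the mean above the median, which is the kind of observation a rationale for choosing the median over the mean might rest on. A mean of the education categories would be meaningless, which is why only the mode is computed for that variable.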