First deliverable for final project
Your first deliverable should include a Word document that has the following parts. You will also separately submit the do-file that created all of the results and the log file it produced.
1) A description of the data set you will use for your primary analysis. This should include a description of the primary data source (the American Community Survey), the year(s) of the data, and the restrictions that I described above. Note: if you are investigating a "household level" variable (e.g. homeownership, home value, rent), you should use only the head of the household (pernum==1) for your analysis. If you are investigating an "individual level" variable (e.g. employment, marital status), use both the reference person and his/her spouse.
2) A description of any sample restrictions that you are making (e.g. omitting people in certain age ranges, dropping people with missing data, etc.). Make sure that you have "cleaned" your data so that you have the same number of observations on the dependent and control variables. Also, describe any variables that you create or modify. For example, be sure to study the codebook to understand how missing values are coded for each variable, and drop observations with those codes. You may also want to adjust the units that certain variables are measured in (e.g. housing values might be converted to 1000s of dollars instead of dollars).
3) A description of the dependent variable and the key control variable in your analysis and a discussion of why you believe the key control variable will have either a positive or negative effect on the dependent variable. You should show the relationship between the dependent variable and the key control variable of interest with both a table and a graph. The table and graph should be professionally designed and self-explanatory. That is, anyone who looks at the table or graph should be able to determine what statistics are presented, the meaning of the variables, and where the data came from.
4) A description of at least two other control variables that you believe will help explain variation in your dependent variable. Describe why you expect the effect of each control variable on the dependent variable to be either positive or negative. Include a professional table showing, for the dependent variable and each control variable in your data, the number of observations, the sample mean, the standard deviation, and the minimum and maximum values. Be sure to explain any modifications you made to the variables in the data set (e.g. did you have to recode variables that were missing? Did you have to convert a categorical variable to a continuous variable? Did you make dummy variables?).
5) A simple linear regression of your dependent variable on the key variable(s) of interest and other control variables that you believe are important. The results should be presented in a professional table created with the Stata routine esttab. The results should include coefficients, t-statistics, R² and adjusted R², and the number of observations.
6) The structure of your Word document should include the following sections.
(a) Introduction.
(i) Describes the objective of the deliverable and a couple of the major findings. For example, this study will use data from the ACS to understand the factors that determine whether people are employed. We find several important determinants of employment. For example, .....
(b) The data
(i) Description of ACS data.
(ii) The dependent variable to be studied, the key control variable, and other controls you think are important.
(iii) Expected effect of each of the control variables and why.
(iv) Sample restrictions for your analysis.
(v) Any modifications you made to the variables that you use in your analysis.
(c) Results of data analysis.
(i) The Word document should provide a brief discussion of the results in the tables and figures listed below. All the tables and figures should be numbered and added to the end of your Word document.
(ii) Table 1. A professional table showing sample statistics (means, min, max, observations, etc.)
(iii) Figure 1. A professional figure showing relationship between dependent variable and key control variable.
(iv) Table 2. A professional table showing relationship between dependent variable and key control variable.
(v) Table 3. A professional table showing regression results.
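The steps for the first deliverable could be organized in a do-file along the following lines. This is only a sketch under stated assumptions, not a required template: the file names (acs_extract.dta, deliverable1.log, and so on), the variables valueh, hhincome, and age, the 25-55 age range, and the missing-value code 9999999 are illustrative placeholders to be replaced with what your own extract and codebook show. Only pernum==1 comes from the instructions above. The esttab and eststo commands belong to the user-written estout package.

```stata
* Sketch of a first-deliverable do-file. All names below are assumptions.
* One-time setup for esttab/eststo:  ssc install estout
clear all
capture log close
log using "deliverable1.log", replace text

use "acs_extract.dta", clear          // hypothetical ACS extract

* Household-level outcome: keep only the head of household
keep if pernum == 1

* Example sample restriction (age range is a placeholder)
keep if inrange(age, 25, 55)

* Drop missing data. 9999999 is a placeholder code: confirm it in the codebook.
drop if valueh == 9999999 | hhincome == 9999999
drop if missing(valueh, hhincome, age)

* Rescale to 1000s of dollars and attach readable labels
generate double valueh_k   = valueh / 1000
generate double hhincome_k = hhincome / 1000
label variable valueh_k   "Home value (1000s of dollars)"
label variable hhincome_k "Household income (1000s of dollars)"
label variable age        "Age of household head (years)"

* Table 1: observations, mean, SD, min, max for every variable used
tabstat valueh_k hhincome_k age, statistics(n mean sd min max) columns(statistics)

* Figure 1: dependent variable against the key control variable
twoway (scatter valueh_k hhincome_k) (lfit valueh_k hhincome_k), ///
    ytitle("Home value (1000s of dollars)")                       ///
    xtitle("Household income (1000s of dollars)")
graph export "figure1.png", replace

* Table 2: mean of the dependent variable by quartile of the key control
xtile income_q = hhincome_k, nquantiles(4)
tabstat valueh_k, by(income_q) statistics(n mean sd)

* Table 3: regression results with t-statistics, R2, adjusted R2, and N
regress valueh_k hhincome_k age
eststo m1
esttab m1 using "table3.rtf", replace label b(3) t(2) r2 ar2 ///
    title("Determinants of home value")

log close
```

Because the restrictions are applied before any statistics are computed, the summary table and the regression are based on the same observations, which is what the cleaning requirement in part 2 asks for.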
Second deliverable for final project
Along with other students who have chosen the same primary topic, create a single table of regression analysis and discuss the results in the text of your paper. Be sure to correct any problems that were mentioned in the review of your first draft. Your grade on this deliverable will be based on the content of your analysis and on whether you are able to generate a document that is professional in its appearance and content. Your intended audience is someone who has mastered the content in Economics 311.
1. Provide at least 2 tests of alternative specifications (e.g. log vs linear, linear vs quadratic, dummy variables vs continuous, etc.). In the text, describe the specifications you compared and the preferred specification based on your analysis. Present the results of your regression analysis and the relevant test statistics in a professional table. Be sure to discuss the results of your analysis in the text.
2. For each specification considered in part (1), provide a Breusch-Pagan test and the simple version of the White test (2nd form discussed in notes) for heteroscedasticity. Include the test statistic and corresponding p-values for these test statistics in your regression table. In the text of your deliverable, describe the basis for the conclusions you draw from your heteroscedasticity tests. If you find heteroscedasticity, are there specific characteristics that cause the variance of the residual to be higher or lower? Explain how you came to this conclusion.
3. If there is evidence of heteroscedasticity, provide standard errors (or t-statistics) that are properly corrected by using robust standard errors. However, if you are estimating a linear probability model, use weighted least squares (instead of robust standard errors) if possible, but be sure to investigate whether WLS would result in negative weights. If WLS results in negative weights, discuss how you determined this.
4. Discuss whether your expected effects for your key control variable and at least two others that you included in your first deliverable are confirmed by the preferred specification you identified above. Discuss whether these effects are statistically significant at the 0.05 level.
5. Discuss the "economic significance" of the effects for your two control variables. For example, describe the effect of a one standard deviation change in continuous control variables on the dependent variable; or a switch from 0 to 1 for a dummy variable.
6. In order that your table be deemed professional, review the document posted on my website. The regression table should be self-explanatory. The reader should be able to determine what kind of regression was estimated, how the sample was created, and what all the variables measure without referring to the text.
Examples of appropriate tables are provided in the document "Creating effective tables" that is posted in the Canvas project module. Specific elements of the table that should make the table self-explanatory are as follows:
a. Title (make it clear what the table is about, e.g. Determinants of Household Electricity Expenditures in 2016).
b. Column and row headings (See sample tables for examples of relevant column headers).
c. Notes attached to the table that explain
i. The source of data for the table, including relevant sample restrictions (e.g. households aged 25-55 who have a mortgage).
ii. Whether the table has t-statistics or standard errors in parentheses and the type of regression (e.g. OLS, linear probability model estimated with OLS, etc). If robust standard errors are used for calculation of standard errors or t-statistics, make that clear (e.g. t-statistics are in parentheses. Robust standard errors are used in specifications 3 and 4.).
iii. Anything that needs to be clarified about variables or methods can be stated in the list of variables or footnotes to the table. (e.g. income is measured in 1000s of 2016 dollars)
iv. See Miller for examples of how to use notes in tables. Notice that notes in tables are referenced with a letter (not a number).
d. Variable names that are easily understood. Some examples:
i. Years of education, not educ
ii. Number of children, not NCHILD
iii. Household Income in 2016 dollars, not income
iv. Be sure units of measurement are clear (e.g. 1000s of dollars, birthweight in pounds).
v. If you are using dummy variables for categories, group them together and make it clear which dummy was omitted.
e. Make it clear what the dependent variable is and also indicate sample size and either R² or adjusted R² (or both).
f. Regression tables may also present only a subset of coefficients and mention in a note or row whether other variables were included in the regression. See the sample tables provided on the projects webpage. Miller (table 4.2) provides guidance on the appropriate number of digits. If coefficients are "too small", rescale the relevant control variable to adjust (e.g. measure income in 1000s of dollars instead of dollars).
g. Regression tables should start at the top of the page, and if they must span across pages, be sure to "break" them at a reasonable point (e.g. don't split a row with coefficients on one page and t-statistics on the next).
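The specification comparison, heteroscedasticity tests, and table notes described in parts 1 through 6 could be assembled roughly as follows. This is a hedged sketch: the variable names (valueh_k, hhincome_k, age), the output file table1.rtf, and the table title and note text are assumptions, and only one of the two required specification comparisons (linear vs quadratic) is shown. The "simple" White test is assumed here to mean the special form that regresses the squared residuals on the fitted values and their squares; check that against the course notes. Robust standard errors are applied unconditionally for brevity, whereas part 3 calls for them only when the tests indicate heteroscedasticity.

```stata
* Sketch for the second deliverable. Variable and file names are assumptions.
* esttab, eststo, and estadd come from the estout package (ssc install estout).
eststo clear

* ---- Specification 1: linear in income ----
regress valueh_k hhincome_k age
predict double yhat1, xb
predict double uhat1, residuals

* Breusch-Pagan test on all regressors, N*R-squared version
estat hettest, rhs iid
local bp1   = r(chi2)
local bp1_p = r(p)

* Special-form White test: squared residuals on fitted values and their squares
generate double uhat1_sq = uhat1^2
generate double yhat1_sq = yhat1^2
quietly regress uhat1_sq yhat1 yhat1_sq
local wh1   = e(N) * e(r2)              // LM statistic with 2 degrees of freedom
local wh1_p = chi2tail(2, `wh1')

* Re-estimate with robust standard errors and attach the test results
regress valueh_k hhincome_k age, vce(robust)
estadd scalar bp   = `bp1'
estadd scalar bp_p = `bp1_p'
estadd scalar wh   = `wh1'
estadd scalar wh_p = `wh1_p'
eststo linear

* Economic significance: effect of a one-standard-deviation change in income
quietly summarize hhincome_k if e(sample)
display "Effect of a one-SD change in income: " _b[hhincome_k] * r(sd)

* ---- Specification 2: quadratic in income ----
regress valueh_k c.hhincome_k##c.hhincome_k age
predict double yhat2, xb
predict double uhat2, residuals
estat hettest, rhs iid
local bp2   = r(chi2)
local bp2_p = r(p)
generate double uhat2_sq = uhat2^2
generate double yhat2_sq = yhat2^2
quietly regress uhat2_sq yhat2 yhat2_sq
local wh2   = e(N) * e(r2)
local wh2_p = chi2tail(2, `wh2')

regress valueh_k c.hhincome_k##c.hhincome_k age, vce(robust)
test c.hhincome_k#c.hhincome_k          // linear vs quadratic
estadd scalar bp   = `bp2'
estadd scalar bp_p = `bp2_p'
estadd scalar wh   = `wh2'
estadd scalar wh_p = `wh2_p'
eststo quadratic

* ---- One table with both specifications, test statistics, and notes ----
esttab linear quadratic using "table1.rtf", replace label b(3) t(2)       ///
    mtitles("Linear" "Quadratic")                                          ///
    stats(N r2 r2_a bp bp_p wh wh_p, fmt(0 3 3 2 3 2 3)                    ///
        labels("Observations" "R-squared" "Adjusted R-squared"             ///
               "Breusch-Pagan statistic" "Breusch-Pagan p-value"           ///
               "White statistic" "White p-value"))                         ///
    title("Determinants of home value")                                    ///
    addnotes("OLS. Robust t-statistics in parentheses."                    ///
             "Source and sample restrictions go here.")
```

The test results are copied into local macros before the next command runs because the auxiliary regression and the robust re-estimation overwrite the stored r() and e() results.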
The sections of your paper should include the following.
1. Title page (title, authors, date)
2. Introduction: Summarize what was learned in first deliverable and what is added in this deliverable.
3. Discussion of results from new regression analysis (Table 1).
a. Different specifications considered and the test statistics you used to decide between the specifications.
b. The implications of the heteroscedasticity tests for each specification (i.e. do you find evidence of heteroscedasticity) and how this affected your decision to use robust standard errors or weighted least squares in each specification.
4. Discussion of whether the key control variable and at least two others fit your prior expectations. Be sure to explain why you expected the effect of the control variable on the dependent variable to be either positive or negative.
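
For a linear probability model, the choice between weighted least squares and robust standard errors depends on whether the fitted probabilities stay strictly between 0 and 1, since the WLS weights are 1/[p(1-p)] and become negative or undefined otherwise. A minimal sketch of that check is below; the variables employed (a 0/1 outcome), educ_years, and age are hypothetical names used only for illustration.

```stata
* Sketch: is WLS feasible for a linear probability model?
* employed, educ_years, and age are hypothetical variable names.
regress employed educ_years age
predict double phat, xb

* Count fitted values that would give a negative or undefined weight
count if e(sample) & (phat <= 0 | phat >= 1)

if r(N) == 0 {
    * All fitted probabilities lie strictly inside (0,1): WLS is feasible
    generate double w = 1 / (phat * (1 - phat))
    regress employed educ_years age [aweight = w]
}
else {
    * Some weights would be negative: fall back on robust standard errors
    display "WLS not feasible: fitted values outside (0,1)"
    regress employed educ_years age, vce(robust)
}
```

The count reported by this check is one way to document in the text how you determined whether WLS would produce negative weights.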