Run a proper regression analysis considering the


Assignment - Use Stata to answer the following questions. Save .do files with comments explaining your methods and logic. All answers and necessary tables should be included in the document.

Question 1 -

Suppose you are interested in the impact of some predictors on being in an honors math program (honors; 1 = in the honors math program; 0 = not in the honors math program). The first set of predictors consists of individual background variables (female, ethnicity, and socioeconomic status). The second set of predictors, which you need to enter on top of the individual background variables, consists of science and reading scores. Finally, the third block of predictors that you want to add includes school type (sctyp; 1 = public, 2 = private) and program (prog; 1 = general, 2 = academic, 3 = vocation). Data for N = 200 students are in Q1honors.csv. Answer the following questions.

a. Run a proper regression analysis (from start to finish, including data exploration, distributions, testing assumptions, checking for missing values, etc.) that suits the distribution of the outcome variable, entering the three blocks of predictors in turn, and report the results of the three models in an APA-formatted table.

b. Formally compare the first model and the third model using a proper test statistic.

c. Using the first model, calculate the probability of being in the honors math program for a Hispanic male whose SES level is high, and test it. Report all results.

d. Interpret the regression coefficient of female in the best-fitting model in terms of the odds of being in the honors program. Provide a rationale for why you chose that model as the best-fitting model.

e. Address multicollinearity issues for the best-fitting model among the three models tested.

f. Examine outliers, leverage, and influence for the best-fitting model after you have addressed the multicollinearity problem.
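
As a rough starting point, parts a through f might be laid out in a .do file along the following lines. This is only a sketch under stated assumptions: the assignment names honors, sctyp, and prog, but the variable names female, race, ses, science, and read, and the category codes used in the margins call, are guesses that should be checked against the actual file. The third model stands in for whichever model turns out to fit best.

```stata
* Sketch only. Assumed (not given in the assignment): the variable names
* female, race, ses, science, read, and the codes race==1 (Hispanic), ses==3 (high).
import delimited "Q1honors.csv", clear

* Part a: data exploration and missing-value check
describe
summarize
misstable summarize
tabulate honors

* Block 1: individual background variables
logit honors i.female i.race i.ses
estimates store m1

* Part c: predicted probability for a Hispanic male with high SES
* (assumed codes), reported with its standard error and z test
margins, at(female=0 race=1 ses=3)

* Block 2: add science and reading scores
logit honors i.female i.race i.ses science read
estimates store m2

* Block 3: add school type and program
logit honors i.female i.race i.ses science read i.sctyp i.prog
estimates store m3

* Part b: likelihood-ratio test of model 1 against model 3
* (both models must be fit on the same observations)
lrtest m1 m3

* Part d: replay the last model with odds ratios
logit, or

* Part e: a simple first look at collinearity between the two scores
correlate science read

* Part f: standardized residuals, leverage, and influence for the last model
predict rstd, rstandard
predict lev, hat
predict db, dbeta
summarize rstd lev db
```

Because honors is binary, the sketch uses logit throughout; the stored estimates m1, m2, and m3 can then feed whatever table-building approach is used for the APA-formatted table.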

Now you'll run a different model for the following questions.

g. Now conduct a proper analysis to answer the following research question: Does the SES effect on the outcome differ across school type while controlling for female, ethnicity, science, and reading scores?

h. Conduct a proper analysis to evaluate whether reading score mediates the impact of science score on the outcome.
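
Parts g and h could be approached roughly as sketched below. The variable names female, race, ses, science, and read are again assumptions, and the causal-steps layout for part h is only one of several possible mediation approaches; the assignment does not say which one is expected.

```stata
* Sketch only. Assumed (not given in the assignment): the variable names
* female, race, ses, science, read.
import delimited "Q1honors.csv", clear

* Part g: does the SES effect differ across school type?
* ## enters both main effects and their interaction
logit honors i.ses##i.sctyp i.female i.race science read

* Joint Wald test of the SES-by-school-type interaction terms
testparm i.ses#i.sctyp

* Part h: causal-steps layout for mediation
* Step 1: total effect of science on the outcome
logit honors science

* Step 2: effect of science on the proposed mediator
regress read science

* Step 3: effect of science with the mediator in the model
logit honors science read
```

One caution on part h: logit coefficients from models with different predictors are not on a common scale, so comparing the science coefficient between steps 1 and 3 gives only a rough indication of mediation.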

Question 2 -

Long (1990, 1997) investigates factors affecting the research productivity of doctoral students in biochemistry. The response variable in this investigation, art, is the number of articles published by the student during the last three years of his or her PhD program. The explanatory variables are as follows:

fem Gender: dummy variable - 1 if female, 0 if male

mar Marital status: dummy variable - 1 if married, 0 if not

kid5 Number of children five years old or younger

phd Prestige rating of PhD Department

ment Number of articles published by mentor during the last 3 years

Long's data (on 915 biochemists) are in the file Q2phd.csv.

a. Examine the distribution of the response variable. Based on this distribution, does it appear promising to model these data by linear least-squares regression? Perhaps after transforming the response? Explain your answer.

b. Following Long, perform a Poisson regression of art on the explanatory variables. What do you conclude from the results of this regression?

c. Perform regression diagnostics on the model fit in the previous question. If you identify any problems, try to deal with them. Are the conclusions of the research altered?

d. Refit Long's model allowing for overdispersion (using a quasi-Poisson or negative binomial model). Does this make a difference to the results?
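
A minimal .do-file sketch for parts a through d follows. It assumes that Q2phd.csv is in the working directory and that its column names match the variable list above (art, fem, mar, kid5, phd, ment). Stata has no command named quasi-Poisson, so the glm call with Pearson-scaled standard errors is offered here as a close analogue, not as the required method.

```stata
* Sketch only. Assumes the CSV column names match the variable list
* in the assignment: art, fem, mar, kid5, phd, ment.
import delimited "Q2phd.csv", clear

* Part a: distribution of the count outcome
summarize art, detail
tabulate art
histogram art, discrete

* Part b: Poisson regression, then replay as incidence-rate ratios
poisson art fem mar kid5 phd ment
poisson, irr

* Part c: deviance and Pearson goodness-of-fit tests for the Poisson model
estat gof

* Part d, option 1: Poisson GLM with standard errors scaled by the
* Pearson chi-squared dispersion (a quasi-Poisson analogue)
glm art fem mar kid5 phd ment, family(poisson) link(log) scale(x2)

* Part d, option 2: negative binomial regression; the output includes
* a likelihood-ratio test of alpha = 0 (no overdispersion)
nbreg art fem mar kid5 phd ment
```

Comparing the standard errors from poisson with those from the glm and nbreg fits is one way to see whether allowing for overdispersion changes the conclusions.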
