Topic: Statistics with "R" coding - Categorical Variables, Understanding Predictors, and Moderated Multiple Regression
Part I: Categorical Variables
For this part of the exercise, please read in the JobCharsPlus.csv data file. It contains most of the variables from last week's dataset, as well as a couple of new ones. One of the new ones is SomComp. This variable indicates each person's most frequently experienced somatic complaint for the day on which the survey was administered. Here is a table that provides a key to the coded categories:
0 = No complaint
1 = Backache
2 = Headache
3 = Shortness of breath
4 = Acid indigestion or heartburn
5 = Upset stomach or nausea
6 = Tiredness or fatigue
Using the "No complaint" category as the referent group, please create an appropriate number of dummy coded variables to represent these 7 somatic complaint categories. If you're not sure where to start, you'll need to use the ifelse function from EX1 (and EX4) to generate your dummy coded variables.
Once you have these variables, run a linear model (i.e., a regression model using the lm function in R) that examines whether, and which, somatic complaints are associated with Counterproductive Work Behavior. How would you interpret the model R² and its associated F-statistic? If any of the b-weights are significant, please provide an interpretation of the effect(s).
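The dummy-coding step can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated rather than read from JobCharsPlus.csv, the outcome column name CWB is hypothetical (check the actual name in the file), and the dummy variable names are made up for the example.

```r
# Simulated stand-in for JobCharsPlus.csv (assumption: the real file has a
# SomComp column coded 0-6; "CWB" is a hypothetical name for the outcome).
set.seed(1)
n <- 210
dat <- data.frame(SomComp = rep(0:6, length.out = n))
dat$CWB <- 2 + 0.5 * (dat$SomComp == 2) + rnorm(n)

# Seven categories need 7 - 1 = 6 dummy variables. The referent group
# ("No complaint", coded 0) is the group that scores 0 on all six.
dat$Backache  <- ifelse(dat$SomComp == 1, 1, 0)
dat$Headache  <- ifelse(dat$SomComp == 2, 1, 0)
dat$Breath    <- ifelse(dat$SomComp == 3, 1, 0)
dat$Heartburn <- ifelse(dat$SomComp == 4, 1, 0)
dat$Nausea    <- ifelse(dat$SomComp == 5, 1, 0)
dat$Fatigue   <- ifelse(dat$SomComp == 6, 1, 0)

fit <- lm(CWB ~ Backache + Headache + Breath + Heartburn + Nausea + Fatigue,
          data = dat)
summary(fit)  # R-squared, overall F-test, and one b-weight per dummy

# With dummy coding, the intercept is the referent group's mean and each
# b-weight is that group's mean minus the referent group's mean.
grp_means <- tapply(dat$CWB, dat$SomComp, mean)
grp_means - grp_means["0"]
```

Comparing the last line with the coefficient table is a quick way to confirm that the dummies were built correctly: each difference from the "No complaint" mean should match the corresponding b-weight.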
Part II: Understanding Predictors
For this part of the exercise, please read in the TMT.csv file. These data, which are mostly fabricated for use in this class, describe a large sample of members of Top Management Teams. Here are the variables that were collected:
Sex: 0 for female; 1 for male
Age: In years
SalCat: Salary Category (in 4 levels, from low [1] to high [4])
Narc: Narcissism (a personality dimension indicating extremely high levels of self-esteem and little empathy for others)
RiskDec: A measure of risk-taking with decisions that involve their organizations
OrgTrust: Level of trust toward their organization
OrgID: Level of organizational identification
JobSat: Level of job satisfaction
Examine a linear model that regresses SalCat on OrgTrust, OrgID, JobSat, and Age. Use all the information that we've discussed in class (e.g., R², adjusted R², b-weights, β-weights, and other indices of predictor importance available through the yhat library in R) to examine how these predictors compare to each other in their capacity to predict SalCat. What do you conclude? Please include any caveats, hesitations, or alternative perspectives that you might have noticed along the way.
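Several of these indices can be computed in base R, which may help when checking package output. The sketch below uses simulated data in place of TMT.csv (the generating values are arbitrary assumptions) and computes b-weights, β-weights, zero-order correlations, and structure coefficients. The yhat call appears only as a comment; confirm its usage against the package documentation.

```r
# Simulated stand-in for TMT.csv (assumption: arbitrary generating values).
set.seed(2)
n <- 300
tmt <- data.frame(OrgTrust = rnorm(n), Age = round(rnorm(n, 50, 8)))
tmt$OrgID  <- 0.6 * tmt$OrgTrust + rnorm(n, sd = 0.8)
tmt$JobSat <- 0.5 * tmt$OrgTrust + 0.3 * tmt$OrgID + rnorm(n, sd = 0.8)
tmt$SalCat <- as.integer(cut(0.4 * tmt$JobSat + 0.03 * tmt$Age + rnorm(n), 4))

vars <- c("SalCat", "OrgTrust", "OrgID", "JobSat", "Age")
fit  <- lm(SalCat ~ OrgTrust + OrgID + JobSat + Age, data = tmt)
s    <- summary(fit)
c(R2 = s$r.squared, adjR2 = s$adj.r.squared)

# Beta weights: refit the same model on z-scored variables.
z     <- as.data.frame(scale(tmt[vars]))
fit_z <- lm(SalCat ~ OrgTrust + OrgID + JobSat + Age, data = z)
beta  <- coef(fit_z)[-1]

# Zero-order correlation of each predictor with the outcome.
r_xy <- cor(tmt[vars])[-1, "SalCat"]

# Structure coefficients: correlation of each predictor with predicted scores.
r_s <- sapply(vars[-1], function(v) cor(tmt[[v]], fitted(fit)))

round(cbind(b = coef(fit)[-1], beta = beta, r = r_xy, r_s = r_s), 3)

# With the yhat package installed, a fuller set of importance metrics
# should be available via:  library(yhat); calc.yhat(fit)
```

Because the predictors are correlated with one another, the rank order implied by the β-weights need not match the order implied by the zero-order correlations or structure coefficients. That kind of disagreement is worth noting among the caveats.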
Part III: Moderated Multiple Regression
For this part of the exercise, use the TMT.csv file once again. To understand the connections between narcissistic personality, life stage, and career success, regress SalCat (a measure of career success) on Narc, Age, and their interaction. Be sure to do everything needed to ensure that you will have the proper interpretation when conducting this moderated multiple regression analysis. What, if anything, can you tell from examining R², the b-weights, and their significance?
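One common preparation step for a moderated regression is mean-centering the predictors before forming the product term. The sketch below illustrates it on simulated data. It is not the TMT.csv data, the generating values are arbitrary assumptions, and SalCat is left continuous here for simplicity.

```r
# Simulated stand-in for TMT.csv (assumption: arbitrary generating values;
# SalCat is kept continuous instead of being cut into 4 categories).
set.seed(3)
n <- 300
tmt <- data.frame(Narc = rnorm(n, 3, 0.7), Age = round(rnorm(n, 50, 8)))
tmt$SalCat <- 2.5 + 0.3 * (tmt$Narc - 3) + 0.02 * (tmt$Age - 50) +
  0.03 * (tmt$Narc - 3) * (tmt$Age - 50) + rnorm(n, sd = 0.7)

# Mean-center both predictors so that each lower-order b-weight is the
# effect of that predictor at the mean of the other one.
tmt$Narc_c <- tmt$Narc - mean(tmt$Narc)
tmt$Age_c  <- tmt$Age  - mean(tmt$Age)

fit_main <- lm(SalCat ~ Narc_c + Age_c, data = tmt)  # main effects only
fit_c    <- lm(SalCat ~ Narc_c * Age_c, data = tmt)  # adds the interaction
fit_raw  <- lm(SalCat ~ Narc * Age, data = tmt)      # uncentered comparison

summary(fit_c)

# Increment in R-squared due to the interaction, and its F-test.
delta_R2 <- summary(fit_c)$r.squared - summary(fit_main)$r.squared
delta_R2
anova(fit_main, fit_c)

# Centering changes the lower-order b-weights but not R-squared or the
# interaction b-weight.
rbind(raw = coef(fit_raw), centered = coef(fit_c))
```

In the uncentered model, the b-weight for Narc is its slope at Age = 0, a value far outside the sample range, which is why the lower-order terms are easier to interpret after centering.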
Now load the pequod library (after installing it if you have yet to do so). Use the lmres function to re-run the same moderated multiple regression model. As I've not specified which variable is considered the predictor and which variable is considered the moderator, just choose the roles that seem to make sense to you. Then, also using the relevant functions in pequod, obtain the simple slopes and a plot for the model that you ran (this information is available in the PowerPoint slides and should be accurate). Now reverse the roles for the predictor and moderator variables and produce new simple slopes and a new plot. Does one plot make more sense than the other? Provide a substantive interpretation of one of the two versions.
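As a cross-check on the package output, simple slopes can also be computed by hand from an lm fit. The sketch below is a base-R illustration, not the pequod workflow itself: the data are simulated, and simple_slopes is a hypothetical helper written for this example. It evaluates the predictor's slope at the moderator's mean and at one standard deviation below and above it, then reverses the roles.

```r
# Simulated stand-in for TMT.csv (assumption: arbitrary generating values).
set.seed(3)
n <- 300
tmt <- data.frame(Narc = rnorm(n, 3, 0.7), Age = round(rnorm(n, 50, 8)))
tmt$SalCat <- 2.5 + 0.3 * (tmt$Narc - 3) + 0.02 * (tmt$Age - 50) +
  0.03 * (tmt$Narc - 3) * (tmt$Age - 50) + rnorm(n, sd = 0.7)
tmt$Narc_c <- tmt$Narc - mean(tmt$Narc)
tmt$Age_c  <- tmt$Age  - mean(tmt$Age)

fit <- lm(SalCat ~ Narc_c * Age_c, data = tmt)

# Hypothetical helper: slope of `pred` at chosen values of `mod`.
#   slope = b_pred + b_int * m
#   SE    = sqrt(Var(b_pred) + 2 * m * Cov(b_pred, b_int) + m^2 * Var(b_int))
simple_slopes <- function(fit, pred, mod, mod_vals) {
  b <- coef(fit)
  V <- vcov(fit)
  # The product term may be named "pred:mod" or "mod:pred".
  int <- intersect(c(paste(pred, mod, sep = ":"), paste(mod, pred, sep = ":")),
                   names(b))
  est <- unname(b[pred]) + unname(b[int]) * mod_vals
  se  <- sqrt(V[pred, pred] + 2 * mod_vals * V[pred, int] +
                mod_vals^2 * V[int, int])
  t   <- est / se
  data.frame(level = names(mod_vals), slope = unname(est), se = unname(se),
             t = unname(t),
             p = unname(2 * pt(-abs(t), df = df.residual(fit))),
             row.names = NULL)
}

# Narc as predictor, Age as moderator.
sd_age <- sd(tmt$Age_c)
by_age <- simple_slopes(fit, "Narc_c", "Age_c",
                        c(low = -sd_age, mean = 0, high = sd_age))
by_age

# Roles reversed: Age as predictor, Narc as moderator.
sd_narc <- sd(tmt$Narc_c)
by_narc <- simple_slopes(fit, "Age_c", "Narc_c",
                         c(low = -sd_narc, mean = 0, high = sd_narc))
by_narc
```

Both tables come from the same fitted model, so the choice of roles changes only how the interaction is described, not the model itself. The ±1 SD levels are a convention rather than a requirement, so other moderator values could be substituted.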
Please note that I'll need the code included on the assignment.
Attachment: Assignment Files.zip