Develop a regression model to investigate


Assignment:

Data Analysis

Background

An American College conducted a study in the early 2000s to examine whether there were any gender pay gaps[1] in its four schools: Business, Health, Liberal Studies and Sciences. Data were collected from a sample of 199 academics on their annual salary, years of service, rank, school, gender and age. The data file, Faculty Salary (Research Report Dataset).xlsx, is available on Blackboard. A small portion of the data is shown below:

Faculty ID   Age   Years of Service   Rank   School     Gender (M/F)   Salary ($/year)
1            49    22                 ASST   BUSINESS   F              106632
2            31    0                  ASST   BUSINESS   F              80000
3            34    2                  ASST   BUSINESS   F              114666
...          ...   ...                ...    ...        ...            ...

Legend: There are three academic ranks in the dataset: ASST, ASSO and PROF, which stand for Assistant Professor, Associate Professor and Professor respectively.

Task 1 (Boxplots and t-tests)

1. (a) Construct separate boxplots of salaries for male and female academics, and compare their distributions (central location, spread and skewness).

(b) Test, at the 1% significance level, whether male academics on average earn more than their female counterparts.

2. (a) Considering assistant professors only, test at the 1% significance level whether male assistant professors on average earn more than female assistant professors.

(b) Considering associate professors only, test at the 1% significance level whether male associate professors on average earn more than female associate professors.

(c) Considering professors only, test at the 1% significance level whether male professors on average earn more than female professors.

Note: In conducting a test, you should briefly discuss whether it is a one- or two-tailed test, state the test statistic and any assumptions made, and draw a conclusion based on the Excel output.
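For readers who want to check the Excel output by hand, the test statistic can be sketched as follows. This is a minimal illustration only: it assumes the unequal-variance (Welch) form of the two-sample t-test, which is one possible choice of assumption and not one the brief prescribes, and the salary figures are hypothetical values that do not come from the dataset. The function name welch_t is likewise an invented helper.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for H0: mean_a - mean_b = 0, assuming unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    se_sq = var_a / n_a + var_b / n_b      # squared standard error of the difference
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical salaries ($/year), invented for illustration only
male = [120000, 135000, 98000, 142000, 110000]
female = [105000, 82000, 115000, 95000, 101000]

t_stat, df = welch_t(male, female)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

Because the alternative hypothesis is that males earn more, the test is upper-tailed: the null hypothesis is rejected at 1% only if t exceeds the one-tail critical value for the computed degrees of freedom, which can be read from t tables or from the Excel output.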

(Bonus Question)

Simpson's Paradox is a type of association paradox. Conduct a Google search to find out more about this topic. Think about the gender gaps observed in Task 1: What is the size of the gender gap in the whole sample? What is the size of the gender gap for each Rank? Discuss if there is any association paradox here.
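To see what an association paradox of this kind can look like, the sketch below uses a small set of invented records (salaries in $000). None of these figures come from the Faculty Salary dataset, and the helper gender_gap is a hypothetical name. The point is only to show the mechanism: a gap in the whole sample can have the opposite sign to the gap within every rank when the two genders are distributed unevenly across ranks.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (rank, gender, salary in $000) records, invented for illustration only
records = (
    [("ASST", "M", 80)] * 1 + [("ASST", "F", 82)] * 4
    + [("PROF", "M", 120)] * 4 + [("PROF", "F", 122)] * 1
)


def gender_gap(rows):
    """Mean male salary minus mean female salary for the given rows."""
    males = [salary for _, gender, salary in rows if gender == "M"]
    females = [salary for _, gender, salary in rows if gender == "F"]
    return mean(males) - mean(females)


# Group the records by rank so the gap can be computed within each rank
by_rank = defaultdict(list)
for row in records:
    by_rank[row[0]].append(row)

overall_gap = gender_gap(records)
rank_gaps = {rank: gender_gap(rows) for rank, rows in by_rank.items()}

print("overall gap:", overall_gap)
for rank, gap in rank_gaps.items():
    print(rank, "gap:", gap)
```

In this invented example the overall gap is 22 in favour of males, yet within each rank the gap is 2 in favour of females, because most of the males are professors and most of the females are assistant professors. Whether anything similar happens in the real data is for you to establish from your own Task 1 results.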

Task 2 (Regression Analysis)

You plan to develop a regression model to investigate how various factors influence academic salaries.

3. Before you conduct any regression analysis, you use Excel to construct a correlation matrix of all the quantitative variables in the dataset. Based on the correlation matrix, comment briefly on the associations between Salary and other quantitative variables.
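As a cross-check on the Excel correlation matrix, the pairwise Pearson correlations can be sketched as below. The sketch uses only the three rows shown in the data excerpt above, so the resulting numbers are not meaningful estimates for the full sample of 199; the helper name pearson and the dictionary layout are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Quantitative variables from the three-row excerpt only (not the full dataset)
columns = {
    "Age": [49, 31, 34],
    "Years of Service": [22, 0, 2],
    "Salary": [106632, 80000, 114666],
}

names = list(columns)
corr = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Print the matrix row by row
for a in names:
    print(a.ljust(17), "  ".join(f"{corr[a][b]:6.3f}" for b in names))
```

Only Age, Years of Service and Salary are included because Rank, School and Gender are categorical; a correlation matrix of this kind applies to the quantitative variables only.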

4. You conduct a stepwise regression according to the following procedure:

Step 1: Gender only

Step 2: Gender and School

Step 3: Gender, School and Rank

Step 4: Gender, School, Rank and Years of Service

Step 5: Gender, School, Rank, Years of Service and Age

Choice of reference category: It is recommended that you choose Health and ASSO as the reference categories for the categorical variables School and Rank, respectively.

Present the regression output for each of the five steps.
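Before the regressions can be run, the categorical variables have to be converted to 0/1 dummy variables, leaving out the reference category of each. The sketch below shows one way the Step 4 predictors might be coded. Several details are assumptions and are not stated in the brief: Gender is coded 1 for male and 0 for female, the school labels other than BUSINESS are guessed spellings, and the names dummy_code and design_row are hypothetical helpers. The second record is invented to show what a reference-category row looks like.

```python
# Level labels; only BUSINESS is confirmed by the excerpt, the others are assumed spellings
SCHOOL_LEVELS = ["BUSINESS", "HEALTH", "LIBERAL STUDIES", "SCIENCES"]
RANK_LEVELS = ["ASST", "ASSO", "PROF"]


def dummy_code(value, levels, reference, prefix):
    """Return one 0/1 indicator per level, omitting the reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {prefix + level: int(value == level) for level in levels if level != reference}


def design_row(record):
    """Build the Step 4 predictors: Gender, School, Rank and Years of Service."""
    row = {"Gender_M": int(record["Gender"] == "M")}  # assumed coding: 1 = male, 0 = female
    row.update(dummy_code(record["School"], SCHOOL_LEVELS, "HEALTH", "School_"))
    row.update(dummy_code(record["Rank"], RANK_LEVELS, "ASSO", "Rank_"))
    row["Years of Service"] = record["Years of Service"]
    return row


# First row of the data excerpt
first_row = {"Age": 49, "Years of Service": 22, "Rank": "ASST",
             "School": "BUSINESS", "Gender": "F", "Salary": 106632}
# Hypothetical record in both reference categories (not from the dataset)
reference_row = {"Age": 45, "Years of Service": 10, "Rank": "ASSO",
                 "School": "HEALTH", "Gender": "M", "Salary": 100000}

print(design_row(first_row))
print(design_row(reference_row))
```

With this coding, each School or Rank coefficient is read as a difference from the reference category (Health or ASSO), holding the other variables in the model constant.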

5. Based on the regression output obtained in Step 4, answer the following:

(a) Which summary measure in the regression output is used to assess the overall adequacy of the model? Comment on the overall adequacy of the model obtained in Step 4.

(b) For each of the four independent variables, fully interpret the regression coefficients and comment on their statistical significance. (In discussing the statistical significance of a regression coefficient, you have to justify your choice of a one- or two-tailed test.)

6. Based on the correlation between Age and Salary, did you expect Age to have a statistically significant effect on Salary? In Step 5, is the statistical significance of the regression coefficient of Age as expected? Discuss fully.

Task 3 (Summary Report)

Observing the changes to the regression coefficient of Gender and its statistical significance when School, Rank, Years of Service and Age are progressively added to the model in Steps 2 to 5 in Q4 above, discuss in a summary report (word limit 300) if there are any gender pay gaps at this College. In your report, you should integrate all relevant findings from Tasks 1, 2 and 3, and present the final model you would recommend.

In this report, you can include findings from other analyses than Tasks 1, 2 and 3 (for example, Question 2 Tute 1 and the Bonus Question).

Notes:

· Use 1½ spacing and a font size of 11.

· You can, and are encouraged to, include relevant charts and Excel objects in your summary report (Task 3).

· No referencing is required in your summary report. However, if you wish to include, and refer to, additional information, you can use any referencing system as long as it is used consistently.

· There is no word limit for Tasks 1 and 2.

· The word limit of 300 (with a tolerance of 10%) applies only to the summary report, and is exclusive of words in tables, appendices and the reference list (if any).

If you wish to know more about the issue of gender pay gaps in Australian universities, read a special report in The Australian (March 8 2018): "Great progress, but gaps remain" (https://specialreports.theaustralian.com.au/997515/gardner/)

Please note that I have attached the faculty salary data (the research report dataset for this assessment) and the BSB Report CRA, along with the lecture notes that show what we are following for this report.

Attachment:- Faculty Salary.rar
