Factors affecting exam performance in Data Analysis
Background
Data Analysis is designed to build a strong foundation for the development of statistical literacy and statistical thinking. Both are essential for success in business and for further studies in business.
Recent research showed that "On average for 15-year-old Australian students, females achieved at a significantly lower level than male students (in mathematics)" (Buckley 2016, p. 2).
However, there is also research evidence that "Despite the stereotype that boys do better in math and science, girls have made higher grades than boys throughout their school years for nearly a century" (APA 2014).
The Head of School funds a research study to investigate gender differences in, and other factors influencing, exam performance in Data Analysis. Data are collected on the following variables from a sample of 624 students enrolled in the unit in 2015.
• Gender
• Degree Type
• Country of Citizenship
• Lecture Attendance
You are appointed as the research analyst to examine the data and report key findings to the project officer. The dataset is contained in the file:
The project officer sets the following tasks to guide your investigation.
Task 1 (t-tests)
1. Using graphics and statistics, describe the distribution of the final marks. Are there any features that you think are important?
2. In 2014 the average final exam mark was 27.3 out of 60. Test at the 1% significance level whether the average exam mark in 2015 has decreased. (Include all six steps of hypothesis testing.)
3. To investigate gender differences in exam performance, conduct separate two-sample t-tests. (Conduct your tests at α = 5%.)
(a) Considering the whole sample of students, test if there is a difference in the average exam performance between female and male students.
(b) Considering single degree students only, test if there is any gender difference in exam performance.
(c) Considering double degree students only, test if there is any gender difference in exam performance.
(d) Present your findings in parts (a), (b) and (c) using appropriate graph(s), and discuss any important observation(s).
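As a rough illustration of the calculations behind Questions 2 and 3, the sketch below computes a one-sample lower-tailed t statistic and a two-sample t statistic using only the Python standard library. Several things here are assumptions rather than part of the brief: the marks are made up, the two-sample test is the unequal-variance (Welch) version, and the p-values use a normal approximation to the t distribution, which is reasonable for a sample of 624 but too rough for the tiny toy sample shown.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and an approximate lower-tail p-value (H1: mean < mu0)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    # Normal approximation to the t distribution (assumption: large n).
    return t, NormalDist().cdf(t)


def welch_t(x, y):
    """Return the Welch t statistic, its degrees of freedom and an approximate two-sided p-value."""
    vx = stdev(x) ** 2 / len(x)  # squared standard error of mean(x)
    vy = stdev(y) ** 2 / len(y)  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    p = 2 * (1 - NormalDist().cdf(abs(t)))  # normal approximation again
    return t, df, p


# Made-up marks out of 60, for illustration only (NOT the assignment data).
female = [31, 24, 28, 35, 22, 27, 30, 26]
male = [25, 29, 21, 33, 24, 20, 28, 23]

t1, p1 = one_sample_t(female + male, mu0=27.3)  # 27.3 is the 2014 average
t2, df2, p2 = welch_t(female, male)

print(f"one-sample: t = {t1:.3f}, approx p = {p1:.3f}")
print(f"two-sample: t = {t2:.3f}, df = {df2:.1f}, approx p = {p2:.3f}")
```

For parts (b) and (c), the same welch_t call would be applied after filtering the sample to single or double degree students. Whether the unit expects the pooled-variance or the unequal-variance test is not stated in the brief.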
Task 2 (Regression)
You plan to develop regression models to investigate the factors that influence students' final exam performance. However, you are not sure how to handle the country of citizenship variable.
After further discussion, and based on international standardised tests, you and your project officer believe that East Asian students tend to have a stronger maths background. You decide to recode this variable into three categories: Australian, East Asian and Others.
You also agree that East Asian students should include those from China, Japan, Korea, Taiwan and South East Asian countries, and that students from New Zealand are not classified as Australian.
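The recoding rule described above could be sketched as a simple lookup. The country spellings, the function name recode_citizenship and the list of South East Asian countries are all assumptions made for illustration; the labels in the actual dataset may differ and should be checked before recoding.

```python
# Hypothetical country labels; the spellings in the real dataset may differ.
EAST_ASIAN = {
    "China", "Japan", "Korea", "Taiwan",
    # South East Asian countries (an assumed list: ASEAN members plus Timor-Leste).
    "Brunei", "Cambodia", "Indonesia", "Laos", "Malaysia", "Myanmar",
    "Philippines", "Singapore", "Thailand", "Timor-Leste", "Vietnam",
}


def recode_citizenship(country):
    """Map a country of citizenship to Australian, East Asian or Others."""
    name = country.strip()
    if name == "Australia":
        return "Australian"
    if name in EAST_ASIAN:
        return "East Asian"
    # Everything else, including New Zealand, falls into Others.
    return "Others"


for c in ["Australia", "New Zealand", "China", "Malaysia", "India"]:
    print(c, "->", recode_citizenship(c))
```

Tabulating the recoded variable afterwards is a quick way to spot unexpected spellings that have fallen into Others by mistake.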
You conduct a stepwise regression according to the following procedure:
Step 1: Gender only
Step 2: Gender and Degree Type
Step 3: Gender, Degree Type and Country of Citizenship
Step 4: Gender, Degree Type, Country of Citizenship and Lecture Attendance
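The four steps above can be sketched as a sequence of nested least-squares fits. This is a minimal standard-library illustration under several assumptions, not the method required by the unit: the data are made up, the reference categories (male, single degree, Australian) are an arbitrary choice, Lecture Attendance is treated as a numeric count, and the names design_row, ols and solve are hypothetical helpers. It reports only coefficients and R-squared; standard errors and p-values would come from statistical software.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Return (coefficients, R-squared); X must already contain an intercept column."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = solve(xtx, xty)  # normal equations: (X'X) b = X'y
    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return b, 1 - ss_res / ss_tot


def design_row(s, step):
    """Dummy-code one student for the given step (reference: male, single, Australian)."""
    row = [1.0, 1.0 if s["gender"] == "F" else 0.0]
    if step >= 2:
        row.append(1.0 if s["degree"] == "Double" else 0.0)
    if step >= 3:
        row += [1.0 if s["region"] == "East Asian" else 0.0,
                1.0 if s["region"] == "Others" else 0.0]
    if step >= 4:
        row.append(float(s["attendance"]))
    return row


# Made-up records, for illustration only (NOT the assignment data).
keys = ("gender", "degree", "region", "attendance", "mark")
students = [dict(zip(keys, t)) for t in [
    ("F", "Single", "Australian", 10, 30), ("M", "Single", "Australian", 4, 22),
    ("F", "Double", "East Asian", 12, 41), ("M", "Double", "East Asian", 8, 35),
    ("F", "Single", "Others", 6, 25), ("M", "Double", "Others", 11, 33),
    ("F", "Double", "Australian", 9, 36), ("M", "Single", "East Asian", 3, 28),
    ("F", "Single", "East Asian", 7, 31), ("M", "Double", "Australian", 12, 38),
]]

y = [s["mark"] for s in students]
results = {}
for step in (1, 2, 3, 4):
    X = [design_row(s, step) for s in students]
    results[step] = ols(X, y)
    coefs, r2 = results[step]
    print(f"Step {step}: R^2 = {r2:.3f}, coefficients = {[round(c, 2) for c in coefs]}")
```

Because each step adds predictors to the previous model, R-squared cannot decrease from one step to the next, so adjusted R-squared and the significance of the added terms are the more informative comparisons.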
(a) Present the regression output for each of the four steps in a table. You should outline the regression model, the coefficient values and a summary of the adequacy of the model. (4 marks)
(b) Based on the regression output obtained in Step 4, answer the following questions:
• Comment on the overall adequacy of the model.
• For each of the four independent variables, interpret the regression coefficients and comment on their statistical significance.
(c) Describe the changes to the regression coefficient of Gender and its statistical significance when Degree Type is added to the model in Step 2. Considering also the results of Question 3 in Task 1, discuss whether there is any gender difference in exam performance.
Task 3 (Further Analysis)
You discuss the findings obtained from Tasks 1 and 2 with the project officer, suspecting that the effects of Gender, Country of Citizenship and Lecture Attendance on exam performance may depend on (or interact with) Degree Type.
The project officer suggests that you conduct separate regression analyses for single and double degree students. She also points out that appropriate use of Pivot Tables to analyse the data can be very insightful.
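A pivot table of this kind amounts to grouping the records by two variables and summarising the marks in each cell. The sketch below shows the idea with the Python standard library; the records, the field names and the pivot_mean helper are made up for illustration and are not part of the assignment data.

```python
from collections import defaultdict
from statistics import mean


def pivot_mean(rows, row_key, col_key, value_key):
    """Return {(row label, column label): mean of value_key} for each cell."""
    cells = defaultdict(list)
    for r in rows:
        cells[(r[row_key], r[col_key])].append(r[value_key])
    return {cell: mean(values) for cell, values in cells.items()}


# Made-up records, for illustration only (NOT the assignment data).
rows = [
    {"degree": "Single", "gender": "F", "mark": 30},
    {"degree": "Single", "gender": "F", "mark": 26},
    {"degree": "Single", "gender": "M", "mark": 22},
    {"degree": "Double", "gender": "F", "mark": 36},
    {"degree": "Double", "gender": "M", "mark": 38},
    {"degree": "Double", "gender": "M", "mark": 34},
]

table = pivot_mean(rows, "degree", "gender", "mark")
for (degree, gender), avg in sorted(table.items()):
    print(f"{degree:6} {gender}: mean mark = {avg}")
```

Comparing the cell means across degree types is one way to see whether the gender gap differs between single and double degree students, which is the interaction the project officer suspects.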
You are asked to present the findings from your further analysis in a brief report (word limit 500). In this report, you can incorporate relevant findings from Tasks 1 and 2 in your analysis of the factors influencing exam performance in Data Analysis.