Assignment:
Task: You work as a consultant for a firm that specializes in workplace health. You have recently been put on a team that has been hired by a company with 80 employees, half men and half women, that is considering instituting an employee wellness program because it has specific concerns about obesity and cholesterol levels among its employees. The company has provided you with employee health data.
1. The data from your sample of 80 employees can be found on the tab labeled "Health Exam Results Data." The fasting total cholesterol measurement (mg/dL) is recorded as "cholesterol". Levels below 200 mg/dL are considered normal, levels between 200 and 240 mg/dL (inclusive) are considered above normal/moderate risk, and levels above 240 mg/dL are considered high risk. In addition to this variable, there are four more variables in this data set: "sex", which can have a value of either "female" or "male"; "age (years)"; "height (inches)"; and "weight (lbs)". The first step in this analysis is to generate some descriptive measures.
For each of the following points, create the chart and/or graph that best displays the data:
a) Show the breakdown of your sample by gender
b) Show the distribution of cholesterol across all participants.
c) Show the distribution of weight across all participants
Additionally, you want to generate some tables of summary statistics.
d) Create one table that calculates summary measures of cholesterol and weight across all 80 participants.
e) Create a second table that calculates summary measures of cholesterol and weight broken out by gender.
Based on the graphs and tables created in parts a-e:
f) What preliminary conclusions can you draw regarding the differences in cholesterol between male and female employees of this company?
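For parts d) and e), one possible way to build the summary tables is sketched below in Python, using only the standard library. The six rows are the first three female and first three male records from the data table; the helper names `summarize` and `summarize_by_sex` are illustrative assumptions, not part of the assignment.

```python
from statistics import mean, median, stdev

# First three female and first three male rows of the data table:
# (sex, weight in lbs, cholesterol in mg/dL). The full sheet has 80 rows.
rows = [
    ("Female", 114.8, 264),
    ("Female", 149.3, 181),
    ("Female", 107.8, 267),
    ("Male", 169.1, 522),
    ("Male", 144.2, 127),
    ("Male", 179.3, 740),
]

WEIGHT, CHOLESTEROL = 1, 2  # column positions inside each row tuple


def summarize(values):
    """Return basic summary measures for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def summarize_by_sex(rows, column):
    """Group one column by sex and summarize each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row[column])
    return {sex: summarize(vals) for sex, vals in groups.items()}


# Part d): all participants together. Part e): broken out by sex.
overall = summarize([row[CHOLESTEROL] for row in rows])
by_sex = summarize_by_sex(rows, CHOLESTEROL)
for sex, stats in by_sex.items():
    print(sex, stats)
```

Passing `WEIGHT` instead of `CHOLESTEROL` produces the same tables for weight.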
2. A healthy cholesterol level is below 200 mg/dL. The company wants no more than 25% of its employees to have cholesterol levels that exceed 200 mg/dL.
a. Construct a 95% confidence interval for the proportion of employees who have cholesterol levels higher than 200.
b. Interpret your findings with respect to the company's 25% target.
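As a rough illustration of part 2, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion and compares it with the 25% target. The count of 40 is a placeholder, not the actual number of employees above 200 in the data; replace it with the count obtained from the sheet. The function names are hypothetical, and other interval methods (for example Wilson) would also be defensible.

```python
import math


def count_above(readings, threshold=200):
    """Count readings strictly greater than the threshold."""
    return sum(1 for r in readings if r > threshold)


def proportion_ci(successes, n, z=1.96):
    """Wald confidence interval for a proportion (z=1.96 gives about 95%)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


# Placeholder count (assumption): substitute count_above(all_80_readings).
placeholder_count = 40
n_employees = 80
target = 0.25

low, high = proportion_ci(placeholder_count, n_employees)
print(f"95% CI: ({low:.3f}, {high:.3f})")

# If the whole interval lies above the target, the data are inconsistent
# with the company meeting its 25% goal.
print("Entire interval above target:", low > target)
```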
3. One manager has conjectured that the men have bad dietary habits and probably have high cholesterol readings, which should be the focus of the wellness program. We would like to test the hypothesis that male cholesterol readings are different from female cholesterol readings.
a) Conduct an ANOVA to evaluate whether or not there is a significant difference in cholesterol readings between males and females.
b) Summarize and interpret the results of your test.
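The arithmetic behind a one-way ANOVA with two groups can be sketched as follows. This is a minimal illustration using the first five female and first five male cholesterol readings from the data table; the function name `one_way_anova_f` is an assumption. It returns only the F statistic and degrees of freedom; the p-value would come from the F distribution (for example via a statistics package such as SciPy's `scipy.stats.f_oneway`).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual readings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# First five readings per sex from the table; the full analysis uses 40 each.
female = [264, 181, 267, 384, 98]
male = [522, 127, 740, 49, 230]

f_stat, df_b, df_w = one_way_anova_f([female, male])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With only two groups, this F statistic equals the square of the pooled two-sample t statistic, so the ANOVA and the equal-variance t-test lead to the same conclusion.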
4. One manager suggests that the company may be able to check on the progress of the wellness campaign by observing changes in the body mass index (BMI) instead of redoing the more costly cholesterol reading. The body mass index is defined as the ratio of the weight to the square of the height, multiplied by 703 if the height is measured in inches and the weight is measured in pounds: BMI = 703*Weight/(Height^2). There is also some discussion that changes in BMI may be more effective in reducing male cholesterol readings than female cholesterol readings.
a) Create the BMI variable and a dummy variable for sex, called Male, which is zero for females and 1 for males.
b) Estimate a multiple regression model that includes the sex dummy, BMI and an interaction variable between BMI and sex, as independent variables.
c) Calculate predicted values for cholesterol readings for both men and women at BMI values of 25 and 30.
d) Summarize and interpret the results of this model. Is BMI relevant for the program? What do you tell the management team about the relative importance of the BMI for men and women?
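Steps a) through c) can be sketched in plain Python as below. This is an illustrative sketch, not a prescribed method: it fits Cholesterol = b0 + b1*Male + b2*BMI + b3*(Male*BMI) by ordinary least squares via the normal equations, using only the first four female and first four male rows of the table. The helper names (`solve`, `fit_ols`, `design_row`, `predict`) are assumptions; in practice a regression tool that also reports standard errors and p-values is needed for part d).

```python
def bmi(weight_lbs, height_in):
    """BMI = 703 * weight / height^2 (pounds and inches)."""
    return 703 * weight_lbs / height_in ** 2


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(male, bmi_value):
    """Columns: intercept, Male dummy, BMI, Male x BMI interaction."""
    return [1.0, male, bmi_value, male * bmi_value]


def predict(coefs, male, bmi_value):
    return sum(c * x for c, x in zip(coefs, design_row(male, bmi_value)))


# (sex, height in, weight lbs, cholesterol): first four rows per sex.
rows = [
    ("Female", 64.3, 114.8, 264), ("Female", 66.4, 149.3, 181),
    ("Female", 62.3, 107.8, 267), ("Female", 62.3, 160.1, 384),
    ("Male", 70.8, 169.1, 522), ("Male", 66.2, 144.2, 127),
    ("Male", 71.7, 179.3, 740), ("Male", 68.7, 175.8, 49),
]

X = [design_row(1 if s == "Male" else 0, bmi(w, h)) for s, h, w, _ in rows]
y = [chol for _, _, _, chol in rows]
coefs = fit_ols(X, y)

for male in (0, 1):
    for b in (25, 30):
        label = "Male" if male else "Female"
        print(label, b, round(predict(coefs, male, b), 1))
```

In this parameterization, b2 is the BMI slope for women and b2 + b3 is the BMI slope for men, so the interaction coefficient b3 is what speaks to the claim that BMI matters more for men.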
5. Shortly after you publish your findings in a report, you receive a call from a small manufacturing company in Mississippi. The company employs 250 workers in a rural area. They too have investigated their employees' cholesterol. However, they are confused because their statistical findings showed nothing significant. What do you tell them and why?
6. A few weeks after you finish your report, your boss knocks on your door. She is very concerned about the data and says "the distribution looks a little off to me. Based on data from the Framingham Heart Study, the 99% Confidence Interval for total cholesterol in the US is approximately 150-350. Single-digit total cholesterol is not compatible with life and total cholesterol over 1000 is usually only seen in genetic disorders. Your data is very troubling. Can you investigate further to see what is going on?"
Tell me (a) what you would do next (and why), and (b) what lessons you might take away from this?
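One way to begin the investigation in question 6 is a simple plausibility screen, sketched below. The thresholds come from the boss's remarks (single-digit readings, readings over 1000, and the approximate 150-350 range); the function names and the idea of flagging rather than deleting values are assumptions made for illustration. The sample list holds a few readings taken from the data table.

```python
def flag_implausible(readings, low=10, high=1000):
    """Return (index, value) pairs for readings below low or above high."""
    return [(i, r) for i, r in enumerate(readings) if r < low or r > high]


def share_outside(readings, low=150, high=350):
    """Fraction of readings falling outside the [low, high] range."""
    outside = sum(1 for r in readings if r < low or r > high)
    return outside / len(readings)


# A few cholesterol readings taken from the data table.
sample = [264, 181, 8, 920, 2, 1252, 172]

# Flagged values should be traced back to the source records (units,
# data entry, lab method) before deciding whether to correct or exclude.
print(flag_implausible(sample))
print(round(share_outside(sample), 3))
```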
Sex | Age (years) | Height (inches) | Weight (lbs) | Cholesterol (mg/dL)
Female | 17 | 64.3 | 114.8 | 264
Female | 32 | 66.4 | 149.3 | 181
Female | 25 | 62.3 | 107.8 | 267
Female | 55 | 62.3 | 160.1 | 384
Female | 27 | 59.6 | 127.1 | 98
Female | 29 | 63.6 | 123.1 | 62
Female | 25 | 59.8 | 111.7 | 126
Female | 12 | 63.3 | 156.3 | 89
Female | 41 | 67.9 | 218.8 | 531
Female | 32 | 61.4 | 110.2 | 130
Female | 31 | 66.7 | 188.3 | 175
Female | 19 | 64.8 | 105.4 | 44
Female | 19 | 63.1 | 136.1 | 8
Female | 23 | 66.7 | 182.4 | 112
Female | 40 | 66.8 | 238.4 | 462
Female | 23 | 64.7 | 108.8 | 62
Female | 27 | 65.1 | 119 | 98
Female | 45 | 61.9 | 161.9 | 447
Female | 41 | 64.3 | 174.1 | 125
Female | 56 | 63.4 | 181.2 | 318
Female | 22 | 60.7 | 124.3 | 325
Female | 57 | 63.4 | 255.9 | 600
Female | 24 | 62.6 | 106.7 | 237
Female | 37 | 60.6 | 149.9 | 173
Female | 59 | 63.5 | 163.1 | 309
Female | 40 | 58.6 | 94.3 | 94
Female | 45 | 60.2 | 159.7 | 280
Female | 52 | 67.6 | 162.8 | 254
Female | 31 | 63.4 | 130 | 123
Female | 32 | 64.1 | 179.9 | 596
Female | 23 | 62.7 | 147.8 | 301
Female | 23 | 61.3 | 112.9 | 223
Female | 47 | 58.2 | 195.6 | 293
Female | 36 | 63.2 | 124.2 | 146
Female | 34 | 60.5 | 135 | 149
Female | 37 | 65 | 141.4 | 149
Female | 18 | 61.8 | 123.9 | 920
Female | 29 | 68 | 135.5 | 271
Female | 48 | 67 | 130.4 | 207
Female | 16 | 57 | 100.7 | 2
Male | 58 | 70.8 | 169.1 | 522
Male | 22 | 66.2 | 144.2 | 127
Male | 32 | 71.7 | 179.3 | 740
Male | 31 | 68.7 | 175.8 | 49
Male | 28 | 67.6 | 152.6 | 230
Male | 46 | 69.2 | 166.8 | 316
Male | 41 | 66.5 | 135 | 590
Male | 56 | 67.2 | 201.5 | 466
Male | 20 | 68.3 | 175.2 | 121
Male | 54 | 65.6 | 139 | 578
Male | 17 | 63 | 156.3 | 78
Male | 73 | 68.3 | 186.6 | 265
Male | 52 | 73.1 | 191.1 | 250
Male | 25 | 67.6 | 151.3 | 265
Male | 29 | 68 | 209.4 | 273
Male | 17 | 71 | 237.1 | 272
Male | 41 | 61.3 | 176.7 | 972
Male | 52 | 76.2 | 220.6 | 75
Male | 32 | 66.3 | 166.1 | 138
Male | 20 | 69.7 | 137.4 | 139
Male | 20 | 65.4 | 164.2 | 638
Male | 29 | 70 | 162.4 | 613
Male | 18 | 62.9 | 151.8 | 762
Male | 26 | 68.5 | 144.1 | 303
Male | 33 | 68.3 | 204.6 | 690
Male | 55 | 69.4 | 193.8 | 31
Male | 53 | 69.2 | 172.9 | 189
Male | 28 | 68 | 161.9 | 957
Male | 28 | 71.9 | 174.8 | 339
Male | 37 | 66.1 | 169.8 | 416
Male | 40 | 72.4 | 213.3 | 120
Male | 33 | 73 | 198 | 702
Male | 26 | 68 | 173.3 | 1252
Male | 53 | 68.7 | 214.5 | 288
Male | 36 | 70.3 | 137.1 | 176
Male | 34 | 63.7 | 119.5 | 277
Male | 42 | 71.1 | 189.1 | 649
Male | 18 | 65.6 | 164.7 | 113
Male | 44 | 68.3 | 170.1 | 656
Male | 20 | 66.3 | 151 | 172