1. Waist-to-hip ratio was measured on 8 men just before they entered a weight loss program (time 1) and again 6 months after the program (time 2). The results are below.
Subject   1     2     3     4     5     6     7     8
Time 1    1.03  0.99  1.18  0.82  1.02  1.05  1.06  0.82
Time 2    1.00  1.02  1.12  0.78  1.06  1.00  1.08  0.76
data one;
input WHR1 WHR2;
diff = WHR1 - WHR2; /* time 1 value minus time 2 value */
datalines;
1.03 1.00
0.99 1.02
1.18 1.12
0.82 0.78
1.02 1.06
1.05 1.00
1.06 1.08
0.82 0.76
;
run;
a. Calculate the difference for each subject (as time 1 value minus time 2 value), then calculate the mean and standard deviation of these differences. You can do this however you like.
b. Using the information from part (a), "by hand", calculate a 90% confidence interval for the difference of the means for the two time periods (i.e., for the mean difference).
c. We want to use a paired t-test, at α = 0.05, to test whether the mean waist-to-hip ratio is smaller at time 2 than at time 1 (i.e., to test whether the program is effective). Use PROC TTEST to obtain the test statistic and p-value. Turn in the printout.
d. A colleague looks at the printout and decides that we have proven that the program is not effective. Do you agree with his conclusion? Why or why not?
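For parts (a) and (c), one possible SAS sketch is shown here. It assumes the data set one created above. PROC MEANS is only one of several ways to summarize diff, and the SIDES=U option reflects the assumption that the alternative is written as "mean of (WHR1 - WHR2) is greater than 0".

```sas
/* Part (a): mean and standard deviation of the differences */
proc means data=one n mean std;
  var diff;
run;

/* Part (c): paired t-test. PAIRED WHR1*WHR2 analyzes WHR1 - WHR2, */
/* and SIDES=U requests the upper-tailed test (mean difference > 0). */
proc ttest data=one sides=u alpha=0.05;
  paired WHR1*WHR2;
run;
```

Because SIDES=U also makes the reported confidence limits one-sided, the interval in part (b) should still be worked from the mean and standard deviation in part (a).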
2. The number of heartbeats in a 5-minute period is recorded for 25 women not using a certain drug and for 16 women using this drug.
No drug: 293 297 300 302 306 311 312 314 318 320 322 323 324
324 326 327 331 333 335 339 342 345 345 351 356
Drug: 330 331 335 337 339 343 346 349
350 358 364 369 375 387 398 403
data two;
input drug $ beats @@;
datalines;
No 293 No 297 No 300 No 302 No 306 No 311 No 312 No 314
No 318 No 320 No 322 No 323 No 324 No 324 No 326 No 327
No 331 No 333 No 335 No 339 No 342 No 345 No 345 No 351 No 356
Yes 330 Yes 331 Yes 335 Yes 337 Yes 339 Yes 343 Yes 346 Yes 349 Yes 350 Yes 358 Yes 364 Yes 369 Yes 375 Yes 387 Yes 398 Yes 403
;
run;
We want to test, at α = 0.05, whether there is a difference in the mean number of beats for the two groups.
a. Write the null and alternative hypotheses.
b. Analyze the data using PROC TTEST, and turn in the printout. From this, you will first have to test if the population variances are equal. Give the p-value of this test, then briefly state which t-test (i.e., 'Equal' or 'Unequal') is appropriate to compare the means.
c. State the test statistic and p-value of the appropriate t-test, and interpret the result.
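A minimal sketch of the PROC TTEST call for part (b), assuming the data set two created above:

```sas
/* Two-sample t-test comparing mean beats between the two drug groups */
proc ttest data=two alpha=0.05;
  class drug;   /* grouping variable with values No and Yes */
  var beats;    /* response variable */
run;
```

In recent SAS releases the printout includes an "Equality of Variances" table (the folded F test) along with t-test rows labeled Pooled (Equal variances) and Satterthwaite (Unequal variances). The exact labels may differ by version.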
3. Use the REACTIONTIME data set. We want to see if the mean reaction time is faster for males than for females.
a. Using α = 0.05, go through the same steps as in problem #2. Be sure to make the necessary adjustment for this one-sided test.
b. Using the sample sizes and standard deviations from the PROC TTEST printout, calculate the pooled estimate of variance.
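For reference, the usual pooled variance formula for two independent samples is shown here. The symbols are assumptions, not notation from the assignment: n_1, n_2 are the group sample sizes, s_1, s_2 the group standard deviations, and s_p^2 the pooled estimate.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
```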
4. Again use the REACTIONTIME data. We want to test, at α = 0.05, whether the average reaction time is the same for the three types of stimuli.
a. In words (not symbols), write the null and alternative hypotheses.
b. Use PROC ANOVA to analyze these data, and include a corresponding Tukey multiple comparison procedure (use the LINES option). Turn in the printout.
c. State the p-value of the F-test, and state what we know only from this test.
d. Summarize the results of the Tukey multiple comparison procedure.
e. State MSERROR, and say what we can estimate with this. Be specific to this situation.
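One way the analysis in part (b) might be set up is sketched here. The variable names time and stimulus are hypothetical placeholders (the assignment does not list the variables in REACTIONTIME), and the data set is assumed to be available in the WORK library.

```sas
/* Assumed names: time = reaction time, stimulus = stimulus type (both hypothetical) */
proc anova data=reactiontime;
  class stimulus;
  model time = stimulus;                    /* one-way ANOVA F-test */
  means stimulus / tukey lines alpha=0.05;  /* Tukey comparisons, LINES display */
run;
quit;  /* PROC ANOVA is interactive, so QUIT ends the procedure */
```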
5. Vocabulary size is measured for eleven 4-year-old children. The results are below. We want to test, at α = 0.05, whether the population median differs from 450.
149 279 339 388 418 421 441 460 474 487 492
a. Write the null and alternative hypotheses.
b. Perform a sign test, "by hand". Calculate the test statistic, find the p-value (using the Binomial distribution), and interpret the result of the test.
c. Calculate the test statistic for the Wilcoxon signed rank test, "by hand". You do not have to compare this to a rejection region, and you do not have to interpret the result (yet).
d. Run PROC UNIVARIATE to carry out both tests above, and turn in the printout. Give the p-value of the Wilcoxon signed rank test, and now interpret the result.
data five;
input words @@;
datalines;
149 279 339 388 418 421 441 460 474 487 492
;
run;
e. Demonstrate how the test statistics you calculated in parts (b) and (c) relate to the statistics M and S you get from SAS.
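A minimal sketch of the PROC UNIVARIATE call for part (d), assuming the data set five created above. The MU0= option sets the hypothesized location to 450.

```sas
/* Tests for location against the hypothesized value 450 */
proc univariate data=five mu0=450;
  var words;
run;
```

The statistics M (sign test) and S (signed rank test) appear in the "Tests for Location" table of the printout, next to their p-values.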
6. In problem #4 of homework #5, we had the velocities of fastballs thrown by 10 college baseball pitchers. These data were collected in 2012, and the observations are below.
81.9 77.7 80.3 83.1 89.5 85.4 87.9 88.9 78.1 84.9
Suppose in 1992, the same variable was measured on a sample of 7 college pitchers. The observations are below.
75.6 77.3 82.1 74.1 81.8 76.6 80.3
We want to test, at α = 0.10, whether there is a difference in the median velocity for the two years.
a. Calculate the test statistic of the Wilcoxon rank sum test, "by hand". You do not have to compare this to a rejection region, and you do not have to interpret the result (yet).
b. Use PROC NPAR1WAY (with the EXACT subcommand) to calculate the test statistic. Turn in the printout, state the best p-value to use for the test, and now interpret the result.
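data six;
  /* Each data line holds a year, the number of pitchers (n), then n velocities */
  input year n @@;
  do i = 1 to n;
    input velocity @@;  /* read one velocity; @@ holds the line for the next read */
    output;             /* write one observation per pitcher */
  end;
  drop i n;
datalines;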
2012 10 81.9 77.7 80.3 83.1 89.5 85.4 87.9 88.9 78.1 84.9
1992 7 75.6 77.3 82.1 74.1 81.8 76.6 80.3
;
run;
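A minimal sketch of the PROC NPAR1WAY call for part (b), assuming the data set six created above:

```sas
/* Wilcoxon rank sum test comparing velocity between 1992 and 2012 */
proc npar1way data=six wilcoxon;
  class year;
  var velocity;
  exact wilcoxon;  /* request exact p-values in addition to the approximations */
run;
```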
7. Again use the REACTIONTIME data. We want to test, at α = 0.05, whether the median reaction time is the same for the three types of stimuli. However, we now want to limit our analysis to females age 65 and older. Use PROC NPAR1WAY (for this analysis, do not use the EXACT subcommand) to perform a Kruskal-Wallis test. Turn in the printout, and interpret the result of the test. You will find it useful to include a statement like:
where sex = 'female' and age >= 65;
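Put together, the call might look like the sketch here. As before, time and stimulus are hypothetical variable names, and REACTIONTIME is assumed to be in the WORK library. Only sex and age come from the statement above.

```sas
/* Kruskal-Wallis test restricted to females age 65 and older */
proc npar1way data=reactiontime wilcoxon;
  where sex = 'female' and age >= 65;
  class stimulus;  /* hypothetical name for the stimulus-type variable */
  var time;        /* hypothetical name for the reaction-time variable */
run;
```

With more than two CLASS levels, the WILCOXON option reports the Kruskal-Wallis test.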