Problem Set: χ² tests and measures of association
For this module's Problem Set, please complete the problems listed below.
For your write-up, copy and paste the Stata Results into a word-processed document and add your interpretation and conclusion in the places indicated.
For each problem, check the conditions required for the χ² test, using the expected counts under the null hypothesis.
The first problem asks you to calculate the expected counts and the χ² test statistic from the formulas and to find the P value using Stata's chi2tail function or SurfStat.
You may, of course, use Stata's tabi command to check those 'hand' calculations.
For the rest of the problems, use Stata's tabi command to do the calculations.
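As a quick orientation to the two Stata tools named above, here is a minimal sketch. The numbers are illustrative placeholders, not data from this problem set.

```stata
* chi2tail(df, x) returns the upper-tail probability P(X > x)
* for a chi-squared distribution with df degrees of freedom.
* 3.84 is an illustrative test statistic, not a result from this problem set.
display chi2tail(1, 3.84)

* tabi takes the table cell by cell, with rows separated by a backslash.
* The counts 10, 20, 30, 40 are hypothetical.
* "expected" adds expected counts; "chi2" adds the Pearson chi-squared test.
tabi 10 20 \ 30 40, expected chi2
```

The first command should return a value very close to 0.05, because 3.84 is approximately the 0.05 critical value of the χ² distribution with 1 degree of freedom.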
Part 1: tumors in male rats exposed and not exposed to electromagnetic fields
treatment             | 1 or more tumors | No tumors | total
control: no exposure  | 16               | 83        | 99
2 gauss exposure      | 30               | 70        | 100
1. State the null and alternative hypotheses for the χ² test.
2. Find the expected counts under the null hypothesis.
3. Use the expected counts to check the assumptions needed to use the χ² distribution to find the P value.
4. Calculate the test statistic from the observed and expected counts.
5. Use Stata's chi2tail function or SurfStat to find the P value.
6. State the conclusion in words.
7. Show that the χ² test statistic you've calculated is equal to the square of the z test statistic you calculated in part 3, question 4 of problem set 7.
Calculation from part 3, question 4:
The null hypothesis is that the population mean number of migraine headaches suffered over a three-week period is at least as large for people who have had eight weeks of acupuncture therapy as for people who have had no treatment. Because we would like the data to be able to establish that acupuncture therapy reduces the number of migraines suffered, if it does, the alternative hypothesis is the one-sided hypothesis that people receiving acupuncture therapy suffer fewer migraines than people who are not treated. For the sample mean difference of 1.2 fewer migraines suffered by people who received acupuncture therapy, the t statistic was 2.31, associated with a P value of 0.01 for the one-sided alternative hypothesis of interest. Thus, the sample provides evidence that is stronger than rejection of the null at the 0.05 level of significance, and we conclude that people who use acupuncture therapy suffer statistically significantly fewer migraine headaches on average over a three-week period. The 95% confidence interval for the population difference in the mean number of headaches, untreated people minus acupuncture patients, is [0.169, 2.231], which implies that we can be 95% confident that the difference in population means is in that interval.
Here is the Stata command for the χ² test: tabi 16 83 \ 30 70, row exp chi2
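The 'hand' calculation in questions 2 through 5 and the comparison in question 7 could be laid out in Stata along the following lines. This is only a sketch: the scalar names (o11, e11, X2, and so on) are made up for illustration, and the z statistic is assumed to be the pooled two-sample z for a difference in proportions, which may not match the form used in problem set 7.

```stata
* Observed counts from the Part 1 table
* (row 1 = control, row 2 = 2 gauss; column 1 = tumors, column 2 = no tumors)
scalar o11 = 16
scalar o12 = 83
scalar o21 = 30
scalar o22 = 70

* Row totals, column totals, and grand total
scalar r1 = o11 + o12
scalar r2 = o21 + o22
scalar c1 = o11 + o21
scalar c2 = o12 + o22
scalar n  = r1 + r2

* Expected counts under the null: (row total * column total) / grand total
scalar e11 = r1*c1/n
scalar e12 = r1*c2/n
scalar e21 = r2*c1/n
scalar e22 = r2*c2/n
display "expected counts: " e11 "  " e12 "  " e21 "  " e22

* Pearson chi-squared statistic: sum of (observed - expected)^2 / expected
scalar X2 = (o11-e11)^2/e11 + (o12-e12)^2/e12 + (o21-e21)^2/e21 + (o22-e22)^2/e22
display "chi-squared statistic = " X2

* P value from the chi-squared distribution with (2-1)*(2-1) = 1 degree of freedom
display "P value = " chi2tail(1, X2)

* Assumed form of the z statistic: pooled two-sample z for proportions
scalar p1 = o11/r1
scalar p2 = o21/r2
scalar pp = c1/n
scalar z  = (p1 - p2)/sqrt(pp*(1-pp)*(1/r1 + 1/r2))
display "z = " z
display "z squared = " z^2
```

If the pooled form of z is the right one, the last line should agree with the χ² statistic up to rounding, and the tabi command above should reproduce the expected counts and the statistic.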
Part 2: intensive treatment for type 1 diabetes: the DCCT study
The data:
The study monitored 378 patients in the conventional care group and 348 in the intensive care group for 6 years. At the end of the study, 91 patients in the conventional treatment group had developed retinopathy, compared with only 23 in the intensive treatment group.
About 24% of patients receiving conventional treatment, but only about 7% of patients receiving the intensive treatment, developed retinopathy during the years of the study.
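The two percentages quoted above can be checked directly from the counts with Stata's display command. This is a small sketch using only the numbers given in the text.

```stata
* Percentage developing retinopathy, conventional treatment group (91 of 378)
display %5.1f 100*91/378

* Percentage developing retinopathy, intensive treatment group (23 of 348)
display %5.1f 100*23/348
```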
Data: Patients who already had retinopathy at the beginning of the study:
                  | progression | no progression | total
conventional tmt  | 143         |                | 352
intensive tmt     | 77          |                | 363
1. Fill in the missing values in the table above.
2. Calculate the proportions of patients who progressed in each group.
3. Use the conditional probabilities you calculated in question 2 to estimate the risk difference and the relative risk, considering the conventional treatment as the "exposure."
4. State the null and alternative hypotheses for the χ² test in terms of the risk difference and the relative risk.
5. Create an appropriate Stata tabi command to get expected counts to check the assumptions, and explain why the χ² distribution and the normal approximation may be used to find the P value for the tests.
Use Stata's chi2tail function to find the P value.
6. Interpret the χ² test: State the P value from the Stata output and state the conclusion in the context of this problem, using either the relative risk or the risk difference. (Use α = 0.05.)
7. Explain why the normal approximation may be used to find confidence intervals for the risk difference.
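For questions 1 and 5, one possible layout is sketched below. It assumes rows are the treatment groups and columns are progression and no progression. The "no progression" counts come from subtracting the progression counts from the row totals in the table. Reading the statistic back from r(chi2) assumes tabi leaves the same stored results as tabulate with the chi2 option.

```stata
* Missing "no progression" counts: row total minus progression count
display 352 - 143
display 363 - 77

* Rows: conventional, intensive; columns: progression, no progression
* "row" adds row percentages, "expected" adds expected counts
tabi 143 209 \ 77 286, row expected chi2

* P value from the chi-squared distribution with 1 degree of freedom,
* using the statistic stored by the tabi command just run
display chi2tail(1, r(chi2))
```

The chi2tail result should match the P value printed in the tabi output.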
Here is the form of the csi command to use for this problem, with the conventional treatment as "exposed" and the retinopathy progression as "cases":
csi 143 77 209 286
Paste your Stata Results here:
In the output, you should recognize P(progression | conventional tmt), P(progression | intensive tmt), and their difference.
8. Write a short summary for the confidence interval for the risk difference.
9. Write a short summary for the confidence interval for the relative risk, which Stata calls the "risk ratio."
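To see where the csi estimates come from, the risk difference, the relative risk, and a normal-approximation confidence interval for the risk difference can be sketched by hand as follows. The scalar names are made up for illustration, and the interval uses the usual unpooled standard error, which is assumed rather than confirmed to be the formula csi uses, so small differences from the csi output are possible.

```stata
* Group sizes and progression proportions from the Part 2 table
scalar n1 = 352
scalar n2 = 363
scalar p1 = 143/n1
scalar p2 = 77/n2

* Risk difference and relative risk, conventional treatment as the "exposure"
scalar rd = p1 - p2
scalar rr = p1/p2
display "risk difference = " rd
display "relative risk   = " rr

* Normal-approximation 95% confidence interval for the risk difference,
* using the unpooled standard error (an assumption of this sketch)
scalar se = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)
scalar zc = invnormal(0.975)
scalar lo = rd - zc*se
scalar hi = rd + zc*se
display "95% CI for the risk difference: " lo " to " hi
```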