Hypothesis testing is one of the mainstays of medical research literature. You would be hard-pressed to find a research article in a clinical journal that doesn't contain the results of at least one hypothesis test. So it's important to think critically about hypothesis tests and what they mean--what they can and can't tell us about our data. I've presented three situations below. For each one, write up a brief description of the problem with the conclusion drawn from the hypothesis test.
1. In an observational study of the effect of a physician quality reporting system, the mean age of the patients of physicians participating in the program is 72.675 years, while the mean age of the patients of physicians not participating in the program is 72.950 years. The total sample size is nearly 11,000 patients. The p-value for the hypothesis test that the mean ages are different is 0.001 (Null hypothesis: the difference in mean ages of the two groups is zero; alternative hypothesis: the difference in means is greater than or less than zero). The analyst concludes that the patients of physicians not participating in the program are older than the patients of participating physicians and therefore not a good comparison group for an observational study.
2. A 1999 study looked at the effect of exposure to perchlorate on rats' thyroids. Thirty rats were exposed to perchlorate, and their thyroids were compared to those of 30 rats not exposed to perchlorate. Two of the rats exposed to perchlorate were found to have a rare type of thyroid tumor. None of the rats not exposed to perchlorate had tumors. The experimenters performed a hypothesis test for the difference in the number of tumors and got a p-value of 0.48. They concluded that perchlorate exposure does not cause thyroid tumors in rats.
3. Three psychiatrists studied a sample of schizophrenic and non-schizophrenic people. They measured 77 variables for each subject (religion, family background, childhood experiences, etc.). They wanted to determine what factors cause people to later become schizophrenic. Using their data, they ran 77 hypothesis tests of the significance of the differences between the two groups of subjects, and found 2 to be significant at the 2% level. They immediately published their findings.
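
When thinking through the three situations, it can help to put a few of the numbers side by side. The Python sketch below is only a rough aid, not part of the original studies, and it rests on assumptions that the situations do not state: the variable and function names are made up for illustration, the calculation for situation 2 assumes an exact (hypergeometric) test, which may not be the test the experimenters used, and the calculation for situation 3 assumes the 77 tests are independent and that no real differences exist.

```python
from math import comb

# Situation 1: how large is the observed difference in mean age?
mean_age_nonparticipating = 72.950
mean_age_participating = 72.675
age_gap_years = mean_age_nonparticipating - mean_age_participating
age_gap_days = age_gap_years * 365.25  # the same gap expressed in days


# Situation 2: if perchlorate had no effect, how likely is it that all
# of the observed tumors would fall in the exposed group by chance?
# ASSUMPTION: an exact hypergeometric calculation; the test actually
# used in the study is not specified.
def prob_all_tumors_in_exposed(n_exposed, n_control, n_tumors):
    """Chance that every tumor lands in the exposed group under the null."""
    return comb(n_exposed, n_tumors) / comb(n_exposed + n_control, n_tumors)


p_both_in_exposed = prob_all_tumors_in_exposed(30, 30, 2)

# Situation 3: how many "significant" results would chance alone produce?
# ASSUMPTION: the 77 tests are independent and every null is true.
n_tests = 77
alpha = 0.02
expected_false_positives = n_tests * alpha
prob_zero = (1 - alpha) ** n_tests
prob_one = n_tests * alpha * (1 - alpha) ** (n_tests - 1)
prob_two_or_more = 1 - prob_zero - prob_one

print(f"Situation 1: gap = {age_gap_years:.3f} years (~{age_gap_days:.0f} days)")
print(f"Situation 2: P(both tumors in exposed group | no effect) = {p_both_in_exposed:.3f}")
print(f"Situation 3: expected false positives = {expected_false_positives:.2f}")
print(f"Situation 3: P(2 or more false positives) = {prob_two_or_more:.2f}")
```

None of these figures settles the questions on its own, but each gives a sense of scale: how big the age difference is in practical terms, how little information two tumors among 60 rats carry, and how many significant results 77 tests would be expected to produce when nothing is going on.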