One-way analysis of variance (ANOVA) is a method for comparing several population means when the data come from independent samples. It is a test that examines the relationship between a quantitative response variable and a categorical explanatory variable, also called a factor, that has at least three levels or groups.
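For reference, the test statistic behind this comparison can be written as below. The notation is an assumption made here for illustration and may differ from your course materials: $g$ is the number of groups, $n_i$ the number of observations in group $i$, $N$ the total number of observations, $\bar{x}_i$ the mean of group $i$, and $\bar{x}$ the mean of all observations.

```latex
F = \frac{\text{MSG}}{\text{MSE}}
  = \frac{\displaystyle \sum_{i=1}^{g} n_i \left(\bar{x}_i - \bar{x}\right)^2 \Big/ (g - 1)}
         {\displaystyle \sum_{i=1}^{g} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2 \Big/ (N - g)}
```

The numerator measures how much the group means vary around the overall mean, and the denominator measures how much the observations vary within their own groups. A large F indicates that the variation between groups is large relative to the variation within groups.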
1. Determining if ANOVA is Appropriate
In each part below, determine if ANOVA can be used to analyze the scenario. IF ANOVA can be used, write the null and alternative hypotheses.
Hint: Since ANOVA is used only in scenarios with a quantitative response variable and a categorical explanatory variable (or factor) that has at least three levels, you may find it helpful to first identify the response and explanatory variables and determine whether each is categorical or quantitative.
a. A psychologist wants to determine if there is a relationship between weather (sunny, partly cloudy, heavy rain, etc.) and mood (happy, carefree, tired, etc.).
i. Can ANOVA be used? Explain.
IF ANOVA can be used, write the null and alternative hypotheses:
ii. Ho:
iii. Ha:
b. An executive of a national construction company is trying to determine how to be "low bid" on more estimates in order to get more jobs. She compares the cost of lumber in dollars in different regions of the United States, specifically, the North, South, East, and West to generate more accurate estimates.
i. Can ANOVA be used? Explain.
IF ANOVA can be used, write the null and alternative hypotheses:
ii. Ho:
iii. Ha:
c. An HR manager believes that just because employees work long hours does not mean they are productive, agreeing with this article, "Get a Life," from September 24, 2013 in The Economist (https://www.economist.com/blogs/freeexchange/2013/09/working-hours). To test this, the HR manager wants to see if there is a relationship between salary, measured by yearly earnings in dollars, and amount of time worked during the year.
i. Can ANOVA be used? Explain.
IF ANOVA can be used, write the null and alternative hypotheses:
ii. Ho:
iii. Ha:
d. A nutritionist wants to see if grocery buying habits vary based on level of education. The nutritionist compares the average amount spent in dollars on organic food for three different levels of education: high school diploma, college degree, or advanced degree.
i. Can ANOVA be used? Explain.
IF ANOVA can be used, write the null and alternative hypotheses:
ii. Ho:
iii. Ha:
e. Given the recent news of Ebola, the US government is trying to keep everyone living in the US at ease. A government official wants to see if these efforts have been successful. He interviews 100 people of various age groups (18-30, 31-45, 46-60, 61-75, 76+) on how worried they are about contracting Ebola on a scale of 1-5, where 1 is not at all worried and 5 is very worried.
i. Can ANOVA be used? Explain.
IF ANOVA can be used, write the null and alternative hypotheses:
ii. Ho:
iii. Ha:
2. Exploring ANOVA Visually
Below are two dotplots made from random data generated in Minitab. Scenario 1 is on the left and scenario 2 is on the right. The means for data 1, data 2, and data 3 in scenario 1 are all different. Likewise, in scenario 2 the means for data 4, data 5, and data 6 are not equal. Answer the questions below using the dotplots.
a. Is it visually easy to detect a difference in means between data 1, data 2, and data 3 in "scenario 1"? Explain.
b. Is it visually easy to detect a difference in means between data 4, data 5, and data 6 in "scenario 2"? Explain.
c. Why is a difference in means clearer in "scenario 2" than in "scenario 1"?
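One way to explore the idea behind these questions numerically is to compute the F statistic for two made-up data sets whose group means are identical but whose within-group spread differs. The sketch below uses only the Python standard library. The function name `f_statistic` and the data values are illustrative assumptions, not the Minitab data behind the dotplots.

```python
from statistics import mean


def f_statistic(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    g = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean
    ss_between = sum(len(grp) * (mean(grp) - grand_mean) ** 2 for grp in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(grp)) ** 2 for grp in groups for x in grp)

    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (n_total - g)
    return ms_between / ms_within


# Made-up data: in both sets the group means are 10, 12, and 14.
wide = [[4, 10, 16], [6, 12, 18], [8, 14, 20]]       # large spread within groups
narrow = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]   # small spread within groups

f_wide = f_statistic(wide)
f_narrow = f_statistic(narrow)
print(f"F, wide spread:   {f_wide:.2f}")    # 0.33
print(f"F, narrow spread: {f_narrow:.2f}")  # 12.00
```

The group means are the same in both data sets, so the between-group variation is identical. Only the within-group variation changes, and that alone moves F from well below 1 to 12.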
3. One Way ANOVA Using Software
Open the Hot Dog data set. This data set contains the calories and sodium content of 54 brands of hot dogs of three different types (beef, turkey, and veggie). A nutritionist wants to determine if there is a health benefit to eating hot dogs that are made from ingredients different from those in standard hot dogs.
a. Use software to determine the average calorie content for each of the three types of hot dogs.
Minitab Express: Statistics > Descriptive Statistics. Enter the variable calories into the "Variables" box and enter Type into the "Group Variable" box.
Standard _______ Turkey _______ Veggie _________
b. Based on the sample means, how do the three types of hot dogs compare in calorie content? Which of the types would the nutritionist recommend?
c. The nutritionist wants to determine if the three hot dog types differ statistically in terms of calorie content. Can the nutritionist use ANOVA for this scenario? Why or why not?
d. In words, write the null and alternative hypotheses to test the average calorie content for the three different types of hot dogs.
i. Null Hypothesis:
ii. Alternative Hypothesis:
e. Write the null hypothesis given in part d using appropriate statistical notation.
Hint: Remember that ANOVA compares means, and we have three types of hot dogs.
f. Use software to complete a one-way analysis of variance F-test to compare the average calorie content of the three different types of hot dogs. Copy and paste the output.
g. What is the p-value?
h. Based on the p-value, make a statistical conclusion. Do we "fail to reject the null hypothesis" or "reject the null hypothesis"? Explain.
i. By rejecting the null hypothesis, we are saying that at least two of the population means differ. But which one(s)? Using the 95% confidence intervals for the means in the software output, what conclusion can be made?
Hint: The interval plot shown below from Minitab Express may help visualize the 95% confidence intervals.
j. State the real world conclusion.
Hint: Think about what the nutritionist would recommend to his or her client.
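For readers working outside Minitab Express, the descriptive step in part a (summary statistics by group) can be sketched in Python with the standard library alone. The rows below are hypothetical placeholders, NOT values from the Hot Dog data set, and the function name `summarize_by_group` is an assumption made for this illustration. With the real file, the rows would be read from the data set instead.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (type, calories) rows; NOT taken from the Hot Dog data set.
rows = [
    ("Beef", 180), ("Beef", 150), ("Beef", 165),
    ("Turkey", 120), ("Turkey", 135), ("Turkey", 105),
    ("Veggie", 90), ("Veggie", 110), ("Veggie", 100),
]


def summarize_by_group(rows):
    """Return sample size, mean, and standard deviation for each group."""
    grouped = defaultdict(list)
    for group, value in rows:
        grouped[group].append(value)
    return {
        group: {"n": len(values), "mean": mean(values), "sd": stdev(values)}
        for group, values in grouped.items()
    }


summary = summarize_by_group(rows)
for group, stats in summary.items():
    print(f"{group}: n={stats['n']}, mean={stats['mean']:.1f}, sd={stats['sd']:.1f}")
```

The grouping variable plays the same role as the "Group Variable" box in Minitab Express: each observation is assigned to its type before the mean is computed.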
4. One Way ANOVA - Interpreting Software Output
The nutritionist thinks the calorie content of the three types of hot dogs is a good start for determining whether one type offers a health benefit, but calorie content alone does not tell the entire health story. As a result, the nutritionist also wants to analyze the sodium content.
Use the Minitab Express output to answer the following questions.
a. Using the Minitab Express output, write the null hypothesis in statistical notation.
Hint: Remember that the DF for the group variable is g-1, where g is the number of groups!
b. What is the value of the F-statistic?
c. What is the p-value?
d. If you did not have a p-value and had only the Tukey Simultaneous 95% confidence interval (CI) output as shown below, what conclusion would you make and why? Be sure to explain your answer!
e. Using the p-value, state the statistical conclusion. Should we "reject the null hypothesis" or "fail to reject the null hypothesis"? Explain.
f. State the conclusion in "real world" terms.
g. Using the results from questions 3 and 4, what type of hot dog would you eat? Why?
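As a rough illustration of how simultaneous confidence intervals for pairwise differences are usually read, the sketch below flags each pair whose interval excludes zero. The intervals are hypothetical numbers invented for this example, NOT the Minitab Express output, and `excludes_zero` is an illustrative helper name.

```python
# Hypothetical simultaneous 95% CIs for differences in mean sodium content.
# These are NOT the values from the Minitab Express output.
intervals = {
    ("Turkey", "Beef"): (-20.0, 95.0),
    ("Veggie", "Beef"): (-110.0, -5.0),
    ("Veggie", "Turkey"): (-150.0, -40.0),
}


def excludes_zero(interval):
    """Return True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


flags = {pair: excludes_zero(ci) for pair, ci in intervals.items()}
for (first, second), differs in flags.items():
    verdict = "means differ" if differs else "no detectable difference"
    print(f"{first} - {second}: {verdict}")
```

An interval that contains zero is consistent with the two population means being equal, so only pairs whose interval excludes zero are declared different at the stated simultaneous confidence level.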