Question 1: Calculating a Z-score and graphing a box plot
Look at the data under dietarysupp. The table gives the number of American adults who have used the indicated "nonvitamin, nonmineral, natural products".
a) Use Minitab to find the five-number summary for this data and graph the box plot.
b) What are the mean, standard deviation, median, IQR and range of this data? Do the mean and median have to be equal? What is the difference between how the range and the IQR are calculated?
c) Find the Z-score of St John's wort.
d) Use the 1.5*IQR rule to determine whether there are any outliers. Are there any outliers using the Z-score (≥ 3 or ≤ -3)?
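For checking the Minitab output by hand, the calculations in parts (a) to (d) can be sketched in Python. This is only an illustration: the function names are made up for this sketch, the values in `sample` are placeholders and NOT the dietarysupp data, and the quartile method (`statistics.quantiles` with its default "exclusive" method) is assumed to correspond to the (n+1)-position rule, so small differences from other software are possible.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # Default method="exclusive" interpolates at positions i*(n+1)/4.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return data[0], q1, median, q3, data[-1]

def z_score(x, values):
    """Z-score of x relative to the sample mean and sample standard deviation."""
    return (x - statistics.mean(values)) / statistics.stdev(values)

def iqr_outliers(values):
    """Values lying more than 1.5*IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Placeholder values for illustration only (not the dietarysupp data).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

print(five_number_summary(sample))
print(iqr_outliers(sample))
print(z_score(100, sample))
```

With these placeholder values the two rules disagree: the 1.5*IQR rule flags 100, while its Z-score is a little under 3. That is the kind of comparison part (d) asks about.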
Question 2: Statistical Inference of Sample Mean
Work with ONLY the sodium column in the Nutrient data (column P). Test whether the population mean, µ, of sodium content in food is greater than 280 mg. Let σ = 625 mg and use a level of significance of α = 0.05. You will need to use a software program (Minitab, Excel) to calculate the sample mean, x̄, and to determine the sample size, n.
a) Write out the steps of the hypothesis test and draw a conclusion.
b) Report a 99% confidence interval for the population mean, µ, based on the sample mean, x̄.
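Because σ is given, a one-sample z procedure is a natural fit here. The sketch below shows a right-tailed z-test and a z-based confidence interval. The function names are hypothetical, and the `xbar` and `n` passed in at the bottom are made-up placeholders, since the real values must come from the sodium column.

```python
from math import sqrt
from statistics import NormalDist

def z_test_greater(xbar, mu0, sigma, n, alpha=0.05):
    """Right-tailed z-test of H0: mu = mu0 against Ha: mu > mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
    p_value = 1 - NormalDist().cdf(z)      # area to the right of z
    return z, p_value, p_value < alpha     # True means reject H0

def z_confidence_interval(xbar, sigma, n, level=0.99):
    """Two-sided z interval for mu when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# sigma = 625 and mu0 = 280 come from the question;
# xbar = 350 and n = 100 are placeholders, NOT the Nutrient data.
z, p_value, reject = z_test_greater(xbar=350.0, mu0=280.0, sigma=625.0, n=100)
low, high = z_confidence_interval(xbar=350.0, sigma=625.0, n=100)
print(z, p_value, reject)
print(low, high)
```

Replace the placeholders with the sample mean and sample size obtained from the software before drawing any conclusion.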
Question 3: Binomial Distribution
A virologist has just developed a new flu vaccine to treat a particularly resilient strain of the flu. The researcher would like to test her vaccine on patients. The probability of getting the flu with the vaccine is p = 0.3 for a given individual. Her research study looks at 45 individuals.
a) What is the probability that exactly 20 people will NOT get the flu, Pr(X = 20)? [Hint: pay attention to the definition of success and which p you are going to use.]
b) What is the probability that it will work on 2 or more people, i.e. 2 or more individuals will NOT get the flu, Pr(X ≥ 2)? [Hint: pay attention to how p is defined.]
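The binomial formula and the complement rule behind both parts can be sketched as follows. The helper names are hypothetical, and the calls at the bottom assume that "success" means NOT getting the flu, so that the success probability is 1 - 0.3; check that assumption against the hints before relying on it.

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_at_least(k, n, p):
    """Pr(X >= k), computed as 1 - Pr(X <= k - 1)."""
    return 1 - sum(binom_pmf(i, n, p) for i in range(k))

n = 45
p_success = 1 - 0.3  # assumption: success = NOT getting the flu

print(binom_pmf(20, n, p_success))      # Pr(X = 20)
print(binom_at_least(2, n, p_success))  # Pr(X >= 2)
```

Using the complement in `binom_at_least` means only Pr(X = 0) and Pr(X = 1) have to be added up, rather than the 44 terms from 2 to 45.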
Question 4: Regression Analysis (edunemploy data)
The U.S. Census Bureau published the following data (see edunemploy) on years of education and unemployment rate.
a) Plot the scatterplot of unemployment rate (y-axis) versus years of education (x-axis). Include the regression line. Describe the direction and strength of the relationship, and comment on whether there are any outliers (visually identified).
b) Determine the equation of the regression line relating y = unemployment rate to x = years of education.
c) What is the value of the slope of the equation? Write a sentence that interprets the slope in the context of these variables. It should start with "Every additional year of education..."
d) Based on the equation, what is the estimated unemployment rate for someone with 8 years of education? Is the estimate at x = 8 an average or a prediction? Compare the actual value with the estimated value. Why are they not equal?
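As a rough check on the software output for parts (b) to (d), the least-squares slope, intercept, fitted value and residual can be computed directly. The sketch below uses a hypothetical helper and three made-up (x, y) pairs that are NOT the edunemploy data.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Placeholder values for illustration only (not the edunemploy data).
years = [8, 12, 16]
rate = [11.0, 6.0, 4.0]

intercept, slope = least_squares_line(years, rate)
estimate_at_8 = intercept + slope * 8   # fitted value at x = 8
residual_at_8 = rate[0] - estimate_at_8  # actual minus estimated

print(intercept, slope)
print(estimate_at_8, residual_at_8)
```

The residual is the gap between the observed value and the value on the line, which is the quantity part (d) asks you to comment on.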