Problem 1: We are going to explore the price of regular, unleaded gasoline in the Milwaukee area.
a) What is the population of interest?
b) Find the prices of regular, unleaded gasoline at 45 gas stations in the Milwaukee area. You can do this using the suggested websites below, or you can drive around and record prices. Note that these websites typically list the cheapest prices first. Please take this into consideration when generating your sample of 45 gas stations, which should, in principle, be a random sample.
AAA: https://aaa.opisnet.com/index.aspx (Click on the "Automotive" tab along the top. Then along the left hand side click on "Fuel Prices". On the next screen click on "Launch Finder" under the heading "Fuel Price Finder")
milwaukeegasprices.com or gasbuddy.com
In the document you submit (Word or PDF), please indicate the source of your gasoline prices (which website you used or whether you drove around, how you randomly selected the 45 stations, etc.).
There is no need to upload your JMP file to D2L. However, your instructor/TA may ask you to provide this file, so please keep it saved.
c) In JMP, construct a histogram of the regular unleaded prices of gasoline. Describe the distribution by giving its shape, center, and spread according to the histogram.
d) Have JMP produce the following summary statistics:
Mean
Standard deviation
Median
The first quartile, Q1
The third quartile, Q3
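For readers who want to check their reasoning outside JMP, the following is a minimal Python sketch of the sampling concern in part (b) and the summary statistics in part (d). The listing of 200 prices is made up for illustration (it is not real Milwaukee data), and the names `listing`, `first_45`, and `sample` are hypothetical. Quartile conventions differ between software packages, so Q1 and Q3 computed this way may differ slightly from JMP's.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical listing of 200 station prices, sorted cheapest first,
# standing in for what a price website might display (made-up values).
listing = sorted(round(random.gauss(2.55, 0.08), 2) for _ in range(200))

# Taking the first 45 rows would oversample cheap stations;
# a simple random sample gives every listed station the same chance.
first_45 = listing[:45]
sample = random.sample(listing, 45)

mean = statistics.mean(sample)
sd = statistics.stdev(sample)                  # sample standard deviation (n - 1)
median = statistics.median(sample)
q1, _, q3 = statistics.quantiles(sample, n=4)  # one of several quartile conventions

print(f"mean of first 45 rows: {statistics.mean(first_45):.3f}")
print(f"mean of random sample: {mean:.3f}")
print(f"sd={sd:.3f}  median={median:.2f}  Q1={q1:.2f}  Q3={q3:.2f}")
```

Because the first 45 rows are the 45 cheapest prices in the listing, their mean can never exceed the mean of a random sample from the same listing, which is why the ordering of the website matters.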
Problem 2: We are interested in estimating the mean price of unleaded gasoline in the Milwaukee area. Please answer the following questions:
a) Using the data from Problem 1, have JMP determine a 99% confidence interval for the mean gasoline price. Report your answer as an interval of prices rounded to two decimal places.
b) Give an interpretation of this confidence interval.
c) AAA reports that the average price of gasoline in the Milwaukee area last month was $2.51. According to your data, can we say that the current mean gasoline price differs significantly from last month's average?
i. State the null and alternative hypotheses.
ii. Describe the assumptions of this hypothesis test to determine if the test statistic you are using is appropriate. Fully explain. Below are the four items you should comment on:
Does the Normality (or non-Normality) of your data set matter? Why or why not?
Is the population standard deviation, σ, known or unknown?
If σ is known, state what it is. If σ is unknown, state what we are using to estimate it.
Which distribution should we use to model probabilities related to the hypotheses?
iii. Determine the p-value using JMP. Below are two suggested ways of doing this:
JMP's Test Mean function
JMP's Distribution Calculator: This can be found via Help >> Sample Data >> Teaching Scripts >> Interactive Teaching Modules >> Distribution Calculator
iv. Make a decision and state your conclusion to the hypothesis test in context of the original problem. Use a significance level of α=0.01 (i.e. 1% significance level).
v. Compare the results of your significance test to the 99% confidence interval for the mean gasoline price per gallon. Is the confidence interval consistent with your conclusion in part (iv.)? Fully explain.
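The link between the 99% confidence interval in part (a) and the two-sided test at α=0.01 in part (c) can be sketched in Python. This is only an illustration under stated assumptions: the 45 prices are randomly generated stand-ins (not real data), the t distribution is used because σ is estimated by the sample standard deviation, and the helper functions `t_pdf`, `t_upper_tail`, and `t_critical` are hypothetical names that approximate t probabilities numerically rather than reproducing JMP's internals.

```python
import math
import random
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 with Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)

def t_critical(conf, df):
    """Two-sided critical value t*, found by bisection on the upper tail."""
    lo, hi = 0.0, 50.0
    target = (1 - conf) / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

random.seed(2)
# Made-up sample of 45 prices; replace with your own recorded prices.
prices = [round(random.gauss(2.55, 0.08), 2) for _ in range(45)]
mu0 = 2.51  # last month's average price from the problem statement

n = len(prices)
xbar = statistics.mean(prices)
s = statistics.stdev(prices)   # estimates the unknown sigma
se = s / math.sqrt(n)
df = n - 1

t_star = t_critical(0.99, df)
ci = (xbar - t_star * se, xbar + t_star * se)

t_stat = (xbar - mu0) / se
p_value = 2 * t_upper_tail(abs(t_stat), df)  # two-sided alternative

reject = p_value < 0.01
outside = not (ci[0] <= mu0 <= ci[1])
print(f"99% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
print(f"reject H0 at 1%: {reject}; 2.51 outside CI: {outside}")
```

The last line illustrates the duality asked about in part (v.): a two-sided test at α=0.01 rejects the null value exactly when that value falls outside the 99% confidence interval.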
Problem 3: Using the JMP data set HousesProject.jmp, we want to determine if there is a significant difference in the mean price of a 3-bedroom home compared to the mean price of a 4-bedroom home.
a) Give the summary statistics for the price of a 3-bedroom home versus a 4-bedroom home. The easiest way to generate this is to go to Analyze >> Distribution, use "Price" in "Y, Columns", and use "Beds" in the "By" window.
b) Create a side-by-side boxplot comparison between the price of 3-bedroom versus 4-bedroom homes. The easiest way to generate this is to use Graph Builder. Go to Graph >> Graph Builder and drag "Price" into the Y area and "Beds" into the X area. Then click on the boxplot icon along the top. Comment on the spread of the distributions and also on the medians of the distributions.
c) Is the mean house price for a 3-bedroom home significantly less than the mean house price for a 4-bedroom home?
State the null and alternative hypotheses.
Use JMP to produce an output to test the difference in the means. Identify the appropriate p-value on the output.
Make a decision on the test at a significance level of α=0.02.
State your conclusion to the question above in context.
d) Give the 95% confidence interval from the JMP output you used in part (c).
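As a rough check on what a two-sample comparison of means computes, here is a minimal Python sketch of the unequal-variances (Welch) t statistic and its approximate degrees of freedom. The two price lists are hypothetical values in thousands of dollars, not rows from HousesProject.jmp, and whether your JMP output uses unequal or pooled variances depends on the option you choose there.

```python
import math
import statistics

# Hypothetical list prices in thousands of dollars (not from HousesProject.jmp).
three_bed = [189, 205, 174, 220, 198, 210, 185, 192]
four_bed = [245, 230, 275, 260, 228, 290, 251]

n1, n2 = len(three_bed), len(four_bed)
m1, m2 = statistics.mean(three_bed), statistics.mean(four_bed)

# Estimated variance of each sample mean: s^2 / n.
v1 = statistics.variance(three_bed) / n1
v2 = statistics.variance(four_bed) / n2

# Standard error of the difference in means, without pooling variances.
se = math.sqrt(v1 + v2)
t_stat = (m1 - m2) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(f"difference in means (3-bed minus 4-bed): {m1 - m2:.2f}")
print(f"t = {t_stat:.3f} on about {df:.1f} degrees of freedom")
```

For the one-sided alternative in part (c), that 3-bedroom homes cost less on average, the p-value is the lower-tail area below this t statistic, so a large negative t supports the alternative.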
Problem 4: Using the JMP data set HousesProject.jmp, we want to determine if the size of the house (SquareFeet) can predict the list price (Price) of the home.
a) Produce a scatterplot of Price (y axis) versus SquareFeet (x axis). Describe the form, direction, and strength of the relationship between Price and SquareFeet. Note any potential outliers.
b) Using JMP, estimate the correlation coefficient between Price and SquareFeet.
c) (Optional - worth 0.5 bonus points) Determine the simple linear regression line to predict Price using SquareFeet. In the JMP output, is the relationship significant at the 5% level? Justify your answer.
d) (Optional - worth 0.5 bonus points) What is the slope b_1? Give the interpretation of what it means about the Price with respect to SquareFeet.
e) (Optional - worth 0.5 bonus points) Using the regression equation, predict the price of a 2000-square-foot home.
f) (Optional - worth 0.5 bonus points) What percent of the variation in Price can be explained by this regression equation?
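The quantities in parts (b) through (f) are all built from the same sums of squares, which the following Python sketch makes explicit. The eight (SquareFeet, Price) pairs are hypothetical, with prices in thousands of dollars; they are not taken from HousesProject.jmp.

```python
import math

# Hypothetical data: square footage and list price in thousands of dollars.
sqft = [1200, 1500, 1700, 1850, 2100, 2400, 2800, 3100]
price = [150, 172, 195, 199, 240, 255, 310, 330]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n

sxx = sum((x - x_bar) ** 2 for x in sqft)
syy = sum((y - y_bar) ** 2 for y in price)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))

r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
b1 = sxy / sxx                   # slope: change in price per extra square foot
b0 = y_bar - b1 * x_bar          # intercept

predicted_2000 = b0 + b1 * 2000  # predicted price of a 2000-square-foot home

# Proportion of variation in price explained by the line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, price))
r_squared = 1 - sse / syy

print(f"r = {r:.4f}, b1 = {b1:.4f}, b0 = {b0:.2f}")
print(f"predicted price at 2000 sq ft: {predicted_2000:.1f} (thousands)")
print(f"R^2 = {r_squared:.4f}")
```

In simple linear regression, R^2 equals the square of the correlation coefficient, so the answers to parts (b) and (f) can be checked against each other.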
Problem 5: In 1912 the British luxury passenger ship Titanic struck an iceberg and sank on its way to New York City. Think of the Titanic disaster as an experiment in how the people of that time behaved when faced with death in a situation where only some can escape, and consider the passengers from the data file Titanic.jmp as a sample from the population of their peers. We want to determine if economic status and survival are independent.
|                 | Survival Status |          |
| Economic Status | Died            | Survived |
| Highest         | 117             | 187      |
| Middle          | 526             | 186      |
| Lowest          | 163             | 112      |
a) State the null and alternative hypotheses.
b) Produce a contingency table output in JMP. In this table, have JMP display the "Count," "Expected," and "Cell Chi Square" values.
c) Give the p-value and the decision from the test at the 5% significance level.
d) What do you conclude from this significance test at the 5% level? State your conclusion in the context of the problem.
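The "Expected" and "Cell Chi Square" values requested in part (b) can be reproduced by hand from the table of counts given in this problem. The Python sketch below computes the Pearson chi-square statistic; it uses the fact that, for 2 degrees of freedom only, the upper-tail probability of the chi-square distribution is exactly exp(-x/2). Variable names are illustrative, and JMP may report additional statistics alongside the Pearson one.

```python
import math

# Observed counts from the table in this problem: (Died, Survived).
observed = {
    "Highest": (117, 187),
    "Middle": (526, 186),
    "Lowest": (163, 112),
}

row_totals = {status: sum(counts) for status, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())

expected = {}
chi_square = 0.0
for status, counts in observed.items():
    # Expected count under independence: row total * column total / grand total.
    expected[status] = tuple(row_totals[status] * c / grand_total for c in col_totals)
    for obs, exp in zip(counts, expected[status]):
        cell = (obs - exp) ** 2 / exp  # this cell's contribution to chi-square
        chi_square += cell
        print(f"{status:8s} observed={obs:4d} expected={exp:8.2f} cell={cell:6.2f}")

df = (len(observed) - 1) * (len(col_totals) - 1)
p_value = math.exp(-chi_square / 2)  # exact upper tail only because df == 2

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.3g}")
```

Cells with the largest contributions show where the observed counts depart most from what independence of economic status and survival would predict.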