1) Jet Blue Airlines examined the bags of 80 passengers and found that 20% of the bags were overweight.
a) Based on this sample, what is the 95% confidence interval for the proportion of bags that are overweight?
b) What is the minimum sample size the airline would need in order to estimate the percentage of overweight bags with 95% confidence and a margin of error of +/- 3%?
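The arithmetic behind both parts can be sketched in Python as a cross-check. This is a minimal sketch and not the required solution format: it assumes the normal-approximation (Wald) interval for a proportion, and it assumes the sample value of 20% is used as the planning proportion in part b. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p_hat = 0.20        # sample proportion of overweight bags (given)
n = 80              # number of passengers sampled (given)
confidence = 0.95

# Two-sided critical value of the standard normal (about 1.96 for 95%).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Part a: normal-approximation confidence interval for a proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * se
ci_low, ci_high = p_hat - margin, p_hat + margin

# Part b: smallest n whose margin of error is at most 3 percentage points,
# assuming p_hat = 0.20 is used as the planning value.
target_margin = 0.03
n_required = math.ceil(z**2 * p_hat * (1 - p_hat) / target_margin**2)

print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"Required sample size: {n_required}")
```

Using 0.5 instead of 0.20 as the planning proportion would give a larger, more conservative sample size; which one is expected depends on the course convention.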
2) A factory recently took a sample to assess the quality of its candy output, looking at three different types of candy, and how many of each type of candy were damaged during the manufacturing process:
Candy | # damaged | Total # candies counted
Apple hard candy | 15 | 50
Chocolate chew | 18 | 50
Nut cluster | 30 | 100
The factory management would like to determine whether the proportion of candy that is damaged is different for these three types of candy.
a) Construct a contingency table for these data.
b) Is the proportion of candy that is damaged different for these three types of candy? (Calculate the appropriate statistic, give the p-value, and state your conclusion.)
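One way to check the calculation for this problem is a Pearson chi-square test of independence on the 3 x 2 table of candy type by damaged/undamaged. The sketch below assumes that test is the intended one. It uses only the standard library and relies on the fact that, for 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2).

```python
import math

# Counts from the problem statement.
damaged = {"Apple hard candy": 15, "Chocolate chew": 18, "Nut cluster": 30}
totals = {"Apple hard candy": 50, "Chocolate chew": 50, "Nut cluster": 100}

# Contingency table: each row is (damaged, not damaged).
observed = {name: (damaged[name], totals[name] - damaged[name]) for name in damaged}

grand_total = sum(totals.values())
col_totals = (sum(damaged.values()), grand_total - sum(damaged.values()))

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for name, row in observed.items():
    for j, obs in enumerate(row):
        expected = totals[name] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (2 - 1)   # (rows - 1) * (columns - 1) = 2

# For df = 2 only, the chi-square survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) would mean the sample gives no evidence that the damage proportions differ across the three candy types.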
3) A manufacturer of headphones is interested in the sales of a particular headphone model in its stores in 8 airports. Some of these stores are located on the West coast and some on the East coast of the U.S. The manufacturer recently ran an advertising campaign in February, using billboards in the airports. The sales before and after the campaign are shown below (i.e., sales in each store in January and sales in the same stores in March).
(Some descriptive statistics have also been provided in the table. You will need to decide which ones you need for your calculations in answering the questions below.)
Store | Location | Sales in Jan | Sales in March | Change in sales
1 | East coast | 195 | 230 | 35
2 | East coast | 220 | 240 | 20
3 | East coast | 220 | 250 | 30
4 | East coast | 245 | 265 | 20
5 | West coast | 130 | 157 | 27
6 | West coast | 130 | 140 | 10
7 | West coast | 80 | 99 | 19
8 | West coast | 185 | 207 | 22
Summary statistics | Statistic | Sales in Jan | Sales in March | Change in sales
All stores | Mean | 175.63 | 198.50 | 22.88
All stores | SD | 56.72 | 59.65 | 7.68
East coast | Mean | 220.00 | 246.25 | 26.25
East coast | SD | 20.41 | 14.93 | 7.50
West coast | Mean | 131.25 | 150.75 | 19.50
West coast | SD | 42.89 | 44.71 | 7.14
To get full points when answering each part below be sure to: calculate an appropriate statistic, state the result of the test, and state your conclusion.
a) Looking at all the stores, is there a difference in sales between January and March?
b) Did the campaign have a different effect on sales for stores on the East coast versus on the West coast?
c) Was there a difference in sales in January for stores on the East coast versus on the West coast?
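The three parts map naturally onto t-tests, and the statistics can be cross-checked from the raw table. The sketch below makes several assumptions: part a is treated as a paired t-test on the change in sales, parts b and c as independent-samples t-tests with pooled (equal) variances, and significance is judged against standard two-tailed critical values at alpha = 0.05 taken from a t-table (2.365 for 7 df, 2.447 for 6 df). The helper names `paired_t` and `pooled_t` are illustrative.

```python
import math
from statistics import mean, stdev, variance

# Data from the table (stores 1-4 are East coast, 5-8 are West coast).
jan = [195, 220, 220, 245, 130, 130, 80, 185]
march = [230, 240, 250, 265, 157, 140, 99, 207]
change = [m - j for j, m in zip(jan, march)]

east, west = slice(0, 4), slice(4, 8)

# Two-tailed critical t values at alpha = 0.05, keyed by degrees of freedom.
T_CRIT = {6: 2.447, 7: 2.365}

def paired_t(diffs):
    """One-sample t statistic on paired differences; returns (t, df)."""
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1

def pooled_t(x, y):
    """Independent-samples t statistic assuming equal variances; returns (t, df)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(sp2 * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, nx + ny - 2

t_a, df_a = paired_t(change)                      # a) Jan vs March, all stores
t_b, df_b = pooled_t(change[east], change[west])  # b) change: East vs West
t_c, df_c = pooled_t(jan[east], jan[west])        # c) Jan sales: East vs West

for label, t, df in [("a", t_a, df_a), ("b", t_b, df_b), ("c", t_c, df_c)]:
    verdict = "significant" if abs(t) > T_CRIT[df] else "not significant"
    print(f"{label}) t({df}) = {t:.2f}, {verdict} at alpha = 0.05")
```

The same t values follow from the summary statistics in the table; for example, part a is 22.88 / (7.68 / sqrt(8)). If the equal-variance assumption is doubtful for part c, where the group SDs differ noticeably, a Welch test would be the alternative.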
4) Below are data for 40 houses located in one of two neighborhoods (A or B). (This data is also provided in an Excel spreadsheet on the website for the class. Open the data in SPSS and conduct the analyses required to answer the questions. Be sure to paste output (i.e., tables) from SPSS into your answers where that is requested or else you will lose points.)
Neighborhood | Appraised Land Value | Appraised Value of Improvements | Sale Price | Has a yard? (yes/no)
A | 56658 | 53806 | 255000 | no
A | 93200 | 11121 | 422000 | no
A | 76125 | 78172 | 290000 | no
A | 28996 | 5864 | 305900 | no
A | 30000 | 64831 | 118500 | yes
A | 30000 | 50765 | 93900 | yes
A | 46651 | 8573 | 191500 | yes
A | 45990 | 91402 | 184000 | yes
A | 42394 | 98181 | 168000 | yes
A | 47751 | 3351 | 169000 | yes
A | 63596 | 2182 | 208500 | yes
A | 51428 | 72451 | 264000 | yes
A | 54360 | 61934 | 237000 | yes
A | 65376 | 34458 | 286500 | yes
A | 42400 | 15046 | 202500 | yes
A | 40800 | 92606 | 168000 | yes
A | 12170 | 22786 | 375000 | yes
A | 24637 | 90598 | 169900 | yes
A | 30600 | 80858 | 135000 | yes
A | 44730 | 99047 | 176000 | yes
B | 38979 | 25946 | 140000 | no
B | 14861 | 59258 | 74900 | no
B | 14976 | 48957 | 57300 | no
B | 15244 | 55169 | 87500 | no
B | 18260 | 59267 | 82000 | no
B | 16680 | 55525 | 78000 | no
B | 53421 | 19792 | 175000 | no
B | 31417 | 99413 | 185000 | no
B | 32311 | 75343 | 123000 | no
B | 26817 | 78726 | 108000 | no
B | 24564 | 66533 | 108000 | no
B | 24564 | 71149 | 112900 | no
B | 27640 | 85347 | 106000 | no
B | 29656 | 78968 | 147500 | no
B | 13440 | 41177 | 61000 | yes
B | 45765 | 81227 | 320000 | yes
B | 16680 | 72867 | 99500 | yes
B | 17020 | 61935 | 93000 | yes
B | 25751 | 82259 | 110000 | yes
B | 25751 | 64568 | 100500 | yes
a) Give appropriate summary statistics (one measure of central tendency and one measure of variation) for each of the 3 variables Appraised Land Value, Appraised Value of Improvements, and Sale Price, calculated separately for neighborhoods A and B. Important: PROVIDE ONLY ONE (APPROPRIATE) CENTRAL TENDENCY MEASURE AND ONE (APPROPRIATE) MEASURE OF VARIATION FOR EACH VARIABLE FOR EACH NEIGHBORHOOD.
b) Based on this data sample, do neighborhoods A and B differ in the number of houses with and without yards? In your answer be sure to calculate an appropriate statistic, state the result of the test, and state your conclusion. (Paste the output from SPSS for the statistical test that you do in your answer, as well as stating your conclusion and writing out the appropriate statistic that supports your conclusion.)
c) Based on this data sample, do houses in neighborhoods A and B have different sale prices? (In your answer be sure to calculate an appropriate statistic, state the result of the test and state your conclusion.) (Paste the output from SPSS for the statistical test that you do in your answer, as well as stating your conclusion and writing out the appropriate statistic that supports your conclusion.)
d) Provide a correlation matrix for Appraised Land Value, Appraised Value of Improvements and Sale Price for neighborhood B only (you will need to split the data to do this - in SPSS under the Data menu use the "split file" command, split by neighborhood, and select "organize output by groups"). In words, explain the meaning of the correlation between Sale price and Appraised Land Value and the meaning of the correlation between Appraised Land Value and Appraised Value of Improvements.
Note: make sure you deselect "split file" after completing this part, so that you are analyzing all the cases for the next two parts.
e) Imagine you are interested in the relationship between house Sale price and Appraised Land Value while controlling for any effects of Appraised Value of Improvements. Conduct a linear regression that allows you to test this relationship (using data for all the houses, i.e., from both neighborhoods). State your conclusion about the relationship, and provide the statistics that support your conclusion.
f) Imagine you are interested in the relationship between house Sale price and Neighborhood, while controlling for any effects of Appraised Land Value and Appraised Value of Improvements on Sale price. Conduct a linear regression that allows you to test this relationship. State your conclusion about the relationship, and provide the statistics that support your conclusion. (Paste your SPSS output for this regression into your answer.)
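The problem requires SPSS output, but the test statistic for part b can be sanity-checked by hand. The sketch below assumes a Pearson chi-square test of independence (without continuity correction) on the 2 x 2 table of neighborhood by yard, with the cell counts tallied from the data table above. For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)).

```python
import math

# Counts tallied from the house table: (no yard, has yard) per neighborhood.
observed = {"A": (4, 16), "B": (14, 6)}

row_totals = {name: sum(row) for name, row in observed.items()}
col_totals = tuple(sum(row[j] for row in observed.values()) for j in range(2))
grand_total = sum(row_totals.values())

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for name, row in observed.items():
    for j, obs in enumerate(row):
        expected = row_totals[name] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = 1  # (2 - 1) * (2 - 1)

# For df = 1 only, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

If the SPSS output also lists a continuity-corrected value for the 2 x 2 table, that figure will be somewhat smaller than the uncorrected Pearson statistic computed here.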