Question 1. The following table presents the total sample size and the percentage distribution of a grouped variable, "Hours of Shopping this Year." Calculate the mean and the variance. Note that you need to find "frequency (fi)", "midpoint (Yi)", and "deviation (di)" for each row before you can get the mean and the variance.
Hours Shopping | Percentage (%) | Frequency (fi) | Midpoint (Yi) | Deviation (di = Yi - Ȳ) | di² | di²(fi)
1-4 | 5 | | | | |
5-8 | 25 | | | | |
9-12 | 15 | | | | |
13-16 | 30 | | | | |
17-20 | 5 | | | | |
21-24 | 20 | | | | |
Total | 100.0 | 550 | | | |
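The column-by-column procedure for Question 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: each midpoint is taken as the average of the stated class limits, and the variance is shown both with N and with N - 1 in the denominator, since the question does not say which convention is expected. All variable names are made up for the sketch.

```python
# Percentages and class limits come from the table; N = 550 is the stated total.
N = 550
classes = [((1, 4), 5), ((5, 8), 25), ((9, 12), 15),
           ((13, 16), 30), ((17, 20), 5), ((21, 24), 20)]

rows = []
for (low, high), pct in classes:
    f = pct / 100 * N        # frequency fi = percentage share of N
    y = (low + high) / 2     # midpoint Yi (assumption: average of the stated limits)
    rows.append((f, y))

# Grouped mean: sum of fi * Yi divided by N.
mean = sum(f * y for f, y in rows) / N

# Sum of di^2 * fi, where di = Yi - mean.
ss = sum(f * (y - mean) ** 2 for f, y in rows)

variance_pop = ss / N            # assumption: population formula
variance_sample = ss / (N - 1)   # alternative: sample formula

print(round(mean, 2), round(variance_pop, 2), round(variance_sample, 2))
```

Note that 5% of 550 is 27.5, so some of the frequencies in this table are not whole numbers.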
Question 2. One of the statistics TAs believes that cats are more popular than dogs. Test this hypothesis using the data tables provided below.
Cats
Popularity | Freq | Percent | Cum. %
low | 15 | 75 | 75
high | 5 | 25 | 100
Total: 20
Dogs
Popularity | Freq | Percent | Cum. %
low | 22 | 68.75 | 68.75
high | 10 | 31.25 | 100
Total: 32
1: State the research and null hypotheses in symbolic form.
2: Perform a t-test.
3: Find the critical value of t at the .05 alpha level.
4: Make a decision about the null hypothesis and interpret the result.
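The four steps above can be sketched in Python. This is a hedged illustration, not the required method: it assumes popularity is coded low = 0 and high = 1, reads the research hypothesis as one-tailed (mean for cats greater than mean for dogs, with the null that it is less than or equal), and uses a pooled-variance two-sample t-test. A course may expect a different formula, such as one with unpooled standard errors. The critical value is the standard one-tailed t-table entry for 50 degrees of freedom.

```python
import math

# Coding assumption: low = 0, high = 1, so each mean is the share rated "high".
cats = [0] * 15 + [1] * 5     # frequencies from the Cats table
dogs = [0] * 22 + [1] * 10    # frequencies from the Dogs table

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    # Sample variance with n - 1 in the denominator.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(cats), len(dogs)
df = n1 + n2 - 2

# Pooled variance and standard error of the difference between means.
pooled = ((n1 - 1) * sample_var(cats) + (n2 - 1) * sample_var(dogs)) / df
se = math.sqrt(pooled * (1 / n1 + 1 / n2))

t_stat = (mean(cats) - mean(dogs)) / se

T_CRIT = 1.676  # t-table value: one-tailed, alpha = .05, df = 50
reject_null = t_stat > T_CRIT

print(df, round(t_stat, 3), reject_null)
```

Under these assumptions the obtained t is negative, so it cannot exceed the positive critical value and the null hypothesis is not rejected.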
Question 3. The following equation is a predicted regression line based on an analysis of a sample of 2,500 people. "Happiness" is the dependent continuous variable, measured on a 100-point happiness scale. Income is a dummy variable coded 0 for low-income, 1 for middle-income, and 2 for upper-income.
Happiness = a + b * Income
Here is the STATA output:
Happiness | Coefficient | St. Error | t | P>t
Income | 22.45 | 4.76 | 7.65 | .000
Constant | 19.1 | 4.09 | 2.25 | .029
Observations = 2500; F(1, 2498) = 13.32; Prob > F = .0001; R^2 = .217; Adj. R^2 = .201
Answer the following questions.
1. Are the variables happiness and income related, and if so, by how much? How do you know this?
2. What is the strength of this relationship?
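As a rough illustration of how the bivariate output translates into predictions, the sketch below plugs the reported constant and Income coefficient into the equation and derives the correlation from R^2. The function and variable names are made up for this example, and taking the positive square root of R^2 assumes the sign of the Income coefficient.

```python
import math

A, B = 19.1, 22.45    # Constant and Income coefficient from the output
R_SQUARED = 0.217     # R^2 from the output

def predicted_happiness(income):
    # Happiness = a + b * Income
    return A + B * income

# Predicted happiness for each income code (0 = low, 1 = middle, 2 = upper).
predictions = {code: predicted_happiness(code) for code in (0, 1, 2)}

# In a bivariate regression, the correlation is the square root of R^2,
# with the sign of b (positive here).
r = math.sqrt(R_SQUARED)

print(predictions, round(r, 3))
```

Each one-step increase in the income code raises predicted happiness by the coefficient, 22.45 points, and R^2 = .217 means income accounts for about 21.7% of the variation in happiness.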
Happiness = a + b1 * Income + b2 * Health - b3 * Age
The above equation is a predicted multivariable regression line based on an analysis of a sample of 2,500 people. "Happiness" is the dependent continuous variable, measured on a 100-point happiness scale. Health is a continuous measure, scored from 0 (poor health) to 10 (excellent health). Income is a dummy variable coded 0 for low-income and 1 for middle- to upper-income. Age is a continuous measure, in years.
Here is the STATA output:
Happiness | Coefficient | St. Error | t | P>t
Income | 15.6 | 3.76 | 3.65 | .001
Health | 10.4 | 1.77 | 3.09 | .001
Age | -5.9 | .87 | 2.76 | .01
Constant | 19.1 | 4.09 | 2.25 | .029
Observations = 2500; F(1, 2498) = 9.44; Prob > F = .0001; R^2 = .381; Adj. R^2 = .341
Answer the following.
3. Compare the coefficients from this model to the bivariate model above. How are they different, and why do you think they are different? Which is a better model?
4. Write the predicted equations for both the bivariate and multivariate regressions. Use variable names instead of x and y.
5. Your friend is planning to make a New Year's resolution to be happier next year. What would you recommend she focus on?
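One way to explore the multivariate output is to plug the reported coefficients into a small prediction function and compare the two models by adjusted R^2. The sketch below is illustrative only: the respondent values (health 5 and 6, age 30) are hypothetical, and all names are invented for the example.

```python
# Coefficients and constant as reported in the multivariate output.
COEF = {"Income": 15.6, "Health": 10.4, "Age": -5.9}
CONSTANT = 19.1

def predicted_happiness(income, health, age):
    return (CONSTANT
            + COEF["Income"] * income
            + COEF["Health"] * health
            + COEF["Age"] * age)

# Effect of a one-point health improvement, holding income and age fixed
# (hypothetical respondent: low income, age 30).
gain_health = predicted_happiness(0, 6, 30) - predicted_happiness(0, 5, 30)

# Adjusted R^2 penalizes extra predictors, so it is one common basis
# for comparing models with different numbers of variables.
adj_r2 = {"bivariate": 0.201, "multivariate": 0.341}
better = max(adj_r2, key=adj_r2.get)

print(round(gain_health, 1), better)
```

Holding the other variables constant, the difference between two predictions that differ by one unit on a single variable equals that variable's coefficient.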
Question 4. The following table presents data on the relationship between hours played per week on a video game (Y) and experience level growth in the game (X).
Variable | Hours playing video game per week | Experience level growth in game | Mean | S.D.
Hours playing video game per week | 1.00 | .88 | 21.2 | 3.9
Experience level growth in game | | 1.00 | 9 | 2.3
N = 154
1. Find the coefficients.
R =
b =
β =
2. Find the constant.
a =
3. Find the equation.
Ȳ =
4. Fill out the ANOVA table for the regression.
Source | Sum of Squares | Degrees of freedom | Mean Squares | F-statistic
Regression | | | |
Error | | | |
Total | | | |
Interpret the result of the F-statistic: can you reject the null at the alpha level of .05?
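The calculations in Question 4 can be sketched from the summary statistics alone. This is a minimal illustration under stated assumptions: hours played is treated as Y and experience growth as X, the standard deviations are taken to be sample values (N - 1 denominator), and the critical F is the approximate table value for (1, 152) degrees of freedom at alpha = .05. Variable names are invented for the sketch.

```python
r, n = 0.88, 154
mean_y, sd_y = 21.2, 3.9    # hours playing per week (Y)
mean_x, sd_x = 9.0, 2.3     # experience level growth (X)

# Coefficients: unstandardized slope, standardized slope, and constant.
b = r * sd_y / sd_x
beta = b * sd_x / sd_y      # equals r in a bivariate regression
a = mean_y - b * mean_x

# ANOVA for regression, built from r and the variance of Y.
ss_total = (n - 1) * sd_y ** 2   # assumption: sd_y uses an N - 1 denominator
ss_reg = r ** 2 * ss_total
ss_err = ss_total - ss_reg

df_reg, df_err = 1, n - 2
df_total = n - 1

ms_reg = ss_reg / df_reg
ms_err = ss_err / df_err
f_stat = ms_reg / ms_err

F_CRIT = 3.90  # approximate F-table value: alpha = .05, df = (1, 152)
reject_null = f_stat > F_CRIT

print(round(b, 3), round(beta, 2), round(a, 3))
print(round(ss_reg, 2), round(ss_err, 2), round(ss_total, 2), round(f_stat, 2))
```

With these inputs the F-statistic is far above the critical value, so the null hypothesis of no linear relationship would be rejected at the .05 level.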