Assignment:
1. Use Excel to find the mean of your players' slugging percentages and the mean of your players' salaries. Classify each of your 40 observations as falling into one of the following four categories.
1. Value of slugging percentage is less than or equal to the mean and value of salary is less than or equal to the mean.
(slugging percentage low, salary low)
2. Value of slugging percentage is less than or equal to the mean, but value of salary is greater than the mean.
(slugging percentage low, salary high)
3. Value of slugging percentage is greater than the mean, but value of salary is less than or equal to the mean.
(slugging percentage high, salary low)
4. Value of slugging percentage is greater than the mean and value of salary is greater than the mean.
(slugging percentage high, salary high)
1. Enter into the table the frequencies of each of these categories in your data. The four numbers that you enter in the table should add up to 40.
2. If slugging percentage and salary are independent, then we would expect that salary is equally likely to be high, regardless of whether slugging percentage is high or low. Given the sums of the two rows and two columns that you found in problem 1, what would you expect the table to look like if slugging percentage and salary are independent? Round your answers to the nearest tenth.
3. Using the tables from 1. and 2., test at the 5% significance level the hypothesis that slugging percentage and salary are independent. Your null hypothesis is that they are independent and the alternative hypothesis is that they are dependent.
i) The test statistic follows a chi-square distribution with how many degrees of freedom?
ii) What is the value of the test statistic?
iii) What is the critical value for this test?
iv) Do you reject the hypothesis that slugging percentage and salary are independent at the 5% significance level?
4. In two sentences, describe at least two things you've learned by analyzing your data this semester.
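The arithmetic behind problems 2 and 3 above can be sketched in Python as a cross-check on an Excel worksheet. This is only an illustration under stated assumptions: the observed counts are made up (they are not taken from any data set in this assignment), the function names are hypothetical, and the critical value 3.841 is the standard chi-square table value for 1 degree of freedom at the 5% level.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Made-up counts for illustration only; rows are salary (low, high),
# columns are slugging percentage (low, high). They sum to 40.
observed = [[12, 8],
            [10, 10]]

expected = expected_counts(observed)
statistic = chi_square_statistic(observed, expected)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Table value for a chi-square distribution with 1 degree of freedom, 5% level.
CRITICAL_VALUE_5_PERCENT_DF1 = 3.841
reject_independence = statistic > CRITICAL_VALUE_5_PERCENT_DF1

print(expected, round(statistic, 3), degrees_of_freedom, reject_independence)
```

To apply the sketch to your own work, replace the made-up counts with the four frequencies from your table.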
Example for problem 1:
Suppose that my mean slugging percentage was 0.425 and my mean salary was $1,500,000. Then suppose the table below describes slugging percentage and salary for three players:
Player's Name | Position | Slugging Percentage | Salary
Cirillo, Jeff | IF | 0.293 | 6975000
Davis, Ben | C | 0.4 | 1400000
Estrada, Johnny | C | 0.45 | 312500
Each of these players goes into the following categories.
            | Slugging percentage low | Slugging percentage high | Total
Salary low  | Ben Davis               | Johnny Estrada           |
Salary high | Jeff Cirillo            |                          |
Total       |                         |                          |
For example, Jeff Cirillo goes into the lower-left box because his slugging percentage (0.293) is below the mean and his salary ($6,975,000) is above the mean.
You need to classify each of your forty players by comparing their slugging percentages and salaries to the means for each variable. Make sure you use your own mean for slugging percentage and your own mean for salary, not the numbers used in this example.
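The classification rule from the example can also be written as a short Python sketch. It uses only the three example players and the example means (0.425 and $1,500,000) given above; the function and variable names are hypothetical, and your own means will differ.

```python
from collections import Counter


def classify(slugging, salary, mean_slugging, mean_salary):
    """Return (slugging category, salary category); a value equal to the mean counts as low."""
    slugging_category = "low" if slugging <= mean_slugging else "high"
    salary_category = "low" if salary <= mean_salary else "high"
    return slugging_category, salary_category


# Means and players from the example for problem 1, not from your data.
MEAN_SLUGGING = 0.425
MEAN_SALARY = 1_500_000

players = [
    ("Cirillo, Jeff", 0.293, 6_975_000),
    ("Davis, Ben", 0.4, 1_400_000),
    ("Estrada, Johnny", 0.45, 312_500),
]

# Tally how many players fall into each of the four categories.
counts = Counter(
    classify(slugging, salary, MEAN_SLUGGING, MEAN_SALARY)
    for _, slugging, salary in players
)

for name, slugging, salary in players:
    print(name, classify(slugging, salary, MEAN_SLUGGING, MEAN_SALARY))
```

With all forty players in the list, the four tallies in `counts` are the frequencies to enter in the table.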
1. Use Excel to compute the mean and standard deviation of your salary data.
Also, sort your data and write down the 16th value (call that L) and the 25th value (call that H). You will use your values for the sample mean, the sample standard deviation, L and H in the rest of this problem. Assume that the sample standard deviation is the true standard deviation.
2. What is the standard deviation of the mean? Remember that your sample size is 40.
3. You will now test the hypothesis that the true mean is equal to L against the alternative that the true mean is less than L.
i) Write down the null and alternative hypotheses.
ii) Is this a one-tailed test or a two-tailed test?
iii) What is the critical value for this test if α = 0.05? Please make sure to state your answer as a critical value for salary. Your answer should not be a z-value.
iv) Do we accept or reject the null hypothesis that the true mean is equal to L at the 5% significance level?
4. Now you will test the hypothesis that the true mean is equal to H against the alternative that the true mean is not equal to H.
i) Write down the null and alternative hypotheses.
ii) Is this a one-tailed test or a two-tailed test?
iii) What are the critical values for this test if α = 0.10? Please make sure to state your answer as critical values for salary. Your answer should not be a z-value.
iv) Do we accept or reject the null hypothesis that the true mean is equal to H at the 10% significance level?
5. If you accepted the null hypothesis in Q4, what is the smallest significance level at which you would be able to reject the null hypothesis? If you rejected the null hypothesis in Q4, what is the biggest significance level for which you could accept the null hypothesis?
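The calculations in problems 2 through 5 above can be sketched in Python as a cross-check on Excel. This is a rough illustration that assumes a known standard deviation, as the problem states, so it uses the standard normal distribution. All numbers below are hypothetical placeholders rather than values from the data, and the function names are invented for this sketch.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def standard_error(sigma, n):
    """Standard deviation of the sample mean."""
    return sigma / sqrt(n)


def lower_tail_critical_value(mu0, sigma, n, alpha):
    """Critical value, in data units, for H0: mu = mu0 against H1: mu < mu0."""
    return mu0 + STANDARD_NORMAL.inv_cdf(alpha) * standard_error(sigma, n)


def two_tail_critical_values(mu0, sigma, n, alpha):
    """Lower and upper critical values, in data units, for H1: mu != mu0."""
    half_width = STANDARD_NORMAL.inv_cdf(1 - alpha / 2) * standard_error(sigma, n)
    return mu0 - half_width, mu0 + half_width


def two_tail_p_value(sample_mean, mu0, sigma, n):
    """Significance level at which the two-tailed test is exactly on the boundary."""
    z = (sample_mean - mu0) / standard_error(sigma, n)
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


# Hypothetical placeholder values; substitute your own mean, standard deviation, L and H.
sample_mean, sigma, n = 2_800_000, 4_000_000, 40
L, H = 2_500_000, 3_000_000

critical_low = lower_tail_critical_value(L, sigma, n, alpha=0.05)
reject_L = sample_mean < critical_low

low, high = two_tail_critical_values(H, sigma, n, alpha=0.10)
reject_H = sample_mean < low or sample_mean > high

print(standard_error(sigma, n), critical_low, reject_L)
print(low, high, reject_H, two_tail_p_value(sample_mean, H, sigma, n))
```

The p-value returned by `two_tail_p_value` is the quantity problem 5 asks about: the null hypothesis is rejected at any significance level above it and not rejected at any level below it.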
Player's Name | V1 = Position | V2 = Slugging Percentage | V3 = Salary
Anderson, Marlon | IF | 0.379 | 600000
Belliard, Ron | IF | 0.426 | 1100000
Bennett, Gary | C | 0.329 | 600000
Bruntlett, Eric | IF | 0.519 | 307500
Carroll, Jamey | IF | 0.372 | 310000
Catalanotto, Frank | OF | 0.39 | 2300000
Cirillo, Jeff | IF | 0.293 | 6975000
Clayton, Royce | IF | 0.397 | 650000
DeRosa, Mark | IF | 0.32 | 725000
Everett, Adam | IF | 0.385 | 370000
Gerut, Jody | OF | 0.405 | 325600
Green, Shawn | IF | 0.459 | 16666667
Greene, Todd | C | 0.508 | 550000
Gutierrez, Ricky | IF | 0.3 | 4166667
Guzman, Cristian | IF | 0.384 | 3725000
Hammonds, Jeffrey | OF | 0.358 | 1000000
Jenkins, Geoff | OF | 0.473 | 8737500
Johnson, Charles | C | 0.43 | 9000000
Johnson, Reed | OF | 0.38 | 318000
Kendall, Jason | C | 0.39 | 8571429
Koskie, Corey | IF | 0.495 | 4500000
Marrero, Eli | OF | 0.52 | 3000000
Matos, Luis | OF | 0.333 | 975000
Menechino, Frank | IF | 0.091 | 400000
Merloni, Lou | IF | 0.426 | 560000
Miles, Aaron | IF | 0.368 | 300000
Miller, Corky | C | 0.026 | 317000
Monroe, Craig | OF | 0.488 | 335000
Offerman, Jose | DH | 0.395 | 500000
Perez, Timo | OF | 0.338 | 850000
Reese, Pokey | IF | 0.303 | 1000000
Rollins, Jimmy | IF | 0.455 | 2425000
Schneider, Brian | C | 0.399 | 350000
Sosa, Sammy | OF | 0.517 | 16000000
Thompson, Rich | OF | 0 | 300000
Valentin, Javier | C | 0.381 | 500000
Vizcaino, Jose | IF | 0.374 | 1200000
Werth, Jayson | OF | 0.486 | 303000
Wilson, Enrique | IF | 0.325 | 700000
Zeile, Todd | IF | 0.356 | 1000000