Create a multiple regression model based on your analysis


The games of the 28th Olympiad were held in Athens, Greece, in the summer of 2004. The data file contains information about 131 of the participating nations, including the following variables.

Medals (Won at Athens Olympics)
Gross Domestic Product ($ Billion)
Population (Million)
Area (Million Sq. Km.)
Infant Deaths Per 1,000 Births
Inflation Rate
Fertility Rate
GDP Growth Rate
Telephones (Million)

The final medal count is perhaps somewhat misleading as a measure of national athletic excellence. For example, the United States won the most medals in absolute terms (Exhibit 1), but isn't even among the top ten in terms of medals per capita (Exhibit 2).

Country           Gold   Silver   Bronze   Total
United States       35       39       29     103
Russia              27       27       38      92
China               32       17       14      63
Australia           17       16       16      49
Germany             14       16       18      48
Japan               16        9       12      37
France              11        9       13      33
Italy               10       11       11      32
South Korea          9       12        9      30
United Kingdom       9        9       12      30

Exhibit 1: Top Medal Counts, 2004 Athens Olympics

Country     Medals   Population   Medals/Million Citizens
Bahamas          2      297,477                      6.72
Australia       49   19,731,984                      2.48
Cuba            27   11,263,429                      2.40
Estonia          3    1,408,556                      2.13
Slovenia         4    1,935,677                      2.07
Jamaica          5    2,695,867                      1.85
Latvia           4    2,348,784                      1.70
Hungary         17   10,045,407                      1.69
Bulgaria        12    7,537,929                      1.59
Greece          16   10,665,989                      1.50

Exhibit 2: Top Medal Counts per Capita, 2004 Athens Olympics
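The per-capita figures in Exhibit 2 are simply medals divided by population in millions. As a minimal sketch, the arithmetic can be reproduced from the exhibit's own rows; the names `EXHIBIT_2` and `medals_per_million` are illustrative and not part of the assignment.

```python
# Rows copied from Exhibit 2: (country, medals, population).
EXHIBIT_2 = [
    ("Bahamas", 2, 297_477),
    ("Australia", 49, 19_731_984),
    ("Cuba", 27, 11_263_429),
    ("Estonia", 3, 1_408_556),
    ("Slovenia", 4, 1_935_677),
    ("Jamaica", 5, 2_695_867),
    ("Latvia", 4, 2_348_784),
    ("Hungary", 17, 10_045_407),
    ("Bulgaria", 12, 7_537_929),
    ("Greece", 16, 10_665_989),
]


def medals_per_million(medals, population):
    """Return medals won per one million citizens."""
    return medals / (population / 1_000_000)


# Print each country with its rate rounded to two decimals, as in the exhibit.
for country, medals, population in EXHIBIT_2:
    rate = medals_per_million(medals, population)
    print(f"{country:<10} {rate:.2f}")
```

The same function applied to every nation in the data file, followed by a descending sort, would give the full per-capita ranking.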

Some interesting countries, such as Russia and Jamaica, are excluded from the data set because information was not available for all variables.

1. Add a dummy variable to the data set, representing whether the nation was the host of the Olympics. (Hint: The 2004 Athens Olympic Games were hosted by Greece.) Perform a correlation analysis on all of the variables in the data set, and show the results here in descending order of importance. Explain your results.

2. Show scatter diagrams of the two most important predictors, showing their relationship to "Medals". Use labels to indicate interesting outliers, show a linear trend line, and write something intelligent about your graphs.

3. Create a multiple regression model, based on your analysis above. Try to find the model that has the highest adjusted R-square value for predicting the number of Olympic medals won by a country.

4. Using your model from above, perform a hypothesis test to see whether the true effect of "Area" is less than one medal for every 1,000,000 square kilometers at the 5% level of significance, all other factors taken into account. Find the p-value of your test and explain its meaning.

5. Using your model from above, and assuming the same test, alpha, and variance as in Part 4, what would be the risk of a Type II error if the true effect of "Area" were in fact known to be 0.5 medals per 1,000,000 square kilometers?

6. Using your model from Part 3 above, calculate the residual error in "Medals" for each country. Show the residual errors for the ten countries that most "overperformed" in the Athens Olympics (i.e. won more medals than your model predicted they would) and the ten countries that most "underperformed". Explain your results.

7. Discuss the residual errors in this model. Use charts as appropriate.

8. Do these data provide evidence of a "home field advantage" in the Olympics? In other words, can we conclude that Greece won medals above and beyond what would otherwise have been expected because it was the host country?
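The hypothesis test on "Area" and the Type II error question in the list above follow a standard one-sided recipe: test H0: beta >= 1 against H1: beta < 1 using the fitted coefficient and its standard error, then ask how often that test would fail to reject if the true coefficient were 0.5. The sketch below is one possible way to set this up, not the assignment's prescribed method. It assumes a normal approximation to the t distribution (reasonable with 131 nations, though an exact answer would use the t distribution with the model's residual degrees of freedom). The function names and the example inputs are hypothetical placeholders, not results from the data file.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # normal approximation to the t distribution


def lower_tail_test(estimate, std_error, null_value=1.0, alpha=0.05):
    """One-sided test of H0: beta >= null_value against H1: beta < null_value.

    Returns the test statistic, the p-value, and whether H0 is rejected.
    """
    z = (estimate - null_value) / std_error
    p_value = STD_NORMAL.cdf(z)  # probability of a statistic this low under H0
    return z, p_value, p_value < alpha


def type_ii_risk(std_error, true_value=0.5, null_value=1.0, alpha=0.05):
    """Probability of failing to reject H0 when beta equals true_value."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    # H0 is rejected whenever the estimate falls below this cutoff.
    cutoff = null_value - z_alpha * std_error
    # Power: chance the estimate lands below the cutoff under the true value.
    power = NormalDist(mu=true_value, sigma=std_error).cdf(cutoff)
    return 1 - power


# Placeholder inputs for illustration only; substitute the "Area"
# coefficient and standard error from your own regression output.
example_estimate, example_se = 0.40, 0.30
z, p_value, reject = lower_tail_test(example_estimate, example_se)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
print(f"Type II risk: {type_ii_risk(example_se):.3f}")
```

The p-value is the probability of seeing an estimate at least this far below 1 if the true effect were exactly 1. The Type II risk shrinks as the standard error shrinks, because the rejection cutoff moves closer to the null value of 1 and farther above the true value of 0.5.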
