Question 1:
The U.S. government collects data on energy consumption per capita for each state and the District of Columbia. The provided data file shows the energy consumption per capita (in million Btu) for the year 2010.
a) Make a histogram of the data.
b) Obtain the boxplot of the data.
c) Find the five-number summary, the mean, and the standard deviation of the per capita energy consumption in the 50 states and the District of Columbia.
d) Use the output to write a brief report on the distribution of energy consumption per capita in the U.S. in the year 2010. Make sure you identify any states that are outliers in energy consumption. How was energy consumption in your home (or adopted) state in 2010?
Minitab Instructions for a), b), and c): The histogram, boxplot, and summary statistics can all be obtained at once by doing the following:
- Choose 'Stat → Basic Statistics → Display Descriptive Statistics'
- Select the Energy Consumption variable for the 'Variables' box
- Click 'Graphs' to open the dialog box
- Select 'Histogram of data' and 'Boxplot of data' and press Enter twice
The output will give you the summary statistics (in the session window) and the histogram and boxplot (in separate graph windows). There is no need to edit the graphs. Copy all of this into your MS Word file and write a clear description of the output. You may want to create a table such as the one below to report the statistics:
Variable             N   Min   Q1   Median   Q3   Max   Mean   St.Dev.
Energy Consumption
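For anyone working outside Minitab, the same summaries can be sketched in Python. This is a minimal sketch, not part of the required workflow: the function names `describe`, `flag_outliers`, and `plot_histogram_and_boxplot` are hypothetical, the plotting helper assumes matplotlib is installed, and quartile conventions differ between packages, so Q1 and Q3 may not match Minitab's output exactly.

```python
import statistics


def describe(values):
    """Return N, the five-number summary, the mean, and the sample standard deviation."""
    data = sorted(values)
    # method="exclusive" interpolates at position i*(n+1)/4 in the sorted data;
    # other packages may use a different quartile convention.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return {
        "N": len(data),
        "Min": data[0],
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Max": data[-1],
        "Mean": statistics.mean(data),
        "StDev": statistics.stdev(data),
    }


def flag_outliers(values):
    """Return the values beyond the 1.5*IQR fences used by a standard boxplot."""
    summary = describe(values)
    iqr = summary["Q3"] - summary["Q1"]
    low = summary["Q1"] - 1.5 * iqr
    high = summary["Q3"] + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def plot_histogram_and_boxplot(values, label):
    """Draw a histogram and a boxplot next to each other (assumes matplotlib)."""
    # Imported here so the summary functions above work without matplotlib.
    import matplotlib.pyplot as plt

    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
    ax_hist.hist(values)
    ax_hist.set_xlabel(label)
    ax_hist.set_ylabel("Frequency")
    ax_box.boxplot(values)
    ax_box.set_ylabel(label)
    plt.show()
```

Calling `describe` on the list of 51 values read from the data file returns the entries needed for the summary table, and `flag_outliers` lists the values that a boxplot would mark as outliers.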
Question 2:
In a study to contrast cholesterol levels in rural and urban settings, the cholesterol levels of 45 urban Guatemalan Indians and 49 rural Guatemalan Indians were measured. The data file is provided separately.
a) Describe the individuals and the variables in this study. Specify the quantitative and the categorical variables.
b) Make side-by-side histograms and side-by-side boxplots, and obtain the summary statistics for the two groups. Present the numerical summaries in a table (see instructions below).
c) Use the graphical and numerical summaries to write a brief report comparing the cholesterol levels of the two groups.
Minitab Instructions: Here is what you need to do:
- Choose 'Stat → Basic Statistics → Display Descriptive Statistics'
- Select the Cholesterol variable for the 'Variables' box
- Check the 'By variable' box and enter the Group variable
- Click 'Graphs' to open the dialog box
- Select 'Histogram of data' and 'Boxplot of data' and press Enter twice.
The output will consist of the summary statistics for the two groups (in the session window) and the side-by-side histograms and boxplots (in separate graph windows). To report the statistics, it is a good idea to create a table such as the one below in MS Word:
Group   N   Min   Q1   Median   Q3   Max   Mean   St.Dev.
Rural
Urban
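The 'By variable' step above splits the cholesterol values by group before summarizing them. A minimal Python sketch of the same idea follows; the function names `describe_by_group` and `plot_side_by_side_boxplots` are hypothetical, the plotting helper assumes matplotlib is installed, and the quartile convention used here may differ slightly from Minitab's.

```python
import statistics
from collections import defaultdict


def describe_by_group(values, groups):
    """Summarize values separately for each group label (e.g. 'Rural', 'Urban')."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    table = {}
    for group, data in sorted(by_group.items()):
        data.sort()
        # Quartiles by the (n+1)-position rule; packages differ on this convention.
        q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
        table[group] = {
            "N": len(data),
            "Min": data[0],
            "Q1": q1,
            "Median": median,
            "Q3": q3,
            "Max": data[-1],
            "Mean": statistics.mean(data),
            "StDev": statistics.stdev(data),
        }
    return table


def plot_side_by_side_boxplots(values, groups):
    """Draw one boxplot per group on a shared axis (assumes matplotlib)."""
    # Imported here so describe_by_group works without matplotlib.
    import matplotlib.pyplot as plt

    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    names = sorted(by_group)
    fig, ax = plt.subplots()
    ax.boxplot([by_group[name] for name in names])
    ax.set_xticklabels(names)
    ax.set_ylabel("Cholesterol level")
    plt.show()
```

Both functions take two parallel lists, one of cholesterol values and one of group labels, which mirrors a data file with a Cholesterol column and a Group column. Each row of the returned dictionary corresponds to one row of the table above.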
Question 3: (Paper and pencil. Show work.)
The WAIS test is an IQ test for the population of young adults (20-34 age group).
The WAIS test scores are normally distributed with a mean of 110 and a standard deviation of 25.
a) What proportion of young adults have a WAIS score above 140?
b) What proportion of young adults have a WAIS score between 90 and 120?
c) Compute the interquartile range (IQR) of the WAIS scores.
d) Find the 99th percentile of the distribution of WAIS scores.
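Answers worked out by hand from the standard normal table can be checked numerically afterwards. The sketch below uses `NormalDist` from Python's standard library with the mean and standard deviation given in the question; the variable names are illustrative only, and small differences from table-based answers are expected because tables round z to two decimals.

```python
from statistics import NormalDist

# Model stated in the question: mean 110, standard deviation 25.
wais = NormalDist(mu=110, sigma=25)

# a) Proportion above 140: one minus the cumulative proportion at 140.
p_above_140 = 1 - wais.cdf(140)

# b) Proportion between 90 and 120: difference of two cumulative proportions.
p_90_to_120 = wais.cdf(120) - wais.cdf(90)

# c) IQR: third quartile minus first quartile of the distribution.
iqr = wais.inv_cdf(0.75) - wais.inv_cdf(0.25)

# d) 99th percentile: the score with 99% of the distribution below it.
percentile_99 = wais.inv_cdf(0.99)

print(f"P(X > 140)       = {p_above_140:.4f}")
print(f"P(90 < X < 120)  = {p_90_to_120:.4f}")
print(f"IQR              = {iqr:.2f}")
print(f"99th percentile  = {percentile_99:.2f}")
```

The printed values are only a check; the written solution should still show the standardization z = (x - 110) / 25 and the table lookups.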
Question 4:
Question 5:
Refer to the study about grades and self-concept described in Problem 1.43, p. 27, in our textbook.
a) Make normal quantile plots for each of the GPA, IQ, and Self-concept variables.
b) Which distribution is closest to a normal distribution, if any?
Minitab Instructions: (a) To obtain the (equivalent of the) normal quantile plot for the variable GPA, choose 'Graph → Probability Plot', then:
- Choose the 'Simple' icon
- Select GPA for the 'Variables' box and press Tab
- Click on the 'Distribution' button
  - Select the 'Data Display' tab
  - Deselect 'Show confidence interval' and press Enter twice
  - Copy and paste the graph
- Repeat the steps above for the IQ and Self-Concept variables.
Recall that an approximately linear pattern in the plot supports the normality assumption.
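The idea behind the plot can be sketched in Python: sort the observations, pair each one with the standard normal score expected at its rank, and plot the pairs. This is a minimal sketch under stated assumptions: the function names are hypothetical, the plotting position (i - 0.5) / n is one common convention and may not be the one Minitab uses, and the plotting helper assumes matplotlib is installed.

```python
from statistics import NormalDist


def normal_quantile_points(values):
    """Pair each sorted observation with its standard normal score."""
    data = sorted(values)
    n = len(data)
    standard_normal = NormalDist()
    # Assumed plotting position (i - 0.5) / n for the i-th smallest value;
    # other conventions exist and give slightly different scores.
    z_scores = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z_scores, data))


def plot_normal_quantile(values, label):
    """Scatter the data against normal scores (assumes matplotlib)."""
    # Imported here so normal_quantile_points works without matplotlib.
    import matplotlib.pyplot as plt

    points = normal_quantile_points(values)
    fig, ax = plt.subplots()
    ax.scatter([z for z, _ in points], [x for _, x in points])
    ax.set_xlabel("Normal score (z)")
    ax.set_ylabel(label)
    plt.show()
```

Calling `plot_normal_quantile` once each for the GPA, IQ, and self-concept columns gives three plots to compare; the one whose points lie closest to a straight line is the closest to normal.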
Problem 1.43. Grades and self-concept. Table 1.3 presents data on 78 seventh-grade students in a rural midwestern school. The researcher was interested in the relationship between the students' "self-concept" and their academic performance. The data we give here include each student's grade point average (GPA), score on a standard IQ test, and gender, taken from school records. Gender is coded as F for female and M for male. The students are identified only by an observation number (OBS). The missing OBS numbers show that some students dropped out of the study. The final variable is each student's score on the Piers-Harris Children's Self-Concept Scale, a psychological test administered by the researcher.
(a) How many variables does this data set contain? Which are categorical variables and which are quantitative variables?
(b) Make a stemplot of the distribution of GPA, after rounding to the nearest tenth of a point.