Summary description of distance using graphical and


Worksheet for Activity: How far are you away from home?

This activity is designed to:

(1) Get familiar with statistical software

(2) Use the concepts and techniques discussed in Chapters 1 to 3 of the book

Background:

This University is interested in updating its freshman recruiting strategy. To do this, a survey will be conducted to find out how far existing on-campus CMU students are from home and the reason(s) they decided to come to this university. For illustration purposes, several sections (n=200) of on-campus STA-282QR students were surveyed.

Objectives:

The main objective is to discover why students are coming to CMU and see how far away they live from the main campus. In order to accomplish this, on-campus CMU students will be surveyed. Based on these results, CMU administration will create a recruiting strategy designed to attract the largest number of qualified students.

To meet the main objective, we will collect some accurate data, analyze the data, and provide a brief report with analysis and recommendations consisting of:
(i) Summary description of distance using graphical and numerical methods
(ii) Summary table and graphical displays of the top three reasons for attending CMU.
(iii) Summary table and graphical displays of top three reasons separated by gender to show if there is a noticeable difference between female and male students.
(iv) A summary of conclusions and recommendations for future recruiting strategy.

Tasks:

Prior to collecting the survey data, we would normally need to:
(a) Design the survey, including how to measure a student's distance from campus and how to ask the survey questions,
(b) Determine the target population,
(c) Determine an adequate sample size,
(d) Identify how to obtain the sample of students to conduct the survey.

After the data is collected, we need to:
(a) Extract the data from their storage location and load them into appropriate software,
(b) Clean the data: Are there any erroneous data values? Are there any unusual data values? How should we deal with these unusual data values?
(c) Perform the data analysis to address the stated objectives,
(d) Prepare a summary report.

Note that some of these tasks cannot be performed at this early stage of the class. Those tasks have already been done for you.

For this project, how do we determine that the sample of students is a representative sample of the target population? First, to illustrate, we know from the registrar that on CMU's main campus 43% of students are male and 57% are female. We would like our sample to also be approximately 43% male and 57% female. This would certainly be important if a student's gender was a factor in deciding why to come to CMU. Perhaps males and females come here for different reasons.

Questions based on the contents of Chapters 1, 2, and 3

1. First, let's see how close our n=200 sample came to the true population (N=20,000). Recall that our population has about 57% females and 43% males. First, choose "Graph". Next, select "Bar Plot with Data". Double click on the "Gender" column. Finally, under type select "Relative Frequency". Leave everything else alone and click on "Compute!" at the bottom. Use the mouse and hold the cursor over the blue female and blue male bars. Report the relative frequencies below:

Female Relative Frequency

Male Relative Frequency
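For reference, a relative frequency is simply a category's count divided by the total number of observations. The following is a minimal Python sketch of that arithmetic, not the StatCrunch procedure itself. The ten-value sample and the function name `relative_frequencies` are made up for illustration and are not the survey data.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {category: proportion} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

# Made-up sample of 10 responses, for illustration only (NOT the survey data).
gender = ["Female"] * 6 + ["Male"] * 4

rel_freq = relative_frequencies(gender)
for category, proportion in rel_freq.items():
    print(f"{category}: {proportion:.2f}")
```

The relative frequencies across all categories should add up to 1, which is a quick check on any bar plot of this type.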

2. For this project, given the background information, do you believe this sample accurately reflects the target population? Explain why or why not.

3. Assume there is a student from Canada who gave their estimate as 800 kilometers. How might this affect the analysis of Miles?
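As a side note on units, the sketch below shows how a distance reported in kilometers could be converted to miles before analysis. It uses the standard definition of the international mile (1 mile = 1.609344 km); the function name `km_to_miles` is a hypothetical helper, not part of any statistical package.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile

def km_to_miles(km):
    """Convert a distance in kilometers to miles."""
    return km / KM_PER_MILE

reported_km = 800
miles = km_to_miles(reported_km)
print(f"{reported_km} km is about {miles:.1f} miles")
```

If the value 800 were entered into the Miles column without conversion, it would overstate that student's distance by roughly 300 miles.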

4. In looking at the n=200 data set, answer the following questions:

(a) Is the data collected here from an observational study or a designed experiment?

(b) List two Qualitative or Categorical Variables in this data set:

(c) Is the variable 'Right-distance' a nominal scale or ordinal scale? _____________ and Why?

(d) Is the variable 'Miles' continuous or discrete? ______________________ and Why?

(e) Is the variable 'Miles' Interval scale or Ratio scale? ________________ and Why?

(f) If you compute the average distance using these data values from n=200, is this average distance a

(i) 'parameter' or a (ii) 'statistic'? ANS: ________________ and Why?

5. We will now use descriptive information and graphical displays to determine the top three reasons for coming to CMU (using StatCrunch). [Go to Stat, Tables, Frequency. Select columns (use the SHIFT or Ctrl key to choose the block of variables "Right Distance" thru "Alumni"). Select Statistics: choose the first 3 statistics (use the Ctrl key), then Compute.] You will see the output. Look for the top three reasons to report in (a).

(a) Use the table to summarize the frequency of 'reasons' for choosing the university. What are the top three reasons:

(b) For the top reason you found in (a), fill in the following table with the frequency and % of male and female students who chose this top reason:

Draw a Bar Graph (n=200 sample) for the Reason 'Right_Distance' for Female and Male, separately. Copy and paste both graphs here.

What % of males chose 'Right_Distance' as their reason:

Based on these results, should CMU consider an additional expense by creating separate recruiting brochures for males and females?

In other words, does gender have any influence on selecting CMU because of 'Right Distance' (it is not too far from and not too close to home)?

(c) Draw a Pie graph for the 'Grade' variable to see the % of students in each grade level.

6. Construct a histogram for each of the variables 'Miles' and 'Miles_1'. Copy and paste them here.

Choose Device Independent Bitmap, OK to paste the image.

(a) Describe the shape of the 'Miles' distribution (left skewed, symmetric, right skewed): Right skewed

(b) Describe the shape of the 'Miles_1' distribution (left skewed, symmetric, right skewed): Right skewed

7.

(a) Compare the Mean (Average) Distances between 'Miles' and 'Miles_1' below. Which one has the larger average? Why?

(b) Compare the Median Distances between 'Miles' and 'Miles_1' below. Why are the results different from what you found in part (a)?

(c) If you had only the 'Miles' variable and were asked to describe the typical CMU student, what statistic (mean or median) would you use to describe this student? Explain why.
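To see in general how the mean and median react to an extreme value, here is a small Python sketch using the standard `statistics` module. The distances are made up for illustration and are not taken from the 'Miles' or 'Miles_1' columns.

```python
from statistics import mean, median

# Made-up distances in miles, for illustration only (NOT the survey data).
distances = [20, 35, 50, 60, 75, 90, 120]

# The same values plus one extreme distance.
with_extreme = distances + [2000]

print(f"Without extreme value: mean = {mean(distances):.1f}, median = {median(distances)}")
print(f"With extreme value:    mean = {mean(with_extreme):.1f}, median = {median(with_extreme)}")
```

In this made-up example the single large value moves the mean from about 64 to about 306 miles, while the median only moves from 60 to 67.5 miles.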

8. Construct a histogram of the variable Miles_1 for male and female students, separately, on the same graph in different panels and paste them here.

(a) Describe the shapes of the Miles_1 distributions for Female and Male students (left skewed, symmetric, or right skewed):

(b) Looking at the histograms, which distribution (Male or Female) of Miles_1 shows a larger variation:

(c) Now use Stat > Summary Stats > Columns, grouped by gender, and select all of the statistics. Compare the standard deviations of Miles_1 for female and male students to your answer in part (b).
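Outside of StatCrunch, a grouped standard deviation can be computed by splitting the values by group and applying the sample standard deviation to each group. The sketch below assumes two parallel lists (group labels and values); the data and the function name `stdev_by_group` are made up for illustration.

```python
from collections import defaultdict
from statistics import stdev

def stdev_by_group(groups, values):
    """Return {group: sample standard deviation} for parallel lists."""
    grouped = defaultdict(list)
    for group, value in zip(groups, values):
        grouped[group].append(value)
    return {group: stdev(vals) for group, vals in grouped.items()}

# Made-up data, for illustration only (NOT the survey data).
gender = ["Female", "Female", "Female", "Male", "Male", "Male"]
miles = [10, 20, 30, 10, 50, 90]

sd_by_gender = stdev_by_group(gender, miles)
for group, sd in sd_by_gender.items():
    print(f"{group}: s = {sd:.1f}")
```

A larger standard deviation corresponds to a histogram that is more spread out around its center.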

9. Obtain the following summary statistics for both Miles and Miles_1.

(a) Compute an estimate of standard deviation, s, for Miles_1 using Range/6 =

(b) How close is this estimated s to the actual standard deviation of Miles_1:
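The Range/6 estimate comes from the idea that, for roughly bell-shaped data, almost all values fall within 3 standard deviations on either side of the mean. The sketch below compares the estimate with the sample standard deviation on simulated data. The simulated values, the hypothetical mean of 100 and standard deviation of 30, and the function name `range_rule_estimate` are all assumptions for illustration and are not the survey data.

```python
import random
from statistics import stdev

def range_rule_estimate(values, divisor=6):
    """Estimate the standard deviation as range / divisor."""
    return (max(values) - min(values)) / divisor

# Simulated, roughly bell-shaped data (NOT the survey data): 200 draws
# from a normal distribution with a hypothetical mean of 100 and SD of 30.
rng = random.Random(42)
simulated = [rng.gauss(100, 30) for _ in range(200)]

estimate = range_rule_estimate(simulated)
actual = stdev(simulated)
print(f"Range/6 estimate:          {estimate:.1f}")
print(f"Sample standard deviation: {actual:.1f}")
```

The estimate is only a rough check, and it may be less accurate for strongly skewed data or data with extreme values.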

10. Suppose a student has the Distance of 400 miles.

a. Use the Empirical rule (mean + or - 3 Std Dev), based on the summary statistics of the 'Miles_1' variable, to decide whether this is an unusual Distance or not.

This is not unusual because 400 miles is within 3 Std Dev of the mean.

b. Suppose a student has the Distance of 300 miles. Using the Miles_1 summary statistics, compute the corresponding Z-score (number of standard deviations above the mean) and use the Empirical rule to decide whether this is an unusually far distance away from home or not.
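The Z-score is computed as z = (x - mean) / (standard deviation), and under the Empirical rule a value more than 3 standard deviations from the mean is commonly treated as unusual. The sketch below shows the arithmetic. The mean of 50 and standard deviation of 20 are placeholder values chosen for illustration, not the Miles_1 summary statistics, and the function names are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / std_dev

def is_unusual(x, mean, std_dev, threshold=3):
    """Flag x as unusual if it is more than `threshold` SDs from the mean."""
    return abs(z_score(x, mean, std_dev)) > threshold

# Placeholder summary statistics (NOT the Miles_1 values).
example_mean = 50.0
example_sd = 20.0

for x in (95, 120):
    z = z_score(x, example_mean, example_sd)
    print(f"x = {x}: z = {z:.2f}, unusual = {is_unusual(x, example_mean, example_sd)}")
```

To answer the question, substitute the mean and standard deviation of Miles_1 from your own summary statistics output.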

11. Obtain the following summaries for Miles_1 for Males and Females separately.

12. Construct the box plot for the variable Miles_1. Copy and paste it here.

a) Based on the box plot, what is the shape of the distribution of the Miles_1?

b) Are there any outliers based on this boxplot?
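Box plots commonly flag a value as an outlier when it lies more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sketch below applies that rule with Python's `statistics.quantiles`. It is an assumption that your software uses the same fence rule and the same quartile method, since quartile conventions differ between packages. The data values are made up for illustration.

```python
from statistics import quantiles

def iqr_fences(values):
    """Return (lower_fence, upper_fence) using the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Made-up distances in miles, for illustration only (NOT the survey data).
distances = [10, 20, 30, 40, 50, 60, 70, 80, 300]

lower, upper = iqr_fences(distances)
outliers = [d for d in distances if d < lower or d > upper]
print(f"Fences: ({lower}, {upper})")
print(f"Outliers: {outliers}")
```

Points outside the fences are the ones a box plot typically draws as individual dots beyond the whiskers.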

13. Construct a box plot of the variable Miles_1 for male and female students, separately, on the same graph in different panels (plot groups for each column) and paste them here. Based on the box plots and the dot plots, describe the shape of the distributions of Miles_1 for Female and Male, respectively.
