1. An Australian zoologist is interested in the physiological characteristics of three species of kangaroos. He randomly selects three sanctuaries from all sanctuaries in Australia. He then visits each of these three sanctuaries and obtains a random sample of kangaroos. From his three random samples, he obtains measurements on a total of 148 kangaroos. There were 10 variables collected in total: the species and sex of each kangaroo, and eight physical measurements. All quantitative variables are in units of 10^-2 cm. He was unable to record values for some measurements because the kangaroos were not cooperative at the time. The missing values are coded as "NA" in the dataset.
- species kangaroo species with levels fuliginosus, giganteus, melanops
- sex kangaroo sex with levels Female, Male
- palate.length length of oral palate
- palate.width width of oral palate
- nasal.length length of nasal cavity
- nasal.width width of nasal cavity
- mandible.length length of mandible
- mandible.width width of mandible
- mandible.depth depth of mandible
- ramus.height height of ramus (branch of jaw bone)
Use the data file kangaroo.txt to answer the following questions. Please note: all explanations must be given in the context of the question.
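A minimal sketch of reading such a file while respecting the "NA" coding. It assumes kangaroo.txt is whitespace-delimited with a header row and one kangaroo per line; the assignment does not state the file layout, so treat this as an assumption. The function name `parse_table` and the two-row sample are hypothetical and only illustrate the handling of missing values.

```python
import io

def parse_table(handle):
    """Parse whitespace-delimited text with a header row into a list of dicts.

    Fields coded "NA" become None; numeric fields become float.
    Assumes (not confirmed by the assignment) one kangaroo per line.
    """
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = {}
        for name, value in zip(header, fields):
            if value == "NA":
                row[name] = None  # missing measurement
            else:
                try:
                    row[name] = float(value)
                except ValueError:
                    row[name] = value  # categorical, e.g. species or sex
        rows.append(row)
    return rows

# Hypothetical two-row sample, NOT taken from kangaroo.txt.
sample = io.StringIO(
    "species sex palate.length nasal.width\n"
    "giganteus Male 1000 NA\n"
    "melanops Female 900 200\n"
)
rows = parse_table(sample)

# Keep only rows where the variable of interest was recorded.
complete = [r for r in rows if r["nasal.width"] is not None]
```

If the real file matches the assumed layout, `parse_table(open("kangaroo.txt"))` would return one dictionary per kangaroo. Missing values should be dropped per variable, as above, rather than treated as zero.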
(a) Identify the population of interest to the zoologist.
(b) Identify the sampling method used in selecting the 148 kangaroos.
(c) Construct a histogram for the variable palate length. Describe the shape of the distribution. Also report values of one measure of center and one measure of spread that are appropriate for summarizing the distribution.
(d) Provide an appropriate graphical display for comparing the distributions of nasal length between all three kangaroo species. Discuss any trends. Use and report values of summary statistics to make comparisons between species. Are there any outliers present? Which species has the highest IQR for nasal length?
(e) The zoologist wants to examine the relationship between ramus height and each of the following variables:
(i) nasal width
(ii) mandible width
(iii) mandible depth
Comment on the pattern and strength of relationship for each pair. Which pair has the weakest linear relationship? Provide appropriate statistics and plots to support your answer.
(f) The zoologist wants to predict ramus height from mandible depth for female kangaroos only.
(i) Fit a least squares regression line that predicts ramus height from mandible depth, for female kangaroos only. Write down the equation of the regression line.
(ii) Use the fitted regression line in (i) to predict the ramus height for a female kangaroo with a mandible depth of (I) 120 and (II) 200. Do you have any concerns about these predictions? Briefly justify your concerns (if any).
(iii) If given a value of 600 for the ramus height of a female kangaroo, can we use the fitted line from (i) to predict its mandible depth? If yes, make the prediction. If no, explain why not.
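The mechanics behind part (f) can be sketched as follows: the least squares slope is Sxy / Sxx and the intercept is mean(y) - slope * mean(x). The function names `fit_least_squares` and `predict` are hypothetical, and the four data points are made up for illustration; they are not values from kangaroo.txt. The sketch also flags a prediction as an extrapolation when the new x value falls outside the observed range.

```python
def fit_least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x_new, x_observed):
    """Return (prediction, is_extrapolation) for a new x value."""
    outside = x_new < min(x_observed) or x_new > max(x_observed)
    return intercept + slope * x_new, outside

# Hypothetical values, NOT from kangaroo.txt.
depth = [100.0, 120.0, 140.0, 160.0]
height = [500.0, 560.0, 620.0, 680.0]  # constructed so height = 200 + 3*depth

a, b = fit_least_squares(depth, height)
pred_in, flag_in = predict(a, b, 120.0, depth)    # inside the observed range
pred_out, flag_out = predict(a, b, 200.0, depth)  # outside the observed range
```

With the real data, the same check would be made against the range of mandible depth observed among the female kangaroos only.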
2. Ash would like to upgrade his old fishing rod. There is a new brand of fishing rod called Good rod that he can buy. To test whether Good rod is better than his current rod (i.e., can help him catch more fish), Ash borrowed a demo of the Good rod and tested it out at Lake of Rage (see Figure 1 for the map of the lake). In the map, the darker part in the center of the lake represents deeper water. The deeper the water, the fewer but generally larger the fish.
Ash randomly picked ten locations from all possible locations in the lake. At each location, he first randomly picked a rod (the old or the Good one) and used it to fish for 30 minutes. He then fished with the other rod for another 30 minutes.
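The random choice of rod order at each location can be sketched as below. The function name `assign_rod_order` and the seed value are hypothetical; the text only says the first rod was picked at random, not how.

```python
import random

def assign_rod_order(n_locations, seed=None):
    """Randomly choose which rod is used first at each location."""
    rng = random.Random(seed)
    rods = ["old rod", "Good rod"]
    orders = []
    for _ in range(n_locations):
        order = rods[:]      # copy so each location is shuffled independently
        rng.shuffle(order)   # first element is the rod fished with first
        orders.append(tuple(order))
    return orders

# Ten locations, as in the experiment; the seed is arbitrary.
orders = assign_rod_order(10, seed=1)
```

Each tuple gives the rod used in the first 30 minutes followed by the rod used in the second 30 minutes, so both rods are used at every location.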
(a) Identify the sampling method used in selecting the 10 locations.
(b) What is/are the factor(s) for this experiment? State the factor level(s) of each factor.
(c) List all the treatments in this experiment.
(d) What is the response variable?
(e) Explain how randomization is used in this experiment in no more than 30 words.
(f) Are replicates used in this experiment? Justify your answer in no more than 30 words.
Figure 1: Map of Lake of Rage. The stars represent the locations randomly chosen.
(g) This experiment uses a matched pairs design. Explain the benefit of using such a design in the context of this question. Justify your answer in no more than 30 words.
(h) Propose a way to improve the experimental design. Explain in no more than 30 words.
3. In the study in Question 2, Ash created a tally of fish he caught by fish type (Magikarp and Gyarados), rod type (old rod and Good rod), and region of the lake (shallow and deep). The results are as follows:
| Fish type | Old rod: Shallow | Old rod: Deep | Good rod: Shallow | Good rod: Deep |
|---|---|---|---|---|
| Magikarp | 25 | 6 | 31 | 9 |
| Gyarados | 1 | 2 | 2 | 3 |
| Total | 26 | 8 | 33 | 12 |
(a) Construct a contingency table for fish type and region of the lake. Combine data from the two rods. Provide just counts.
(b) Find the conditional distribution of fish type for the deep regions.
(c) Find the marginal distribution of fish type.
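The arithmetic behind these three parts can be sketched as follows, using the counts from the tally above. The helper names `conditional_given_region` and `marginal_fish` are hypothetical. A conditional distribution divides each count by its column total for one region; the marginal distribution divides each row total by the grand total.

```python
# Counts copied from the tally: counts[rod][region][fish]
counts = {
    "Old rod":  {"Shallow": {"Magikarp": 25, "Gyarados": 1},
                 "Deep":    {"Magikarp": 6,  "Gyarados": 2}},
    "Good rod": {"Shallow": {"Magikarp": 31, "Gyarados": 2},
                 "Deep":    {"Magikarp": 9,  "Gyarados": 3}},
}

fish_types = ["Magikarp", "Gyarados"]
regions = ["Shallow", "Deep"]

# Combine the two rods: table[fish][region]
table = {
    fish: {region: sum(counts[rod][region][fish] for rod in counts)
           for region in regions}
    for fish in fish_types
}

def conditional_given_region(table, region):
    """Distribution of fish type within one region (column proportions)."""
    total = sum(table[fish][region] for fish in table)
    return {fish: table[fish][region] / total for fish in table}

def marginal_fish(table):
    """Distribution of fish type over all regions (row totals / grand total)."""
    row_totals = {fish: sum(table[fish].values()) for fish in table}
    grand = sum(row_totals.values())
    return {fish: row_totals[fish] / grand for fish in row_totals}

deep = conditional_given_region(table, "Deep")
marginal = marginal_fish(table)
```

Both distributions should sum to 1, which is a quick check on the arithmetic.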