Question 1.
The text file "xydata5" contains the variables x1, y1, x2 and y2. For this question, hand in your R commands, output and comments together.
(a) (i) Use appropriate R output to find a suitable regression model to predict y1 using x1 as an explanatory variable. Comment on the output used and justify your choice of model.
(ii) Produce the summary output and give the estimated equation for your chosen model. Use your equation to calculate the fitted value and the residual for an observation with values of x1 = 14 and y1 = 11.6.
(iii) What is the predicted value of y1 when x1 = 14?
(b) (i) Use appropriate R output to find a suitable regression model to predict y2 using x2 as an explanatory variable. Comment on the output used and justify your choice of model.
(ii) Produce the summary output and give the estimated equation for your chosen model. Use your equation to calculate the fitted value and the residual for an observation with values of x2 = 10 and y2 = 1.75.
(iii) What is the predicted value of y2 when x2 = 10?
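The workflow asked for in parts (a) and (b) can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data frame xy.df is simulated so that the code runs on its own and is not the contents of "xydata5", and a straight-line model is used purely as a starting point. The real data may call for a different model, such as a transformed response or a quadratic term.

```r
# Stand-in data, simulated only so this sketch runs. NOT the real "xydata5".
# With the real file one might use something like
#   xy.df <- read.table("xydata5", header = TRUE)   # header = TRUE is an assumption
set.seed(1)
xy.df <- data.frame(x1 = 1:20)
xy.df$y1 <- 2 + 0.7 * xy.df$x1 + rnorm(20, sd = 0.5)

# Exploratory plot, candidate model, and a residual plot to check the fit
plot(y1 ~ x1, data = xy.df)            # is the trend roughly linear?
fit1 <- lm(y1 ~ x1, data = xy.df)
plot(fitted(fit1), resid(fit1))        # look for curvature or unequal scatter
summary(fit1)

# Fitted value and residual for an observation with x1 = 14 and y1 = 11.6,
# calculated from the estimated equation
b <- coef(fit1)
fitted14 <- unname(b["(Intercept)"] + b["x1"] * 14)
resid14 <- 11.6 - fitted14             # residual = observed - fitted

# predict() should agree with the hand calculation
pred14 <- unname(predict(fit1, newdata = data.frame(x1 = 14)))
```

The same pattern applies to y2 and x2, with the model form chosen from what the plots show.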
• For each of these questions we want you to do the full analysis in R and write a Report. Your answers should be in three parts.
First: Technical Notes on the analysis. Refer to the Appendix of your Lecture Work Book and to the Case Studies for examples of Technical Notes. You do not need to quantify confidence intervals in these notes. Some of the information in the Technical Notes may be repeated in the Executive Summary.
Second: Executive Summary of the main findings of the analysis. Refer to the Appendix of your Lecture Work Book and to the Case Studies for examples of Executive Summaries.
Third: R output. Include all the R output used in answering the questions, together as an appendix at the end of your assignment. This output is for the markers to refer to if you make any mistakes in your analysis, so that they can consider giving partial credit. No marks are allocated for the R output; all the marks are for the Technical Notes and Executive Summary.
Please try to keep this section as small as possible - use the layout.20x() command to save space for multiple plots. Remember: When you cut and paste R output into a word processor, you should use a "fixed" font such as Courier.
• These questions require the same type of Technical Notes and Executive Summaries as the final exam.
Question 2.
Data were collected from an experiment that was conducted to assess the effects of height and other experimental conditions on the weight of wood from young poplar trees. All comparisons should be made to the control treatment. The resulting data is stored in the text file "poplar", which contains the following variables:
Weight the dry weight of wood (in kg)
Height the height of the poplar tree (in metres)
Treatmt experimental conditions that the tree was subjected to:
    1 = control
    2 = fertilizer
    3 = irrigation
    4 = fertilizer and irrigation
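Because Treatmt is stored as the numeric codes 1 to 4, R would treat it as a number unless it is converted to a factor. One possible way to set it up so that every treatment is compared with the control is sketched below. The data frame poplar.df is simulated stand-in data, not the real "poplar" file; the short level labels are hypothetical names chosen for illustration; and the interaction model is only a possible starting point, not a recommended final model.

```r
# Stand-in data, simulated only so this sketch runs. NOT the real "poplar" data.
# With the real file one might use something like
#   poplar.df <- read.table("poplar", header = TRUE)   # header = TRUE is an assumption
set.seed(2)
poplar.df <- data.frame(
  Height  = runif(40, min = 2, max = 8),
  Treatmt = rep(1:4, each = 10)
)
poplar.df$Weight <- 0.1 + 0.3 * poplar.df$Height + rnorm(40, sd = 0.2)

# Convert the numeric codes to a factor (the labels are hypothetical short names)
poplar.df$Treatmt <- factor(poplar.df$Treatmt, levels = 1:4,
                            labels = c("control", "fert", "irrig", "fert.irrig"))

# Make the control treatment the baseline explicitly
poplar.df$Treatmt <- relevel(poplar.df$Treatmt, ref = "control")

# Starting model: a separate line for each treatment
poplar.fit <- lm(Weight ~ Height * Treatmt, data = poplar.df)
anova(poplar.fit)      # is the interaction needed?
summary(poplar.fit)    # treatment coefficients are differences from control
```

With control as the baseline, each Treatmt coefficient in the summary is a comparison with the control treatment.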
Question 3.
Data on 102 male and 100 female athletes were collected at the Australian Institute of Sport. It was of interest to predict athletes' lean body mass (LBM) using physical attributes and the sport they played. It was of particular interest to compare lean body mass of swimmers to that of players of all other sports. The resulting data is stored in the text file "aussiesport", which contains the following variables:
Lean the athlete's lean body mass
Height the athlete's height (in cm)
Weight the athlete's weight (in kg)
Skin the sum of the athlete's skin folds
BMI the athlete's body mass index (weight in kg / (height in m)^2)
Sport the sport played:
    b.ball = basketball
    field = field events
    gym = gymnastics
    netball = netball
    row = rowing
    swim = swimming
    tennis = tennis
    track = running
    w.polo = water polo
Analyse the data, hand in your R output and write both Technical Notes and an Executive Summary for the analysis. Your Technical Notes should describe each step of the model building process.
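Since the comparison of interest is swimmers against every other sport, it may help to make swim the baseline level of Sport before fitting any model, because R otherwise uses the alphabetically first level (b.ball). A minimal sketch follows. The data frame sport.df is simulated stand-in data and not the real "aussiesport" file, and the model with Weight and Sport is only a placeholder for whichever model the model-building process arrives at.

```r
# Stand-in data, simulated only so this sketch runs. NOT the real "aussiesport" data.
# With the real file one might use something like
#   sport.df <- read.table("aussiesport", header = TRUE)   # header = TRUE is an assumption
set.seed(3)
sports <- c("b.ball", "field", "gym", "netball", "row",
            "swim", "tennis", "track", "w.polo")
sport.df <- data.frame(
  Sport  = rep(sports, each = 6),
  Weight = rnorm(54, mean = 75, sd = 10),
  stringsAsFactors = FALSE
)
sport.df$Lean <- 0.8 * sport.df$Weight + rnorm(54, sd = 3)

# Make swimming the baseline so every Sport coefficient is "sport minus swim"
sport.df$Sport <- relevel(factor(sport.df$Sport), ref = "swim")

# Placeholder model: the real analysis would also consider the other variables
sport.fit <- lm(Lean ~ Weight + Sport, data = sport.df)
summary(sport.fit)
```

Each Sport coefficient in the summary then estimates the difference in lean body mass between that sport and swimming, for athletes with the same values of the other explanatory variables.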