Correlation and Regression
Develop a multiple regression model to predict zinc (Zn) concentration from a set of plausible predictor variables in the Smith's Lake - Charles Veryard Reserves (2017) dataset.
Your development of this model must identify and appropriately deal with:
any variables which are not normally distributed
any predictor variables which are collinear
any predictors for which the null hypothesis cannot be rejected
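The three checks above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the column names (Zn, Fe, Al, Ca, pH) are hypothetical stand-ins for whatever the real dataset contains, and the 0.05 cut-off and the "drop the predictor with the inflated VIF" choice are common conventions rather than requirements of this assignment. VIF is computed by hand here so that no extra package is needed.

```r
# Simulated stand-in for the real data; all names and values are hypothetical
set.seed(42)
n  <- 80
Fe <- rlnorm(n, meanlog = 8, sdlog = 0.5)
Al <- 1.5 * Fe * rlnorm(n, 0, 0.05)        # built to be collinear with Fe
Ca <- rlnorm(n, meanlog = 7, sdlog = 0.6)  # built to be unrelated to Zn
pH <- rnorm(n, mean = 6.5, sd = 0.6)
Zn <- exp(0.8 * log(Fe) - 0.3 * pH + rnorm(n, 0, 0.3))
soil <- data.frame(Zn, Fe, Al, Ca, pH)

# 1. Normality: Shapiro-Wilk p-values for raw and log10-transformed variables
sw <- sapply(soil, function(x) c(raw   = shapiro.test(x)$p.value,
                                 log10 = shapiro.test(log10(x))$p.value))
print(round(sw, 4))

# Log-transform the skewed concentration variables (pH is already a log scale)
soil_t <- transform(soil, Zn.log = log10(Zn), Fe.log = log10(Fe),
                    Al.log = log10(Al), Ca.log = log10(Ca))

# 2. Collinearity: variance inflation factor, VIF = 1 / (1 - R^2), where R^2
#    comes from regressing each predictor on all the other predictors
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}
vifs <- vif_manual(soil_t, c("Fe.log", "Al.log", "Ca.log", "pH"))
print(round(vifs, 1))

# Drop one of the collinear pair (Al.log here) and re-check
preds <- c("Fe.log", "Ca.log", "pH")
vifs_reduced <- vif_manual(soil_t, preds)
print(round(vifs_reduced, 2))

# 3. Fit, then keep only predictors whose coefficient has p < 0.05
fit1  <- lm(reformulate(preds, response = "Zn.log"), data = soil_t)
coefs <- summary(fit1)$coefficients
keep  <- setdiff(rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05], "(Intercept)")
fit2  <- lm(reformulate(keep, response = "Zn.log"), data = soil_t)
print(summary(fit2))
```

With real data, which member of a collinear pair to drop is a judgment call that should be argued from the chemistry, not decided by the code.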
You should analyse the residuals of your model to identify and comment on systematic trends, non-normality, and unusual data points.
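A minimal sketch of such a residual analysis in base R is given below. The data are simulated, the variable names are hypothetical, and one unusual observation (row 10) is planted deliberately so that the checks have something to find. The threshold of 3 for standardised residuals is a common rule of thumb, not a fixed standard.

```r
# Simulated, already-transformed data; names and values are hypothetical
set.seed(1)
n      <- 60
Fe.log <- rnorm(n, mean = 3.5, sd = 0.2)
pH     <- rnorm(n, mean = 6.5, sd = 0.5)
Zn.log <- 0.5 + 0.8 * Fe.log - 0.1 * pH + rnorm(n, 0, 0.05)
Zn.log[10] <- Zn.log[10] + 0.6             # planted unusual data point
soil_t <- data.frame(Zn.log, Fe.log, pH)

fit <- lm(Zn.log ~ Fe.log + pH, data = soil_t)

# Systematic trends: residuals vs fitted, Q-Q, scale-location and leverage
# plots (only drawn in an interactive session)
if (interactive()) {
  op <- par(mfrow = c(2, 2))
  plot(fit)
  par(op)
}

# Non-normality: Shapiro-Wilk test on the residuals
print(shapiro.test(residuals(fit)))

# Unusual data points: large standardised residuals and Cook's distance
std_res <- rstandard(fit)
flagged <- which(abs(std_res) > 3)
print(flagged)
print(round(sort(cooks.distance(fit), decreasing = TRUE)[1:3], 3))
```

Any point flagged this way should be checked against the raw data before deciding whether it is an error or a genuine extreme value.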
Finally, your report will identify the relationships implied by your model and briefly comment on their environmental significance - do they make sense in the 'real world'?
Extent: One A4 page maximum (one-sided); minimum margins 1 cm, minimum font size 10 pt, minimum line spacing single.
Objective: To become familiar and proficient with correlation and regression procedures in R and R Commander, and with the interpretation and graphical representation of correlation and regression results.
What you can try:
Prepare a correlation matrix of all the variables in the Smith's-Veryard dataset, or of a subset (there are a lot!), properly transformed and with the pairwise p-values option selected.
Look at a scatterplot matrix of selected variables in the Smith's-Veryard dataset (properly transformed!) and identify an interesting relationship. Follow this up with ungrouped and grouped simple linear regression models and relevant scatter plots.
Prepare a regression model which predicts a minor or trace element concentration (e.g. P, Cu, Mn, Pb, ...) from a selection of major elements or bulk soil properties (e.g. Al, Ca, Fe, pH, ...) and interpret / graph the output.
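The practice tasks above could be approached in base R along the following lines. This is a sketch only: the data are simulated, and the variable names and the grouping factor `Group` are hypothetical placeholders for columns in the real dataset. Pairwise p-values are obtained here by looping over `cor.test()`, which is one way to reproduce what the R Commander option reports; no multiple-comparison adjustment is applied.

```r
# Simulated, already-transformed data; names and values are hypothetical
set.seed(7)
n      <- 90
Group  <- factor(rep(c("A", "B"), each = n / 2))
Fe.log <- rnorm(n, mean = 3.5, sd = 0.2)
Al.log <- 0.9 * Fe.log + rnorm(n, 0, 0.05)
pH     <- rnorm(n, mean = 6.5, sd = 0.5)
Cu.log <- 0.2 + 0.4 * Fe.log + 0.15 * (Group == "B") + rnorm(n, 0, 0.05)
dat <- data.frame(Group, Fe.log, Al.log, pH, Cu.log)

# Correlation matrix (Pearson) of the numeric columns
num   <- dat[sapply(dat, is.numeric)]
r_mat <- cor(num)
print(round(r_mat, 2))

# Matching matrix of unadjusted pairwise p-values
vars  <- names(num)
p_mat <- matrix(NA_real_, length(vars), length(vars),
                dimnames = list(vars, vars))
for (i in seq_along(vars)) {
  for (j in seq_along(vars)) {
    if (i != j) p_mat[i, j] <- cor.test(num[[i]], num[[j]])$p.value
  }
}
print(signif(p_mat, 2))

# Scatterplot matrix, points coloured by group (interactive sessions only)
if (interactive()) pairs(num, col = as.integer(dat$Group))

# Ungrouped vs grouped simple linear regression for one relationship
fit_simple  <- lm(Cu.log ~ Fe.log, data = dat)
fit_grouped <- lm(Cu.log ~ Fe.log * Group, data = dat)  # separate slope and intercept per group
cmp <- anova(fit_simple, fit_grouped)  # does grouping improve the fit?
print(cmp)
```

A small p-value in the `anova()` comparison suggests the relationship differs between groups, which is then worth showing in a scatter plot with one regression line per group.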
Supplementary material
Reimann, C., Filzmoser, P., Garrett, R.G., Dutter, R., 2008. Statistical Data Analysis Explained: Applied Environmental Statistics with R, First Ed. John Wiley & Sons, Chichester, England, 343 pp. - Chapters 11 and 16