Research Methods - Data Treatment for Biologists
Part A: Workshop
The aim of this workshop is to give you experience applying standard statistical treatments to data using statistical software. The package we will use is Minitab. It is the standard statistical software for the Mathematics and Statistics courses at RMIT, and Chemistry also has a site licence for it. Most of the analyses in this workshop can be done in Excel, but Excel is not very user-friendly for statistical analyses. In Minitab the analyses can be run from simple pull-down menus, and there is a good online help facility. You can copy and paste data from Excel into Minitab. For the assessment, enter your results into the attached pro forma.
Before attempting this assignment you should try the examples in modules 1-4.
Case Study 1
| 18% protein Diet | 5% protein Diet |
| 13.3 | 5.1 |
| 16.3 | 8.7 |
| 9.9 | 8.7 |
| 9.3 | 8.5 |
| 16.1 | 8.1 |
| 9.7 | 6.9 |
| 9.7 | 6.9 |
| 14.1 | 12.3 |
It is believed that nutritional deprivation affects various components of the immune system, such as tuberculin skin reactivity. In this study a sample of 8 male rats was fed a normal diet of 18% protein. Another sample of rats was fed a diet of only 5% protein. After 4 weeks, the rats were given an intradermal injection of 25 µg of purified protein derivative of tuberculin. The above table gives the skin reactivity, measured as the diameter of erythema and induration (in mm), for the 2 groups.
(a) Determine the mean, variance, standard deviation and 95% confidence interval for each data set.
(b) Verify the assumptions that the two populations (i) are normally distributed and (ii) have equal variance.
(c) Display the data using a box-and-whisker plot.
(d) Use a t-test to determine whether there is a significant difference between the tuberculin reactivity of normal and malnourished rats.
(e) In Excel, create a bar graph comparing the means of the two groups. Include 'error bars' showing the confidence intervals (there is a sample spreadsheet showing how to construct 'error bars' in Course Documents, Data Treatment folder).
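As a cross-check on the Minitab output for parts (a) and (d), the calculations can be sketched in Python using only the standard library. This is a minimal sketch, not the required method: it assumes the table is read row by row (8 rats per diet), it hard-codes the two-sided 95% critical values of Student's t from standard tables, and the function names `summarise` and `pooled_t_statistic` are illustrative only. The pooled t-test relies on the equal-variance assumption from part (b).

```python
import math
import statistics

# Skin reactivity diameters (mm) copied from the Case Study 1 table.
diet_18 = [13.3, 16.3, 9.9, 9.3, 16.1, 9.7, 9.7, 14.1]
diet_5 = [5.1, 8.7, 8.7, 8.5, 8.1, 6.9, 6.9, 12.3]

# Two-sided 95% critical values of Student's t from standard tables
# (assumption: hard-coded, because the standard library has no t distribution).
T_CRIT_DF7 = 2.365   # df = 8 - 1, for each group's confidence interval
T_CRIT_DF14 = 2.145  # df = 8 + 8 - 2, for the pooled two-sample t-test


def summarise(sample, t_crit):
    """Return mean, sample variance, standard deviation and 95% CI of the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    var = statistics.variance(sample)  # uses the n - 1 denominator
    sd = math.sqrt(var)
    half_width = t_crit * sd / math.sqrt(n)
    return {"mean": mean, "variance": var, "sd": sd,
            "ci95": (mean - half_width, mean + half_width)}


def pooled_t_statistic(a, b):
    """Two-sample t statistic, assuming equal population variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


summary_18 = summarise(diet_18, T_CRIT_DF7)
summary_5 = summarise(diet_5, T_CRIT_DF7)
t_stat = pooled_t_statistic(diet_18, diet_5)

print("18% protein:", summary_18)
print("5% protein: ", summary_5)
print("t =", round(t_stat, 3), "| exceeds 5% critical value:", abs(t_stat) > T_CRIT_DF14)
```

The sketch does not cover the normality check in part (b) or the plots in parts (c) and (e); those are easier to do from the Minitab and Excel menus.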
Case Study 2
|  | Germinated | Did not germinate | Total |
| Old Strain | 125 | 15 | 140 |
| New Strain | 152 | 8 | 160 |
| Total | 277 | 23 | 300 |
The above table compares the germination rate of a new strain of a plant against an old strain of the same plant.
Test whether there is a significant difference between the rates of germination of the two strains (at the 95% confidence level).
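One common way to compare two proportions in a 2x2 table like this is Pearson's chi-square test. The sketch below is illustrative only: it assumes that test is the intended one, it applies no continuity correction, and the function name `chi_square_2x2` is hypothetical. The 5% critical value for 1 degree of freedom is taken from standard tables.

```python
import math

# Observed counts from the Case Study 2 table:
# rows = old strain, new strain; columns = germinated, did not germinate.
observed = [[125, 15],
            [152, 8]]

CHI2_CRIT_DF1 = 3.841  # 5% critical value of chi-square with 1 df (standard tables)


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    # With 1 degree of freedom, chi-square is the square of a standard normal,
    # so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_2x2(observed)
print("chi-square =", round(chi2, 3), "| p =", round(p_value, 4))
print("exceeds 5% critical value:", chi2 > CHI2_CRIT_DF1)
```

If your Minitab output differs slightly, check whether a continuity correction has been applied there.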
Case Study 3
|  | Fertilizer Blend |  |  |  |  |  |
| Farm | U | V | W | X | Y | Z |
| 1 | 1130 | 1125 | 1350 | 1375 | 1225 | 1235 |
| 2 | 1115 | 1120 | 1375 | 1200 | 1250 | 1200 |
| 3 | 1145 | 1170 | 1235 | 1175 | 1225 | 1155 |
| 4 | 1200 | 1230 | 1140 | 1325 | 1275 | 1215 |
A trial of 6 different blends of fertilizer (U-Z) has been carried out on a linseed crop on 4 different farms (1-4). The crop yields of linseed are given in the table. Carry out a 2-way ANOVA.
(a) Is there a significant difference between farms?
(b) Is there a significant difference between fertilizers?
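Because there is one yield per farm and blend combination, the analysis is a two-way ANOVA without replication, so no interaction term can be estimated. The sums of squares can be sketched as follows. This is an illustrative sketch rather than a substitute for the Minitab output: the function name `two_way_anova` is hypothetical, and the 5% critical values of F are hard-coded from standard tables.

```python
# Yields from the Case Study 3 table: one row per farm (1-4),
# one column per fertilizer blend (U, V, W, X, Y, Z).
yields = [
    [1130, 1125, 1350, 1375, 1225, 1235],
    [1115, 1120, 1375, 1200, 1250, 1200],
    [1145, 1170, 1235, 1175, 1225, 1155],
    [1200, 1230, 1140, 1325, 1275, 1215],
]

# 5% critical values of F from standard tables (assumption: hard-coded).
F_CRIT_FARMS = 3.29   # F(0.05; 3, 15)
F_CRIT_BLENDS = 2.90  # F(0.05; 5, 15)


def two_way_anova(table):
    """F ratios for rows and columns in a two-way layout with one value per cell."""
    n_rows, n_cols = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(col) / n_rows for col in zip(*table)]

    ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = n_rows - 1, n_cols - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "F_rows": (ss_rows / df_rows) / ms_error,
        "F_cols": (ss_cols / df_cols) / ms_error,
        "df": (df_rows, df_cols, df_error),
    }


result = two_way_anova(yields)
print("F (farms)  =", round(result["F_rows"], 3),
      "| exceeds critical value:", result["F_rows"] > F_CRIT_FARMS)
print("F (blends) =", round(result["F_cols"], 3),
      "| exceeds critical value:", result["F_cols"] > F_CRIT_BLENDS)
```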
Case Study 4
| SBP (y) | DBP (x) | SBP (y) | DBP (x) |
| 112 | 63 | 156 | 100 |
| 120 | 69 | 124 | 82 |
| 135 | 70 | 99 | 56 |
| 142 | 82 | 105 | 65 |
| 132 | 76 | 124 | 73 |
| 115 | 67 | 144 | 89 |
| 119 | 71 | 134 | 76 |
| 128 | 73 |  |  |
Systolic arterial blood pressure (SBP) and diastolic arterial blood pressure (DBP) are tabulated above for 15 men aged 40-65.
(a) Carry out a linear regression of SBP (y) on DBP (x) for these data.
(b) Give the 95% confidence intervals of the slope and intercept.
(c) Test whether there is a significant relationship between SBP and DBP for this group.
(d) Use the regression equation to estimate the expected SBP of a man aged 40-65 whose DBP is 75.
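The least-squares calculations behind parts (a) to (d) can be sketched in Python as a check on the Minitab regression output. This is a minimal sketch under stated assumptions: the table is read as 15 (SBP, DBP) pairs, left-hand columns first; the two-sided 95% critical value of t for 13 degrees of freedom is hard-coded from standard tables; and the function name `fit_line` is illustrative.

```python
import math

# (DBP, SBP) pairs from the Case Study 4 table, left-hand columns first.
dbp = [63, 69, 70, 82, 76, 67, 71, 73, 100, 82, 56, 65, 73, 89, 76]
sbp = [112, 120, 135, 142, 132, 115, 119, 128, 156, 124, 99, 105, 124, 144, 134]

T_CRIT_DF13 = 2.160  # two-sided 95% critical t for df = 15 - 2 (standard tables)


def fit_line(x, y):
    """Least-squares slope and intercept of y on x, with their standard errors."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual mean square with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    se_slope = math.sqrt(mse / sxx)
    se_intercept = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))
    return slope, intercept, se_slope, se_intercept


slope, intercept, se_slope, se_intercept = fit_line(dbp, sbp)

slope_ci = (slope - T_CRIT_DF13 * se_slope, slope + T_CRIT_DF13 * se_slope)
intercept_ci = (intercept - T_CRIT_DF13 * se_intercept,
                intercept + T_CRIT_DF13 * se_intercept)
t_slope = slope / se_slope           # tests H0: slope = 0
predicted_sbp = intercept + slope * 75

print("SBP =", round(intercept, 2), "+", round(slope, 3), "* DBP")
print("95% CI slope:", slope_ci, "| 95% CI intercept:", intercept_ci)
print("t for slope =", round(t_slope, 2),
      "| exceeds critical value:", abs(t_slope) > T_CRIT_DF13)
print("Estimated SBP at DBP = 75:", round(predicted_sbp, 1))
```

A slope whose 95% confidence interval excludes zero leads to the same conclusion as the t-test in part (c).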
Part B
You are to carry out an evaluation of your project in terms of its data collection and data treatment aspects. This is to be presented as a brief summary, set out as follows.
1. Project Overview
Give the project title (including supervisor). State the aims of the project - what do you want to achieve? Why is the study being carried out?
2. Define the response(s)
What is being measured? List your types of responses. Are these responses qualitative or quantitative? If qualitative, can they be turned into quantitative responses (e.g. by giving a score or rating)? Are they discrete or continuous?
3. Define the Factors
- What factors (variables) affect your results (responses)?
- Rank the factors: known to influence, suspected to influence, unknown effect
- Divide the factors into controllable and uncontrollable
4. Identify sources of error
What are the sources of error in your study? How can they be minimised? You need to consider the effect of sampling - usually you cannot test the whole population so you want to take a sample of the population. How do you select the sample? How big should the sample be?
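The two sampling questions can be illustrated with a short Python sketch. It is only one possible approach and rests on loudly stated assumptions: the response is roughly normal, a prior guess of its standard deviation is available, and the goal is to estimate a mean to within a chosen margin of error using n = (z * sigma / margin)^2. The numbers used (`sigma=3.0`, `margin=1.0`, a population of 200 numbered units) and the function name `sample_size_for_mean` are hypothetical, chosen purely for illustration.

```python
import math
import random
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n so that the CI half-width for a mean is at most `margin`.

    Assumes a normal approximation and a known (guessed) standard deviation.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical planning values: guessed SD of 3.0 units, margin of 1.0 unit.
n_needed = sample_size_for_mean(sigma=3.0, margin=1.0)

# Simple random sample without replacement from a hypothetical
# population of 200 numbered units.
random.seed(1)  # fixed seed so the selection can be reproduced
population_ids = list(range(1, 201))
chosen = random.sample(population_ids, n_needed)

print("Sample size needed:", n_needed)
print("Selected units:", sorted(chosen))
```

Drawing the units at random, rather than picking convenient ones, is what guards against selection bias; the sample-size formula only controls random error.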
Attachment: data treatment for biologists.rar