Biostatistics Assignment (POPH90013)

Question 1 -

Cardiovascular disease (CVD) is a major cause of death in Australia. Risk factors for cardiovascular disease include high blood pressure, high cholesterol, smoking and alcohol, among others. Figure 1 below summarises data from a study of cardiovascular disease in 852 Australian men. In Figure 1, the vertical axis shows total cholesterol level and the horizontal axis shows the risk group. Risk groups were defined by the number of CVD risk factors each individual has: (i) low risk (0-1 risk factors), (ii) medium risk (2-3 risk factors) and (iii) high risk (4+ risk factors).

[Figure 1: total cholesterol level (vertical axis) by CVD risk group (horizontal axis)]

a) Describe how the distribution of total cholesterol differs between risk groups in Figure 1. Limit your comments to comparing the location, spread, shape, and maximum/minimum of the distribution of total cholesterol. Note that there is no need to report the numerical values of the summary statistics you use; just refer to the name of the summary statistic you are comparing (e.g., median increases/decreases).

b) Desirable total cholesterol level is thought to be below 5.2 mmol/L. High cholesterol is defined as total cholesterol above 6.2 mmol/L. Using Figure 1, roughly estimate the proportion of Australian men in each risk group with (i) desirable cholesterol and (ii) high cholesterol.

Question 2 -

In order to answer Question 2, you will need to use the Stata dataset assignment1.dta which can be downloaded from the folder "Assignment 1" in the Assessment area on the LMS.

The Wound Healing Society defines a chronic wound as one that has failed to proceed through an orderly and timely reparative process to produce anatomic and functional integrity within an expected period. Chronic wounds represent a significant annual burden on the Australian health care system, with direct health care costs reaching US$2.85 billion. Several factors can interfere with one or more phases of the wound healing process, thus causing improper or impaired wound healing. Such factors include infection, age, stress, diabetes, obesity, medications, alcoholism, smoking, and nutrition. A better understanding of the influence of these factors on repair may lead to therapeutics that improve wound healing and resolve impaired wounds.

This question concerns a new hypothetical study investigating the association between wound healing and alcoholism. As part of the study, wound patients were sampled from a randomly selected hospital and given uniform treatment. For each patient, the wound circumference was measured at baseline and 12 weeks after baseline to determine the reduction in wound circumference at week 12 (i.e., 12 weeks after baseline). The investigators measured the following variables:

Variable   Description
wndcir     Relative difference between baseline and week 12 wound circumferences, i.e., (baseline circ. - week 12 circ.) / baseline circ.
age        Age (years)
sex        Gender (male/female)
stress     Stress score (possible scores 0-10; 0 = no stress, 10 = maximum stress)
bmi        Body mass index (kg/m²)
diab       Type II diabetes (yes/no)
smoke      Smoking (ever/never)
alc        Alcohol consumption per week in millilitres
infect     Was the wound infected at any time in twelve weeks? (yes/no)

The main outcome variable of the study was wndcir, which measures the healing progress of the wound. The progress of healing (wndcir) can be interpreted as follows:

  • Large positive values: the wound is healing quickly (i.e., good healing progress).
  • Positive values close to zero: the wound is healing slowly (i.e., slow healing progress).
  • Zero: No change in the wound circumference.
  • Negative values: the wound is getting worse; the circumference is increasing.
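As a minimal sketch of how wndcir is computed and read, the Python snippet below applies the relative-difference formula from the variable table to made-up circumferences. The function name `relative_reduction` and the example measurements are hypothetical and are not taken from assignment1.dta.

```python
def relative_reduction(baseline_circ: float, week12_circ: float) -> float:
    """Relative difference: (baseline circ. - week 12 circ.) / baseline circ."""
    if baseline_circ <= 0:
        raise ValueError("baseline circumference must be positive")
    return (baseline_circ - week12_circ) / baseline_circ


# Hypothetical circumferences in cm (illustrative only, not study data).
examples = {
    "healing quickly": (20.0, 5.0),
    "healing slowly": (20.0, 19.0),
    "no change": (20.0, 20.0),
    "getting worse": (20.0, 24.0),
}

for label, (baseline, week12) in examples.items():
    print(f"{label}: wndcir = {relative_reduction(baseline, week12):+.2f}")
```

Because the difference is divided by the baseline circumference, wndcir is unit-free: a wound that shrinks from 20 to 5 and one that shrinks from 40 to 10 get the same value.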

In order to answer the questions below, you will need to create two new categorical variables: alccat and bmicat. The new categorical variable alccat will be derived from the existing numerical variable alc. It represents whether or not an individual drinks at least the average amount of alcohol per week, which is estimated to be 186 mL/week. The new categorical variable bmicat will be derived from the existing numerical variable bmi. The new variables alccat and bmicat should consist of the following categories:

alc (mL/week)   alccat          bmi (kg/m²)   bmicat
< 186           average         < 19          underweight
>= 186          above average   19 - 24       normal weight
                                25 - 29       overweight
                                >= 30         obese
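The category rules above can be sketched in Python as follows. This is an illustration of the cut-points only, not the Stata code the assignment asks for. The function names are hypothetical, and one assumption is made explicit: the table lists "19 - 24" and "25 - 29", so non-integer values such as 24.5 are assigned here by treating the ranges as 19 to under 25 and 25 to under 30.

```python
import math
from typing import Optional


def _is_missing(value: Optional[float]) -> bool:
    """True for None or NaN, so missing inputs stay missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def alccat(alc_ml_per_week: Optional[float]) -> Optional[str]:
    """Alcohol category from weekly consumption in mL (cut-point 186)."""
    if _is_missing(alc_ml_per_week):
        return None
    return "average" if alc_ml_per_week < 186 else "above average"


def bmicat(bmi: Optional[float]) -> Optional[str]:
    """BMI category from bmi in kg/m^2.

    ASSUMPTION: "19 - 24" is read as [19, 25) and "25 - 29" as [25, 30).
    """
    if _is_missing(bmi):
        return None
    if bmi < 19:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


print(alccat(120.0), "|", bmicat(27.3))
```

Missing values are kept missing rather than placed in a category. The same care is needed in Stata, where a missing numeric value compares as larger than any number and would otherwise satisfy a condition such as alc >= 186.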

a) Identify and list the names of all variables in the data set that have missing observations.

b) What percentage of data is missing for the variable smoke? What percentage of data is missing for the variable smoke for each BMI category (i.e., of all individuals that are underweight/normal/overweight/obese, what is the percentage of data missing for the variable smoke)? Of all individuals that consume below the average amount of alcohol per week (<186 mL/week), what percentage are female and what percentage are male?

c) Use Stata to produce a frequency histogram of alcohol consumption per week in millilitres (variable alc) for each of the four categories of BMI. Look up the help file for the histogram command to find the relevant options for making the following changes to the graph:

  • use the by() option to display the four histograms in a single plot;
  • display the percentage, not the density, on the vertical axis;
  • plot 25 bars (or bins) per histogram.

Copy the graph directly into your assignment document by clicking on edit/copy in the Stata graph window. You may also use the "File -> Save As" feature in Stata to save the graph as an image that you can later import into Microsoft Word.

From the histogram, what shape is the distribution of the variable alc for individuals with "normal weight"?

d) Provide a table that summarises the distribution (sample size, mean, standard deviation, minimum, 25th / 50th / 75th percentiles, maximum) of the wound circumference (wndcir) separately for smokers and non-smokers. Ensure that the table is formatted properly (please do not copy and paste directly from Stata output). Using this table, briefly describe the differences in the outcome variable wndcir between smokers and non-smokers. Do smokers or non-smokers heal faster on average?

e) Use Stata to produce an appropriate graph to display the relationship between the outcome variable wndcir and the exposure variable wound infection history (infect). Based only on this graph, do individuals with a history of infection heal faster or slower, on average?

The investigators of the study are not sure how alcohol relates to wound healing but they suspect that drinking alcohol above the weekly average can slow down the healing process.

f) Using Stata, calculate and interpret the difference in the mean wound circumference (outcome variable wndcir) between individuals who drink below (<186 mL/week) and those who drink above the average amount of alcohol per week (>=186 mL/week).

Using Stata, calculate and interpret a 95% confidence interval for the difference in the population mean wound circumference (outcome variable wndcir) between individuals who drink below (<186 mL/week) and those who drink above the average amount of alcohol per week (>=186 mL/week).

g) Using your answers to question 2(f), what can we conclude about the association between drinking alcohol and the wound healing process?

Question 3 -

Note: This question does not require Stata.

Table 1 below gives the results for two randomised controlled trials comparing acupuncture treatments versus standard care in patients with back pain. The outcome measure was the SF-36 bodily pain score; this score is normally distributed and ranges from 0 to 100, where a score of 100 implies 'no pain'. An increase of 10 units in the SF-36 bodily pain score corresponds to a clinically important difference.

Table 1.

Trial   n per group   Difference in sample means   95% confidence interval for      p-value
                      of SF-36 (acupuncture -      difference in population means
                      standard care)
1       ??            3.00                         0.26, 5.74                       0.032
2       ??            2.75                         -3.13, 8.63                      0.359

The above two trials are the only studies currently available with data comparing acupuncture treatments and standard care. After reviewing the findings of the above two trials, a general practitioner decides to recommend acupuncture treatments to patients suffering from back pain.

a) Do you agree with the general practitioner's decision? Using all the information provided in Table 1, give reasons as to why or why not.

b) Using the information provided in Table 1, which trial has the larger sample size? Explain your answer.
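When reading Table 1, it can help to put each confidence interval into a comparable form. The Python sketch below computes only descriptive quantities from the reported intervals: the midpoint, the half-width, and whether the interval contains zero or the clinically important difference of 10 units. The dictionary layout and the function name `summarise` are hypothetical, and interpreting the results is left to the questions above.

```python
# Values copied from Table 1 (difference = acupuncture - standard care).
trials = {
    1: {"diff": 3.00, "ci": (0.26, 5.74), "p": 0.032},
    2: {"diff": 2.75, "ci": (-3.13, 8.63), "p": 0.359},
}

CLINICALLY_IMPORTANT = 10  # SF-36 units, as stated in the question


def summarise(ci):
    """Describe a confidence interval given as (lower, upper)."""
    lower, upper = ci
    return {
        "midpoint": (lower + upper) / 2,
        "half_width": (upper - lower) / 2,
        "includes_zero": lower <= 0 <= upper,
        "includes_clinically_important": lower <= CLINICALLY_IMPORTANT <= upper,
    }


for trial, row in trials.items():
    s = summarise(row["ci"])
    print(
        f"Trial {trial}: midpoint {s['midpoint']:.2f}, "
        f"half-width {s['half_width']:.2f}, "
        f"includes 0: {s['includes_zero']}, "
        f"includes {CLINICALLY_IMPORTANT}: {s['includes_clinically_important']}"
    )
```

The midpoint of each interval should match the reported difference in sample means, which is a quick check that the table has been transcribed correctly.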

Question 4 -

Note: This question does not require Stata.

A random sample of 61 airline pilots, working for British Airways, had their systolic blood pressure measured. The sample mean was 107.4 mmHg and the sample standard deviation was 6.1 mmHg.

Assume that systolic blood pressure is normally distributed within the population and that the sample mean and sample standard deviation provide reasonable estimates of the population parameters.

a) Estimate the proportion of British Airways pilots with a systolic blood pressure between 100 mmHg and 118 mmHg.

b) Calculate the range of systolic blood pressures within which the middle 90% of airline pilots lie.

c) Calculate and interpret a 99% confidence interval for the population mean systolic blood pressure.

d) Calculate and interpret a (two-sided) p-value to test the null hypothesis that the population mean systolic blood pressure is 105.8 mmHg.
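Although this question does not require software, the arithmetic can be checked with a short script. The sketch below uses Python's standard-library `statistics.NormalDist` and assumes, as the question states, that the population is normal with mean 107.4 mmHg and standard deviation 6.1 mmHg. It covers the normal-distribution quantities for parts (a) and (b) and stops at the standard error and test statistic for parts (c) and (d). The critical value and p-value depend on the reference distribution used (for example a t distribution with 60 degrees of freedom), which is left to tables or statistical software. Variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n = 61            # sample size
mean = 107.4      # sample mean (mmHg)
sd = 6.1          # sample standard deviation (mmHg)
null_mean = 105.8 # hypothesised population mean for part (d)

# Population model assumed in the question: Normal(mean, sd).
population = NormalDist(mu=mean, sigma=sd)

# (a) Proportion with systolic blood pressure between 100 and 118 mmHg.
prop_100_118 = population.cdf(118) - population.cdf(100)

# (b) Middle 90%: from the 5th to the 95th percentile.
lower_90 = population.inv_cdf(0.05)
upper_90 = population.inv_cdf(0.95)

# (c), (d) Standard error of the mean and the test statistic.
standard_error = sd / sqrt(n)
test_statistic = (mean - null_mean) / standard_error

print(f"(a) proportion between 100 and 118: {prop_100_118:.3f}")
print(f"(b) middle 90%: {lower_90:.1f} to {upper_90:.1f} mmHg")
print(f"(c) standard error: {standard_error:.3f}")
print(f"(d) test statistic: {test_statistic:.2f}")
```

Parts (a) and (b) describe individual pilots, so they use the standard deviation. Parts (c) and (d) concern the population mean, so they use the standard error, which is the standard deviation divided by the square root of the sample size.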
