Problem 1
The goal of this simulation problem is to examine how accurate the normal approximation to the binomial is without the continuity correction.
Create a function that takes in x (the number of successes we are interested in), n (the total sample size), p (the probability of success), and WhichWay, which takes one of the character values "<", ">", "<=", ">=". This function should perform the following steps:
Step i. Find the exact probability of interest using the built-in R function pbinom (see handout for details). You should have an if/else chain that calculates the probability based on WhichWay. Call this value TrueProb.
Step ii. Inside the if/else chain, also calculate the normal approximation to the binomial probability for the value of x. Do not implement the continuity correction. Call this value ApproxProb.
Step iii. Your function should output the difference between the exact probability and the approximate probability.
Then, use your function to calculate the absolute value of the difference between the exact and approximate probabilities for the following probabilities and parameters:
(a) P(X < 7) for n = 15, p = 0.40
(b) P(X ≥ 5) for n = 30, p = 0.20
(c) P(X ≤ 10) for n = 50, p = 0.25
(d) P(X > 50) for n = 200, p = 0.10
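The if/else chain in Steps i and ii depends on how each inequality maps onto a cumulative probability. The following is a minimal sketch of that mapping, not a complete solution. It assumes illustrative values (x = 8, n = 20, p = 0.5) that are not among the assigned cases, and the vector names exact and approx are hypothetical. It relies on pbinom(q, size, prob) returning P(X ≤ q) and pnorm(q, mean, sd) returning the normal CDF.

```r
# Illustrative values only (not one of the assigned cases)
x <- 8; n <- 20; p <- 0.5
mu    <- n * p                  # binomial mean
sigma <- sqrt(n * p * (1 - p))  # binomial standard deviation

# Exact probabilities: pbinom(q, size, prob) gives P(X <= q)
exact <- c(
  "<=" = pbinom(x,     n, p),      # P(X <= x)
  "<"  = pbinom(x - 1, n, p),      # P(X <  x) = P(X <= x - 1)
  ">"  = 1 - pbinom(x,     n, p),  # P(X >  x) = 1 - P(X <= x)
  ">=" = 1 - pbinom(x - 1, n, p)   # P(X >= x) = 1 - P(X <= x - 1)
)

# Normal approximation WITHOUT the continuity correction:
# strict and non-strict inequalities give the same value
approx <- c(
  "<=" = pnorm(x, mu, sigma),
  "<"  = pnorm(x, mu, sigma),
  ">"  = 1 - pnorm(x, mu, sigma),
  ">=" = 1 - pnorm(x, mu, sigma)
)

# Absolute difference for each direction
print(round(abs(exact - approx), 4))
```

Because the uncorrected normal approximation cannot distinguish P(X < x) from P(X ≤ x), the error for a strict inequality generally differs from the error for the matching non-strict one.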
Problem 2
The goal of this simulation problem is to examine how accurate the normal approximation to the binomial is with the continuity correction.
Create a function that takes in x (the number of successes we are interested in), n (the total sample size), p (the probability of success), and WhichWay, which takes one of the character values "<", ">", "<=", ">=".
You should be able to modify your function from Problem 1 (give it a new name, though) so that it implements the continuity correction. Then, use your function to calculate the absolute value of the difference between the exact and approximate probabilities for the following probabilities and parameters:
(a) P(X < 7) for n = 15, p = 0.40
(b) P(X ≥ 5) for n = 30, p = 0.20
(c) P(X ≤ 10) for n = 50, p = 0.25
(d) P(X > 50) for n = 200, p = 0.10
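For reference, the standard form of the continuity correction shifts the cutoff by 0.5 in the direction that keeps the same integers inside the event. The symbols μ, σ, and Φ (the standard normal CDF) are notation introduced here, not taken from the handout:

```latex
\begin{aligned}
P(X \le x) &\approx \Phi\!\left(\frac{x + 0.5 - \mu}{\sigma}\right), &
P(X < x)   &\approx \Phi\!\left(\frac{x - 0.5 - \mu}{\sigma}\right), \\
P(X \ge x) &\approx 1 - \Phi\!\left(\frac{x - 0.5 - \mu}{\sigma}\right), &
P(X > x)   &\approx 1 - \Phi\!\left(\frac{x + 0.5 - \mu}{\sigma}\right),
\end{aligned}
\qquad \mu = np, \quad \sigma = \sqrt{np(1-p)}.
```

For example, P(X < x) = P(X ≤ x − 1) for integer x, and adding 0.5 to x − 1 gives the cutoff x − 0.5.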
Problem 3
This problem uses your functions from Problem 1 and Problem 2 to see how the approximations perform as n increases.
Create a function that takes in a vector of sample sizes all.n, the probability of success p, and WhichWay, which takes one of the character values "<", ">", "<=", ">=". For each n in all.n, the function should calculate the binomial probability P(X ≤ n/2).
It should return a matrix with one row per value in all.n and two columns: the first column is the difference between the actual probability and the normal approximation to the binomial without the continuity correction, and the second column is the difference between the actual probability and the normal approximation to the binomial with the continuity correction.
Hint: sapply will automatically create a matrix if you return two values in your sapply loop. If it is the reverse of what you want, you can use X = t(X) to flip the matrix.
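The behavior described in the hint can be seen on a toy example. This is a minimal sketch, assuming a hypothetical helper two_values that stands in for whatever your own function returns per sample size:

```r
# Hypothetical helper: returns two values for a single input
two_values <- function(k) c(square = k^2, cube = k^3)

inputs <- c(1, 2, 3, 4)

# sapply binds each returned pair as a COLUMN,
# so the result has 2 rows and length(inputs) columns
X <- sapply(inputs, two_values)
print(dim(X))

# Transpose so that each input gets its own row
X <- t(X)
print(dim(X))

# Column-wise means of the transposed matrix
print(colMeans(X))
```

After the transpose, colMeans gives one mean per returned quantity, which is the shape part (a) asks for.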
Then, use your function to do the following:
(a) For all.n = seq(10,400,2), p = 0.50, and the appropriate value of WhichWay, use your function to get back a large matrix of the differences. Do not print this matrix! Find the mean of each column and report the two means.
(b) Then, create a histogram of the differences between the actual and approximate probabilities for each method (with and without the continuity correction). Report these two histograms (make sure you label them!).
(c) Also create two separate line plots where the values of all.n are on the x-axis and the difference between the actual and approximate probabilities is on the y-axis. Report the two line graphs (make sure you label them!). Notice the default axis limits.
(d) Do you think the continuity correction is worth implementing? That is, if you were approximating a binomial probability with a normal distribution, would you use the continuity correction? You can use parts (a), (b), and (c) to support your answer.
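For the labeled plots in parts (b) and (c), base R's hist() and plot() accept main, xlab, and ylab arguments. The following is a minimal sketch of the labeling only. The matrix diffs and vector n_vals are hypothetical placeholders filled with arbitrary values, not results from the assignment.

```r
# Placeholder inputs: arbitrary values, NOT results from the assignment
n_vals <- seq(10, 50, by = 10)
diffs  <- cbind(col1 = sin(n_vals / 10) / 100,
                col2 = cos(n_vals / 10) / 100)

# Histogram of one column, with a title and an axis label
hist(diffs[, "col1"],
     main = "Differences, method 1 (placeholder data)",
     xlab = "Actual - approximate probability")

# Line plot of the same column against the sample sizes;
# type = "l" draws a line instead of points
plot(n_vals, diffs[, "col1"], type = "l",
     main = "Difference vs. n, method 1 (placeholder data)",
     xlab = "n",
     ylab = "Actual - approximate probability")
```

By default each plot picks its own y-axis range, so two plots that look alike may be on very different scales. Passing the same ylim to both (for example ylim = range(diffs)) is one way to compare them directly.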