The following strategy was adopted in an attempt to determine the size, N, of an almost extinct population of rare tigers in a remote forest in southeast Asia. At the beginning of the month, 50 tigers were selected from the population, tranquilized, tagged and released. Assuming that a month is sufficient time for the tagged sample to become completely integrated with the entire population, at the end of the month a random sample of n = 10 tigers was selected, two of which were found to have tags.
(i) What does this suggest as a reasonable estimate of N? Identify two key potential sources of error with this strategy.
(ii) If X is the random variable representing the total number of tagged tigers found in the sample of n taken at the end of the month, clearly, X is a hypergeometric random variable. However, given the comparatively large size we would expect of N (the unknown tiger population size), it is entirely reasonable to approximate X as a binomial random variable with a "probability of success" parameter p. Compute, for this (approximately) binomial random variable, the probability that X = 2 out of the sampled n = 10 when p = 0.1, p = 0.2 and p = 0.3. What does this indicate to you about the "more likely" value of p for the tiger population?
(iii) In general, for the binomial random variable X in (ii) above, given data that x = 2 "successes" were observed in n = 10 trials, show that the probability that X = 2 is maximized if p = 0.2.
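As a numerical check on parts (i) to (iii), the following minimal Python sketch may be useful. It assumes the estimate in (i) comes from equating the tagged fraction in the sample with the tagged fraction in the population (50/N ≈ 2/10), and it only illustrates (iii) with a grid search rather than proving it. The names `binom_pmf`, `n_hat`, `probs` and `p_best` are hypothetical and introduced here for illustration only.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability P(X = x) for n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


tagged = 50  # tigers tagged at the beginning of the month
n = 10       # size of the sample taken at the end of the month
x = 2        # tagged tigers found in that sample

# (i) Assumption: sample proportion x/n estimates population proportion tagged/N,
# so N is estimated by tagged * n / x.
n_hat = tagged * n / x

# (ii) P(X = 2) under the three candidate values of p.
probs = {p: binom_pmf(x, n, p) for p in (0.1, 0.2, 0.3)}

# (iii) Grid search over p in (0, 1); this is a numerical illustration only,
# not a substitute for the calculus argument the problem asks for.
grid = [k / 1000 for k in range(1, 1000)]
p_best = max(grid, key=lambda p: binom_pmf(x, n, p))

print(f"estimated N: {n_hat:.0f}")
for p, prob in probs.items():
    print(f"p = {p}: P(X = 2) = {prob:.4f}")
print(f"grid maximizer of P(X = 2): p = {p_best}")
```

The grid maximizer should agree with the sample proportion x/n, which is consistent with the claim in (iii); the analytical route is to differentiate the binomial probability (or its logarithm) with respect to p and set the derivative to zero.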