Sampling methodology resulted in non-random samples


As engineers, it is important that you know how to analyze data and draw conclusions from it. In the Statistics section of this course you will learn many tools for doing this. With these tools you can say something about a large pool of objects or numbers (the “population”) by looking at only a few of them (the “sample”). Even with all of these tools, engineers and other users of statistics sometimes reach invalid conclusions because they did not choose wisely which few members to look at.

In designing an experiment, one typically first forms a hypothesis that, whether it turns out to be true or false, will tell you something important. Choose your hypothesis so that your data, if analyzed correctly, can establish whether it is true.

An example hypothesis might be “about 10% of cars pass through intersections with traffic lights when the lights are yellow”. This statistic may be important in deciding how much time to allow before pedestrian signals change or before the traffic lights change in the other direction, or how bike paths that pass through the intersection are configured.

The population that you study might be all intersections with traffic lights in Montreal. Another population might be all intersections with traffic lights in some smaller geographical entity, like Hampstead.

Whatever population you choose, it may lead you to modify your hypothesis, for example to “about 10% of cars pass through intersections with traffic lights in Hampstead when the lights are yellow.”
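To make the hypothesis concrete, note that the quantity it refers to is a proportion: the number of cars that pass on yellow divided by the total number of cars that pass. The following is only a minimal sketch in Python; the function name and the tallies are invented for illustration and are not real measurements.

```python
def yellow_light_proportion(cars_on_yellow, total_cars):
    """Fraction of all cars through the intersection that passed on yellow."""
    if total_cars <= 0:
        raise ValueError("total_cars must be positive")
    if not 0 <= cars_on_yellow <= total_cars:
        raise ValueError("cars_on_yellow must be between 0 and total_cars")
    return cars_on_yellow / total_cars

# Hypothetical tallies for one observation period (made-up numbers).
cars_on_yellow = 18
total_cars = 150

proportion = yellow_light_proportion(cars_on_yellow, total_cars)
print(f"{proportion:.1%} of cars passed on yellow")
```

A value like this can then be compared with the percentage stated in the hypothesis.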

After designing your experiment and drawing conclusions, you might also want to consider whether the results of your study can be extrapolated to populations that were not part of the study. For example, could the results from Hampstead be extrapolated to cover all intersections with traffic lights in Montreal?

Once you have established the population and the hypothesis, you will want to select certain members of the population to measure. This subset of the population is called a sample. It is not always easy to select the members of the sample so that valid conclusions about the entire population can be drawn by measuring only the sample.

In this course you will try to select a “random sample”. We will define this term later in the course, but the essence of it is that members in the sample are unrelated to each other. This is not always easy to achieve: sometimes the way you select the sample will bias it one way or another.
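One common way to avoid choosing sample members by hand is to let a random number generator do the choosing. The sketch below uses Python's standard `random` module to draw a simple random sample. The intersection labels, the population size of 40, and the sample size of 5 are all assumptions made up for illustration; they are not part of the assignment.

```python
import random

# Hypothetical population: labels for 40 intersections with traffic lights.
population = [f"intersection_{i:02d}" for i in range(1, 41)]

def draw_simple_random_sample(items, k, seed=None):
    """Return k distinct items chosen so that no item is favored over another."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(items, k)

sample = draw_simple_random_sample(population, k=5, seed=2024)
print(sample)
```

Fixing the seed is a convenience for reproducing the same draw later; leaving it as `None` gives a different sample on each run.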

Here are two examples where the sampling methodology resulted in non-random samples of the population.

A. In 1936 there was a presidential election in the US. Prior to election day a telephone survey was done, and it predicted that the Republican candidate would win. On election day the Democratic candidate, Franklin Roosevelt, won by a landslide. What went wrong? Answer: In 1936 not everyone had a telephone. Those who did tended to be richer, and richer voters tended to vote Republican. Thus the sample was biased and did not represent the overall population.

B. In a plant, you want certain parts tested to see if they meet spec. You give the operators of the machines that make these parts a list of the parts you want them to set aside for testing. Your tests indicate that the parts are well within spec, but many complaints come from the field that the parts are out of spec. What is happening? Answer: Your operators try hard to please you by doing a very good job on the parts that they know you will be testing. This sample ends up not representing the entire population.
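The effect in example A can be imitated with a small simulation. The sketch below is only an illustration under stated assumptions: the electorate size, the share of voters with a telephone, and the voting rates in each group are invented numbers, not historical figures. It compares a sample drawn only from telephone owners with a sample drawn from all voters.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# All rates below are invented for illustration; they are NOT historical data.
P_PHONE = 0.30               # assumed share of voters who own a telephone
P_REP_GIVEN_PHONE = 0.60     # assumed Republican share among telephone owners
P_REP_GIVEN_NO_PHONE = 0.30  # assumed Republican share among everyone else

# Build a hypothetical electorate of (has_phone, votes_republican) pairs.
voters = []
for _ in range(100_000):
    has_phone = rng.random() < P_PHONE
    p_rep = P_REP_GIVEN_PHONE if has_phone else P_REP_GIVEN_NO_PHONE
    voters.append((has_phone, rng.random() < p_rep))

def republican_share(group):
    """Fraction of the group that votes Republican."""
    return sum(1 for _, votes_rep in group if votes_rep) / len(group)

phone_owners = [v for v in voters if v[0]]
biased_sample = rng.sample(phone_owners, 1000)  # only telephone owners are reachable
random_sample = rng.sample(voters, 1000)        # every voter is equally likely

true_share = republican_share(voters)
biased_share = republican_share(biased_sample)
random_share = republican_share(random_sample)

print(f"whole population: {true_share:.3f}")
print(f"telephone sample: {biased_share:.3f}")
print(f"random sample:    {random_share:.3f}")
```

Under these assumed rates, the telephone-only sample overstates Republican support, while the random sample stays close to the population value even though both samples have the same size.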

The procedures for the project are as follows.

1. Groups of three will be formed by the TA after the Drop/Add deadline. Students will be informed by email of which group they belong to.

2. Each group will write a proposal for the project. The proposal is typically one page in length.

It should clearly state:

a. The population you are trying to get information on.

b. The hypothesis you are testing with your experiment.

c. How you will select the members of your population to measure, i.e., how you will do your sampling.

d. The quantity you will measure from each member of the sample. For example, you might record the number of cars passing through a particular yellow light and the total number of cars passing through the intersection (on yellow and green, and on red if that happens), so that the proportion of all cars passing through the intersection on a yellow light can be calculated.

This must be submitted to the Project TAs together with an expectation of originality form. All group members sign the proposal and submit it to the TAs for approval.

3. Approval by the TAs

After the proposal is submitted, the TAs will read it and tell you whether or not it is satisfactory. If it is not, you need to revise the proposal and submit it again.

4. Experiments: Once your proposal is approved, you can conduct the experiment. Your sample size must be at least 50.

5. Analysis:

Using the data from your experiment, analyze the results with software tools and your own calculations. You are required to calculate the sample mean, sample median, sample variance, and sample standard deviation.

6. Comments

After the analysis, you are required to comment on the results. In particular, you should comment on the following:

a) Did your method of sampling result in a random sample?

b) If your sample was not a random sample, what measures could you take to get a random sample if you were to do this project again?

c) Is your hypothesis true or not? Based on the experiment, would it be appropriate to write a revised hypothesis (e.g., “about 15% of cars in Hampstead go through yellow lights”)?

d) Comment on whether you think your results can be extrapolated to draw more general conclusions, perhaps about wider populations. State your opinion and then back it up with well-thought-out reasons.

The comments in number 6 are a vital part of this project, and you should spend some time doing a good job on them.

For your final submission you should submit:

a) The first page of the report is the originality form, signed by all the members.

b) The second part of the report is the log of the project, signed by the members. This is a record of each time work was done on the project: the date, what was done, and who did the work. A required piece of work is a meeting of all members of the team to review the final report.

c) You should include the proposal that was approved by the TA.

d) Your report should clearly state the details of the experiment you did.

e) The final report should contain all the numbers in your sample.

f) The final report should contain the analysis in number 5 above.

g) The final report should contain the comments in number 6 above.
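The four statistics required in number 5 can be computed by hand or with software. As one possible sketch, the following uses Python's standard `statistics` module. The ten counts are made-up values used only to keep the example short; an actual project needs at least 50 measurements.

```python
import statistics

# Hypothetical measurements: waiting vehicles counted at 10 red lights.
# These values are invented; a real sample must have at least 50 entries.
counts = [4, 7, 5, 9, 6, 5, 8, 3, 5, 6]

sample_mean = statistics.mean(counts)
sample_median = statistics.median(counts)
sample_variance = statistics.variance(counts)  # sample variance, divides by n - 1
sample_std_dev = statistics.stdev(counts)      # square root of the sample variance

print(f"mean:               {sample_mean:.2f}")
print(f"median:             {sample_median:.2f}")
print(f"variance:           {sample_variance:.2f}")
print(f"standard deviation: {sample_std_dev:.2f}")
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n - 1) formulas; the module's `pvariance` and `pstdev` functions divide by n instead and are meant for a whole population.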


Project proposal example 1:

For this project, we plan to analyze the number of waiting vehicles at the intersection of Rue Guy and Blvd Ste-Catherine when the traffic lights are red. We will start counting the waiting vehicles at 7.00am and stop at 12.00am.

One student is responsible for the traffic on Rue Guy and the other is responsible for the traffic on Blvd Ste Catherine.

At 7.00am (starting from the first red light), one student counts the number of waiting vehicles in both directions when Rue Guy has red lights. This is one sample. The same is repeated for every red light.

At the same time, the other student counts the number of waiting vehicles in both directions when Blvd Ste Catherine has red lights. This is one sample. The same is repeated for every red light.

After collecting these samples, we will plot histograms of the number of waiting vehicles over time, one histogram for Rue Guy and the other for Blvd Ste Catherine.

Based on the samples, we will calculate the mean, median, and variance of the number of waiting vehicles on Rue Guy and Blvd Ste Catherine, respectively.

Project proposal example 2:

For this project, we plan to investigate the number of students, faculty, and staff taking Concordia shuttles over time. Starting at 10.00am (with the first shuttle after 10.00am), we will count the number of students, faculty, and staff taking each shuttle. We have one sample per shuttle leaving from downtown to Loyola. We will collect at least 30 samples for shuttles in this direction.

Similarly, we will collect samples for the shuttles leaving from Loyola to downtown. After collecting the samples, we will plot histograms of the samples for each direction.

Based on the samples, we will calculate the mean, median, and variance of the samples for each direction.
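Both example proposals plan to plot histograms of the collected samples. As a rough sketch of the idea, the code below tallies how often each value occurs and prints a simple text histogram using only Python's standard library. The passenger counts and the function name `text_histogram` are invented for illustration; a plotting library could be used instead to draw a graphical histogram.

```python
from collections import Counter

# Hypothetical passenger counts, one value per shuttle departure (made up).
passengers = [12, 15, 12, 20, 18, 15, 15, 22, 12, 18]

def text_histogram(values):
    """Return one line per distinct value, with one '#' per occurrence."""
    tally = Counter(values)
    return [f"{value:>3} | {'#' * tally[value]}" for value in sorted(tally)]

lines = text_histogram(passengers)
for line in lines:
    print(line)
```

With many distinct values it is usually clearer to group the values into ranges (bins) before counting; this sketch counts each distinct value separately to stay short.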
