Question 1. James Surowiecki recently published a book entitled The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. In this book, Surowiecki argues that groups typically make better decisions than individuals because groups have better information. Whereas an individual decides based only on what he or she knows, a group can aggregate the information held by each of its members and decide based on that collective information. A good example of this phenomenon can be found in the game show "Who Wants to Be a Millionaire." Each contestant gets three "lifelines," including one called "Ask the Audience," in which the question posed to the contestant is put to the audience, who all vote on the answer. While there is typically a bit of noise in the audience's response, generally speaking the audience is correct.
In many ways, this is similar to the law of large numbers mentioned in the lecture notes a couple of weeks ago. A sample of 1 has a huge margin of error, but as the sample gets larger the margin of error falls, and the sample mean is more likely to be close to the population mean.
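The link between sample size and margin of error can be sketched numerically. The snippet below is only an illustration: it uses the usual large-sample approximation (margin of error = z * sigma / sqrt(n)), and both the population standard deviation of 10 and the 95% z-value of 1.96 are assumptions chosen for the example, not figures from the lecture notes.

```python
import math


def margin_of_error(sigma, n, z=1.96):
    """Approximate margin of error for a sample mean.

    sigma: population standard deviation (assumed known here)
    n:     sample size
    z:     critical value; 1.96 corresponds to roughly 95% confidence
    """
    return z * sigma / math.sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (1, 25, 100, 2500):
    print(f"n = {n:>4}: margin of error = {margin_of_error(sigma, n):.2f}")
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.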
Based on your intuition, what is your overall impression of using the power of crowds in a business setting (strengths/weaknesses, good applications/bad applications, etc.)?
Question 2. In multiple regression, the relative size of the raw coefficients is not, by itself, a guide to their importance. For example, your company may have a nationwide hiring program that focuses on hiring employees who have graduated from college in the past 3 years, and let's say you want to know which attributes of those graduates have the biggest influence on sales ($M) in their first year on the job. You hypothesize that the factors that will influence first-year sales are undergraduate GPA (GPA), years of experience since graduation (EXP), the quality of their undergraduate institution (RANK), and their performance on the Wonderlic test (TEST). You estimate the regression as:
SALES = 1.23 + 5.03*GPA + 0.14*EXP - 0.41*RANK + 0.38*TEST
It is difficult to compare the size of the various coefficients because each of the independent variables is measured on a different scale. Undergraduate GPA is measured on a scale from 0.0 to 4.0. Experience ranges from 0 to 3 years. University ranking ranges from 1 to 4 (with 1 being the highest rank), and the Wonderlic test ranges from 0 to 50. Can you think of a way to compare the coefficients? If you are going to use this information to decide where to focus your hiring, which element should you place the highest emphasis on?
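One possible starting point, offered as a sketch and not as the intended answer, is to multiply each coefficient by the width of its variable's scale. That gives the predicted change in SALES from moving a variable across its whole stated range. The coefficients and scale endpoints come from the text; the function name `range_effects` is made up for this illustration. Note the assumption being made: the full scale is treated as the relevant range, which may overstate a variable such as GPA if actual hires cluster in a narrow band. Standardized coefficients (each coefficient scaled by the sample standard deviations) are a more common choice, but the text does not give the standard deviations needed to compute them.

```python
# Coefficients from the estimated regression in the text.
coefficients = {"GPA": 5.03, "EXP": 0.14, "RANK": -0.41, "TEST": 0.38}

# Scale endpoints (min, max) as stated in the text.
scales = {"GPA": (0.0, 4.0), "EXP": (0, 3), "RANK": (1, 4), "TEST": (0, 50)}


def range_effects(coefs, scale_limits):
    """Predicted change in SALES ($M) when each variable moves from the
    bottom to the top of its scale, holding the other variables fixed."""
    return {
        name: coefs[name] * (scale_limits[name][1] - scale_limits[name][0])
        for name in coefs
    }


effects = range_effects(coefficients, scales)

# List the variables from largest to smallest absolute effect.
for name, effect in sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:<5} {effect:+.2f}")
```

The ordering this produces depends entirely on the assumed ranges, so it is worth asking how the comparison would change if the ranges observed among actual hires were used.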
Question 3. ANOVA can be a very useful tool and is one of the more powerful statistical techniques for comparing the mean of a dependent variable across groups. We have discussed one-group and two-group hypothesis testing. When there are more than two groups, you use Analysis of Variance (ANOVA). It is also used in regression modeling.
Think up a question that you might encounter at your workplace (or elsewhere, if you are so inclined) that would be amenable to testing via ANOVA. Post that question along with any issues you think you might encounter should you actually run the analysis, including how you might get the data, sampling issues, etc.
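As a reminder of what a one-way ANOVA actually computes, here is a minimal sketch of the F statistic: the ratio of between-group variance to within-group variance. The data are invented for illustration (three hypothetical groups of three observations each), and `one_way_anova_f` is a name made up for this example.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples,
    one sample (list of numbers) per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical outcomes for three groups (made-up numbers).
data = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_b, df_w = one_way_anova_f(data)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Turning the F statistic into a p-value requires the F distribution; in practice a library routine such as `scipy.stats.f_oneway` returns both.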