The CEO has asked you to run a basic statistical analysis of the theaters, examining the relationship between theater-specific and demographic factors and theater revenue. Most of the data is fairly easy (though time-consuming and expensive) to get, but some members of your team are concerned about how hard it will be to quickly get accurate and meaningful theater revenue data for the current year. React to each of the concerns raised below:
a) "Some of the theaters have really old infrastructure and are not even hooked up to our data warehouse, so getting their data will take FOREVER. Plus, with the amount of work those theaters do by hand, I'm sure there will be tons of mistakes in their data, which will just screw up our figures. For now, let's not worry about those theaters and just collect data from the ones that are hooked up to our network--we can get the data more quickly that way, and it's probably better data to boot."
b) "Most of our theaters play 'first-run' movies, but about 25% of our theaters show 'second-run' movies, and those theaters might really screw up our data. I'd suggest limiting the data analysis to our first-run theaters."
c) "Some movies don't make it to all the theaters, you know. It doesn't make much sense to compare the revenues of two theaters when they might be playing very different movies. Instead, we should look at revenue from a well-defined subset of movies that were shown at most or all of our theaters (e.g., only the top 20 grossing films of the year)."
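
One way to explore a concern like (a) before reacting to it is to compare the estimated relationship with and without the excluded theaters. The sketch below is only an illustration: the figures are invented (they are not company data), the choice of local population as the demographic factor is an assumption, and the names `theaters` and `ols_slope` are hypothetical.

```python
# Hypothetical figures, invented for illustration only (not company data).
# Each record: (local population in thousands,
#               annual revenue in $ thousands,
#               whether the theater is connected to the data warehouse).
theaters = [
    (50, 500, True),
    (100, 900, True),
    (150, 1300, True),
    (200, 1700, True),
    (20, 400, False),
    (30, 450, False),
    (40, 480, False),
]


def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Full sample versus the restricted sample proposed in concern (a).
all_points = [(x, y) for x, y, _ in theaters]
networked_points = [(x, y) for x, y, connected in theaters if connected]

slope_all = ols_slope(all_points)
slope_networked = ols_slope(networked_points)

print(f"slope, all theaters:       {slope_all:.2f}")
print(f"slope, networked theaters: {slope_networked:.2f}")
```

In this made-up data the two slopes differ, which is the kind of effect worth checking before a group of theaters is dropped. The same comparison could be repeated for the restrictions proposed in (b) and (c), for example by adding a hypothetical first-run flag to each record.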