1. Data analysis is _____ when patterns in the collected data guide the data analysis or suggest revisions to the preliminary data analysis plan.
A) confirmatory
B) descriptive
C) exploratory
D) conclusive
E) forensic
2. The primary benefit of exploratory data analysis is its ability to be _____.
A) conclusive
B) exclusive
C) discriminatory
D) flexible
E) inexpensive
3. Hypothesis testing is conducted using _____ data analysis techniques.
A) exploratory
B) confirmatory
C) qualitative
D) quantitative
E) forensic
4. Data analysis is _____ when the analytical process is guided by classical statistical inference in its use of significance and confidence.
A) confirmatory
B) descriptive
C) exploratory
D) conclusive
E) forensic
5. Visual representations of data values are consistent with which form of data analysis?
A) confirmatory
B) exploratory
C) conclusive
D) descriptive
E) hypothesis-testing
6. All of the following are contained in a frequency table except _____.
A) lowest value
B) percent
C) count
D) cumulative percent
E) quartile indicators
7. When will the valid percent figure in a frequency table differ from the percent figure?
A) when the cumulative percent exceeds 100%
B) when there are missing cases
C) when using replacement for missing values
D) when the frequency exceeds 10
E) valid percent and percent will always be the same
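As a minimal sketch of how the percent and valid percent columns are computed, the Python below builds a small frequency table. The function name `frequency_table`, the use of `None` for a missing case, and the sample responses are assumptions made for illustration; basing cumulative percent on valid cases follows a common statistical-package convention and is also an assumption here.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, valid_percent, cumulative_percent).

    Assumption: None marks a missing case. Percent uses all cases as the
    base; valid percent uses only the non-missing cases.
    """
    total = len(values)
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):  # lowest category code first
        count = counts[value]
        percent = 100.0 * count / total
        valid_percent = 100.0 * count / len(valid)
        cumulative += valid_percent
        rows.append((value, count, percent, valid_percent, cumulative))
    return rows

# Hypothetical responses: four valid answers and one missing case.
responses = [1, 1, 2, 3, None]
for row in frequency_table(responses):
    print(row)
```

With no missing cases the two bases are equal, so the two columns coincide.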
8. Which type of graphical depiction of frequency values uses a circle divided into slices such that the size of each slice reflects the frequency value it represents?
A) histogram
B) pie chart
C) line graph
D) Pareto diagram
E) stem-and-leaf display
9. Considering the various choices for displaying data, which one will include values for which no observations occurred?
A) pie chart
B) bar chart
C) heretical triangle
D) stem-and-leaf display
E) boxplot
10. Which type of chart uses bars to represent data values such that each value occupies an equal amount of area within the enclosed area?
A) bar chart
B) pie chart
C) histogram
D) stem-and-leaf display
E) line chart
11. Which of the following display choices is most similar to the histogram?
A) bar chart
B) pie chart
C) boxplot
D) stem-and-leaf display
E) line graph
12. Histograms can be used to display the _____ of a distribution.
A) skewness
B) kurtosis
C) modal pattern
D) all of the above
E) none of the above
13. The vertical axis on a histogram indicates the _____.
A) number of observations in each interval
B) midpoint for each interval
C) variable of interest
D) cumulative percent
E) valid percent
14. The horizontal axis on a histogram indicates the _____.
A) number of observations in each interval
B) midpoint for each interval
C) variable of interest
D) cumulative percent
E) valid percent
15. A tree-type frequency distribution that specifies each data value without equal interval grouping is called a _____.
A) bar chart
B) histogram
C) stem-and-leaf display
D) pie chart
E) line graph
16. What information is presented in a stem-and-leaf display?
A) percent
B) actual values
C) valid percent
D) cumulative percent
E) value labels
17. A stem-and-leaf display differs from a histogram in that a stem-and-leaf display _____.
A) presents actual data values
B) groups values into intervals
C) represents data using bars or asterisks
D) illustrates each interval with a color for visualization
E) is appropriate for nominal data
18. In a stem-and-leaf display, each piece of information on the stem is called a _____.
A) leaf
B) stem
C) trunk
D) flower
E) root
19. Consider the following array of values found in a stem-and-leaf display: 5 | 4 6 7 8 8 9 9 9 9. Which of the following statements best reflects the meaning of this line?
A) there are nine items in the data set whose first digit is five
B) the numbers in the data set range from 4 to 9
C) the variance in the data set is 5
D) the mean of the data set is 5
E) the values in the display are 95, 85, 75, 65, and 45; 95 appears four times and 85 appears twice
20. Consider the following array of values found in a stem-and-leaf display: 6 | 4 5 6 9. Which of the following is a value found in the data set?
A) 4
B) 5
C) 6
D) 69
E) A, B, and C
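The way a stem-and-leaf row encodes actual data values can be sketched in a few lines of Python. This is an illustration only: the helper names `stem_and_leaf` and `expand_row` are hypothetical, and the sketch assumes non-negative integers whose last digit is the leaf and whose remaining digits are the stem.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by stem (all digits but the last), keeping each leaf.

    Assumption: values are non-negative integers.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return dict(rows)

def expand_row(stem, leaves):
    """Recover the actual data values encoded by one row of the display."""
    return [stem * 10 + leaf for leaf in leaves]

# The two rows quoted in the questions above, written out as data values.
data = [54, 56, 57, 58, 58, 59, 59, 59, 59, 64, 65, 66, 69]
display = stem_and_leaf(data)
for stem in sorted(display):
    print(f"{stem} | {' '.join(str(leaf) for leaf in display[stem])}")
```

Because every leaf is kept, the original values can be rebuilt from the display, which a histogram's grouped intervals do not allow.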
21. Which type of display represents frequency data as a bar chart, ordered from most to least, overlaid with a line graph denoting the cumulative percentage at each variable level?
A) stem-and-leaf display
B) histogram
C) Pareto diagram
D) pie chart
E) line graph
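The ordering and cumulative-percentage line described in the question above can be sketched as a small table computation. The function name `pareto_table` and the defect counts are hypothetical and chosen only for illustration.

```python
def pareto_table(counts):
    """Return (category, count, cumulative_percent) rows, largest count first."""
    total = sum(counts.values())
    rows = []
    running = 0
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows

# Hypothetical frequency counts of defect types.
defects = {"dents": 30, "other": 5, "scratches": 50, "misalignment": 15}
for row in pareto_table(defects):
    print(row)
```

The bars would be drawn from the count column and the overlaid line from the cumulative percent column.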
22. Which of the following best explains the meaning of the 80/20 rule?
A) 80% of quality defects are caused by 20% of problems
B) an 80% improvement in performance can be expected by eliminating 20% of the causes of poor performance
C) 20% of sales are generated by 80% of customers
D) the benefits of performance increases accrue to 20% of people
E) all of the above are interpretations of the 80/20 rule
23. All of the following are sources of data that can be used in a Pareto diagram except _____.
A) multiple-choice-single-response scales
B) multiple-choice-multiple-response scales
C) frequency counts of words
D) dichotomous scales
E) none of the above are sources of data used in Pareto diagrams
24. A _____ reduces the detail of the stem-and-leaf display and provides a different visual image of the distribution's location, spread, shape, and tail length.
A) histogram
B) box plot
C) Pareto diagram
D) frequency table
E) geographic map
25. Measures that are resistant are _____.
A) inappropriate for statistical analysis
B) corrupted with measurement bias
C) based on nominal scales
D) able to resist influence of extreme values
E) sensitive to localized data
26. Which of the following statistics is resistant?
A) mean
B) standard deviation
C) variance
D) median
E) all of the above
27. Which of the following statistics is nonresistant?
A) median
B) mode
C) range
D) quartiles
E) standard deviation
28. All of the following are ingredients of box plots except _____.
A) rectangular plot that encompasses 50% of the data values
B) center line marking the median
C) jagged line marking the mean
D) box hinges
E) whiskers that extend from the right and left hinges to the largest and smallest values
29. Data points that lie more than 1.5 times the interquartile range beyond the box hinges are called _____.
A) extremists
B) outcasts
C) errors
D) outliers
E) skewed data
30. Outliers are those data points that exceed _____ times the interquartile range.
A) +1
B) +1.5
C) +2
D) +2.5
E) +3
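A minimal sketch of flagging points relative to the interquartile range follows. It assumes that quartiles from Python's `statistics.quantiles` with `method="inclusive"` stand in for the box hinges; other quartile conventions give slightly different fences. The function names, the `multiplier` parameter, and the sample data are hypothetical.

```python
from statistics import quantiles

def iqr_fences(values, multiplier=1.5):
    """Return (lower_fence, upper_fence) set multiplier * IQR beyond the quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr

def find_outliers(values, multiplier=1.5):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(values, multiplier)
    return [v for v in values if v < lower or v > upper]

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
outliers = find_outliers(data)
print(iqr_fences(data), outliers)
```

The fences depend only on the quartiles, so a single extreme value does not move them the way it would move a mean or standard deviation.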
31. RFID data refers to data captured using _____.
A) radio frequency identification chips
B) infrared technology
C) global positioning systems
D) mapping
E) optical mark recognition software
32. Which of the following is an appropriate situation for the use of mapping?
A) identification of competing businesses to identify location for a new store
B) identification of the geographic location of target segments
C) plotting the geographic rollout of a new product
D) plot responses to promotions according to geographic location
E) all of the above
33. Which of the research questions/hypotheses below is best answered using cross-tabulations?
A) What percentage of men and women prefer brand A over brand B?
B) What percentage of residents shop at the local grocery store?
C) Is brand loyalty related to brand image?
D) What happens to sales when prices drop?
E) Where do most of our consumers live?
34. A statistical technique that describes two or more variables simultaneously and results in tables that reflect the joint distribution of two or more variables that have a limited number of categories or distinct values is a(n) _____.
A) t test
B) ANOVA
C) factor analysis
D) cross-tabulation
E) regression
35. Which of the following best expresses the value of using percentages in data presentation?
A) allows for relative comparisons
B) allows for mathematical manipulation of the values
C) focuses on the count of cases
D) provides for the calculation of marginals
E) all of the above
36. The introduction of a third variable in cross-tabulation can result in which of the following possibilities?
A) refined association between the two original variables
B) no association between the two original variables
C) no change in the initial pattern
D) all of the above
E) none of the above
37. When are cross-tabulations called contingency tables?
A) anytime
B) when used for statistical testing
C) when used for display purposes
D) when presented with percentages rather than raw counts
E) when used to present data
38. The _____ is used to test the statistical significance of the observed association in cross-tabulation.
A) contingency coefficient
B) Cramer's V
C) phi coefficient
D) chi-square coefficient
E) Z score
39. What oversight has occurred in the following example? The price is reduced by 400% because it was $1 and is now just 25 cents.
A) use of too large percentages
B) percentage decreases cannot exceed 100%
C) use of too small a base
D) averaging percentages
E) percentages cannot exceed 100%
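The arithmetic behind the price example can be checked with a short sketch. The function name `percent_change` is hypothetical; the second calculation shows one plausible origin of the 400% figure (the old price expressed as a percentage of the new one), which is an assumption about how the error arose.

```python
def percent_change(old, new):
    """Percentage change from old to new, using the original value as the base."""
    if old == 0:
        raise ValueError("percentage change is undefined for a zero base")
    return 100.0 * (new - old) / old

# Price falls from $1.00 to $0.25.
change = percent_change(1.00, 0.25)
print(change)

# Old price as a percentage of the new price (the smaller, incorrect base).
old_as_percent_of_new = 100.0 * 1.00 / 0.25
print(old_as_percent_of_new)
```

For non-negative values the new amount cannot fall below zero, so the change computed on the original base cannot go below -100%.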
40. The purpose of a control variable is to _____.
A) help interpret the relationship between variables
B) control for outliers
C) provide a comparison for the results
D) establish precision
E) all of the above
41. Cross-tabulations with more than two variables are called _____ tables.
A) multivariate
B) n-way
C) kth
D) CHAID
E) nested
42. The data partitioning procedure that can search up to 300 variables for the single best predictor of a dependent variable is called _____.
A) cross-tabulation
B) regression
C) automatic interaction detection
D) data mining
E) factor analysis
43. Suppose you cross-tabulate consumption rates with income and find no apparent relationship. When age is introduced, you find that there is a relationship between consumption and income within age groups. This is an example of a(n) _____ variable at work.
A) antecedent
B) component
C) extraneous
D) intervening
E) suppressor
44. Z scores _____.
A) convey distance in units of a ratio of the mean
B) convey distance in covariation units
C) are useful to make comparisons among variables from different scales
D) all of the above
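As a minimal sketch of how Z scores are computed, the Python below rescales a set of values to standard deviation units. The function name `z_scores` and the sample values are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample version is an assumption.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return each value's distance from the mean in standard deviation units."""
    center = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    if spread == 0:
        raise ValueError("Z scores are undefined when all values are equal")
    return [(v - center) / spread for v in values]

# Hypothetical scores with mean 5 and population standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(scores))
```

Applying the same function to a variable measured on a different scale yields unit-free values that can be compared directly.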
45. Boxplots are said to incorporate resistant statistics because they _____.
A) are constructed with a five-number summary
B) do not contain weak measures of central tendency like the median
C) are based on a new type of dispersion statistics
D) use whiskers to represent quartiles
46. Marginals are the totals of row and/or column variables in a _____.
A) pie chart
B) bar chart
C) cross-classification (cross-tabulation) table
D) frequency table
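The idea of marginals can be illustrated with a small sketch that tallies cell counts together with row and column totals. The function name `crosstab` and the respondent data are hypothetical.

```python
from collections import Counter

def crosstab(pairs):
    """Return (cell_counts, row_totals, column_totals) for (row, column) pairs."""
    pairs = list(pairs)  # allow any iterable to be scanned more than once
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    column_totals = Counter(column for _, column in pairs)
    return cells, row_totals, column_totals

# Hypothetical respondents: (gender, preferred brand).
respondents = [
    ("men", "A"), ("men", "A"), ("men", "B"),
    ("women", "A"), ("women", "B"), ("women", "B"),
]
cells, row_totals, column_totals = crosstab(respondents)
print(cells)
print(row_totals)
print(column_totals)
```

Dividing each cell count by its row total or column total gives the row or column percentages used for relative comparisons.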
47. Reexpression of data on a new scale is called _____.
A) resistant statistics
B) nonresistant statistics
C) transformation
D) cross-tabulation
E) none of the above
48. Which of the following is a bar chart arranged in decreasing order by size?
A) control chart
B) simple bar chart
C) Pareto diagram
D) histogram
49. A _____ arrays category codes from lowest value to highest value with columns for count, percent, valid percent, and cumulative percent.
A) boxplot
B) histogram
C) stem-and-leaf diagram
D) Pareto diagram
E) frequency table
50. Which of the following is most appropriately displayed with a frequency table?
A) what percentage of people prefer Hunt's brand ketchup
B) what is the relationship between gender and brand preference
C) how much explanatory value comes from the study's variables
D) where do the most valuable customers live
E) is advertising more effective in newspapers or magazines
Essay Questions
51. What advantages do stem-and-leaf displays provide over histograms?
52. Explain how to develop a stem-and-leaf display.
53. What can a researcher determine through the use of cross-tabulations?
54. Explain why percentage decreases can never exceed 100%.