Agency revenues. An economic consultant was retained by a large employment agency in a metropolitan area to develop a
regression model for predicting monthly agency revenues (y). She decided that three economic indicators for the area were
potentially useful as independent variables, namely, average weekly overtime hours of production workers in manufacturing
(x1), number of job vacancies in manufacturing (x2), and index of help-wanted advertising in newspapers (x3). Monthly
observations on agency revenues and the three independent variables were obtained for the past 25 months. The ANOVA table
for the model y = b0 + b1 x1 + b2 x2 + b3 x3 + e is as follows:
Source       d.f.   SS        MS
Regression   3      5409.89   1803.30
Error        21     16.35     0.78
Total        24     5426.24
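As a quick consistency check on the ANOVA table, the mean squares and the overall F statistic can be recomputed from the sums of squares and degrees of freedom. This is only a sketch: the variable names are illustrative, and the numbers are copied from the table above.

```python
# Values copied from the ANOVA table above.
ss_regression, df_regression = 5409.89, 3
ss_error, df_error = 16.35, 21
ss_total, df_total = 5426.24, 24

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

# Overall F statistic for H0: b1 = b2 = b3 = 0, and the coefficient of determination.
f_overall = ms_regression / ms_error
r_squared = ss_regression / ss_total

# The decomposition SS(Total) = SS(Regression) + SS(Error) should hold up to rounding.
decomposition_gap = abs(ss_regression + ss_error - ss_total)

print(f"MSR = {ms_regression:.2f}, MSE = {ms_error:.2f}")
print(f"F = {f_overall:.2f}, R^2 = {r_squared:.4f}")
```

The rounded mean squares should agree with the MS column of the table; F is compared against the F distribution with 3 and 21 degrees of freedom.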
The consultant decided to screen the independent variables to determine the best set for predicting agency revenues.
The regression sums of squares for all possible regression models were found to be as follows:
Independent variables in the model SSR
X1 2970.64
X2 3654.85
X3 3584.54
X1, X2 5123.80
X1, X3 5409.59
X2, X3 3741.30
X1, X2, X3 5409.89
(a) Determine the subset of variables that is selected as best by the forward selection procedure using F0* = 4.2
(to-add-variable). Show your steps.
(b) Determine the subset of variables that is selected as best by the backward elimination procedure using F0** = 4.1
(to-delete-variable). Show your steps.
NOTE: (t0**)^2 = F0**
(c) Determine the subset of variables that is selected as best by the stepwise regression procedure using F0* = 4.2 (to-add)
and F0** = 4.1 (to-delete). Show your steps.
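The three procedures in (a) to (c) all rest on the same partial F statistic, which can be computed directly from the SSR table. The sketch below is one possible way to organize the arithmetic, not a prescribed solution. It assumes the common textbook convention that the partial F for a variable uses the MSE of the model that includes that variable, and that SS(Total) = 5426.24 with n = 25 as in the ANOVA table. All function names are illustrative.

```python
# SSR for every subset, copied from the table above (keys are sets of variable indices).
SSR = {
    frozenset(): 0.0,
    frozenset({1}): 2970.64,
    frozenset({2}): 3654.85,
    frozenset({3}): 3584.54,
    frozenset({1, 2}): 5123.80,
    frozenset({1, 3}): 5409.59,
    frozenset({2, 3}): 3741.30,
    frozenset({1, 2, 3}): 5409.89,
}
SS_TOTAL = 5426.24   # from the ANOVA table
N = 25               # months of data
ALL_VARS = frozenset({1, 2, 3})


def partial_f(model, var):
    """Partial F for `var` given the other variables in `model` (var must be in model).

    Assumption: the denominator is the MSE of `model`, i.e. the larger model.
    """
    model = frozenset(model)
    extra_ss = SSR[model] - SSR[model - {var}]
    mse = (SS_TOTAL - SSR[model]) / (N - len(model) - 1)
    return extra_ss / mse


def forward_selection(f_to_add):
    model = frozenset()
    while ALL_VARS - model:
        # Candidate with the largest partial F when added to the current model.
        best = max(ALL_VARS - model, key=lambda v: partial_f(model | {v}, v))
        if partial_f(model | {best}, best) < f_to_add:
            break
        model = model | {best}
    return model


def backward_elimination(f_to_delete):
    model = ALL_VARS
    while model:
        # Variable with the smallest partial F in the current model.
        worst = min(model, key=lambda v: partial_f(model, v))
        if partial_f(model, worst) >= f_to_delete:
            break
        model = model - {worst}
    return model


def stepwise(f_to_add, f_to_delete):
    model = frozenset()
    while True:
        added = removed = False
        if ALL_VARS - model:
            best = max(ALL_VARS - model, key=lambda v: partial_f(model | {v}, v))
            if partial_f(model | {best}, best) >= f_to_add:
                model, added = model | {best}, True
        if model:
            worst = min(model, key=lambda v: partial_f(model, v))
            if partial_f(model, worst) < f_to_delete:
                model, removed = model - {worst}, True
        if not added and not removed:
            return model


print("forward: ", sorted(forward_selection(4.2)))
print("backward:", sorted(backward_elimination(4.1)))
print("stepwise:", sorted(stepwise(4.2, 4.1)))
```

Printing `partial_f(model, v)` at each step reproduces the intermediate F values that a hand solution would show. The stepwise loop relies on the to-add threshold being larger than the to-delete threshold, which prevents a variable from cycling in and out.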
2) In a survey to determine child care costs for working parents, the local Chamber of Commerce randomly samples licensed
child care centers in four regions of the metropolitan area. The purpose of the survey is to see whether and how average child
care expense varies according to region. The following figures are the weekly costs of
child care for a 2-year-old:
Suburban East   Downtown Area   Suburban West   Suburban South
90.00           94.00           82.00           86.50
87.50           97.50           84.50           88.00
89.50           94.00           88.00           89.50
90.00           92.50           85.50           85.00
Establish whether average costs are equal across the four regions; if not, do a follow-up analysis to determine which are the
same and which differ. (Use α = 0.10.) List all necessary assumptions and indicate which might be suspect. Also perform a
non-parametric analysis.
Verify your results using SAS.
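Before turning to SAS, the two test statistics for this problem can be cross-checked by hand. The sketch below uses only the Python standard library to compute the one-way ANOVA F statistic and the Kruskal-Wallis H statistic (with the usual tie correction) for the data above. The function names are illustrative, and critical values or p-values still have to come from tables or a statistics package. The multiple-comparison follow-up is not covered here.

```python
from collections import Counter

# Weekly child care costs by region, copied from the table above.
costs = {
    "Suburban East": [90.00, 87.50, 89.50, 90.00],
    "Downtown Area": [94.00, 97.50, 94.00, 92.50],
    "Suburban West": [82.00, 84.50, 88.00, 85.50],
    "Suburban South": [86.50, 88.00, 89.50, 85.00],
}
groups = list(costs.values())


def one_way_anova_f(groups):
    """F = MS(between) / MS(within) for a one-way layout."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def average_ranks(values):
    """Map each distinct value to its rank; tied values share the mean rank."""
    ordered = sorted(values)
    ranks, i = {}, 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, divided by the standard tie-correction factor."""
    values = [v for g in groups for v in g]
    n = len(values)
    rank_of = average_ranks(values)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(values).values())
    return h / (1 - ties / (n ** 3 - n))


f_stat = one_way_anova_f(groups)
h_stat = kruskal_wallis_h(groups)
print(f"ANOVA F (3, 12 d.f.) = {f_stat:.3f}")
print(f"Kruskal-Wallis H (3 d.f.) = {h_stat:.3f}")
```

F is referred to the F distribution with 3 and 12 degrees of freedom, and H to the chi-square distribution with 3 degrees of freedom, although with only four observations per region the chi-square approximation is rough.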
3) A chain of convenience stores tested a display for a new snack product by placing the display in four different locations in
various stores; at the entrance, in the snack section, by the cash register, and with the soft drinks. Each display location was
utilized in three stores over a one-week test period. Because the 12 stores used to test the display differ somewhat in
overall sales volume, they were divided into three categories. Within each category, the assignment of stores to display
method was random. Units sold are shown in the following table:
                     Unit Sales, by Display Location
Store Sales Volume   Entrance   Snacks   Register   Drinks
Below average        46         38       57         54
Average              62         50       67         67
Above average        75         62       89         77
Is it possible to be 99% certain that the display locations' mean sales are not equal? Conduct the appropriate follow-up
analysis (use α = 0.01) to establish which means are significantly different. List all necessary assumptions and indicate which
might be suspect. Also perform a non-parametric analysis. Verify your results using SAS.
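Because stores were grouped by sales volume before display locations were assigned, one natural reading of this design is a randomized block layout with sales volume as the block and display location as the treatment. Under that assumption, the sketch below computes the treatment F statistic and the Friedman statistic (with tie correction) using only the Python standard library. The function names are illustrative, and the follow-up comparisons are left out.

```python
# Units sold: one row per sales-volume block, columns are
# Entrance, Snacks, Register, Drinks (copied from the table above).
sales = [
    [46, 38, 57, 54],   # Below average
    [62, 50, 67, 67],   # Average
    [75, 62, 89, 77],   # Above average
]


def block_design_f(rows):
    """Treatment F statistic for a randomized block design without replication."""
    b, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (b * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / b for j in range(k)]
    ss_treat = b * sum((m - grand) ** 2 for m in col_means)
    ss_error = sum(
        (rows[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(b) for j in range(k)
    )
    return (ss_treat / (k - 1)) / (ss_error / ((b - 1) * (k - 1)))


def within_block_ranks(row):
    """Rank the values in one block; tied values share the mean rank."""
    ordered = sorted(row)
    # Mean rank of v = (count below v) + (count equal to v + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in row]


def friedman_statistic(rows):
    """Friedman chi-square statistic, divided by the standard tie-correction factor."""
    b, k = len(rows), len(rows[0])
    ranked = [within_block_ranks(r) for r in rows]
    rank_sums = [sum(r[j] for r in ranked) for j in range(k)]
    stat = 12 / (b * k * (k + 1)) * sum(s ** 2 for s in rank_sums) - 3 * b * (k + 1)
    # Each tie group of size t within a block contributes t^3 - t.
    ties = sum(
        row.count(v) ** 3 - row.count(v)
        for row in rows for v in set(row)
    )
    return stat / (1 - ties / (b * k * (k ** 2 - 1)))


f_treat = block_design_f(sales)
chi_sq = friedman_statistic(sales)
print(f"Treatment F (3, 6 d.f.) = {f_treat:.3f}")
print(f"Friedman chi-square (3 d.f.) = {chi_sq:.3f}")
```

The F statistic is referred to the F distribution with 3 and 6 degrees of freedom. With only three blocks the chi-square approximation for the Friedman statistic is crude, so exact tables are preferable where available.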