Case - Bookbinders Book Club
1. Summarize the results of your analysis for all three models. Develop your models using the case data files and then assess them on the holdout data sample.
Linear regression model for the 1,600-customer estimation sample
Figure 1: Coefficients (a)
| Model 1 | B (unstandardized) | Std. Error (unstandardized) | Beta (standardized) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | .364 | .031 | | 11.848 | .000 |
| Gender | -.131 | .020 | -.143 | -6.536 | .000 |
| Amount purchased | .000 | .000 | .060 | 2.464 | .014 |
| Frequency | -.009 | .002 | -.165 | -4.170 | .000 |
| Last purchase | .097 | .014 | .678 | 7.156 | .000 |
| First purchase | -.002 | .002 | -.075 | -1.103 | .270 |
| P_Child | -.126 | .016 | -.309 | -7.698 | .000 |
| P_Youth | -.096 | .020 | -.140 | -4.792 | .000 |
| P_Cook | -.141 | .017 | -.340 | -8.520 | .000 |
| P_DIY | -.135 | .020 | -.212 | -6.834 | .000 |
| P_Art | .118 | .019 | .200 | 6.061 | .000 |
a. Dependent Variable: Choice (0/1)
Linear regression model for the holdout sample
Figure 2: Coefficients (a)
| Model 1 | B (unstandardized) | Std. Error (unstandardized) | Beta (standardized) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | .183 | .018 | | 9.933 | .000 |
| Gender | -.070 | .012 | -.114 | -5.825 | .000 |
| Amount purchased | 8.682E-5 | .000 | .029 | 1.346 | .179 |
| Frequency | -.007 | .001 | -.188 | -5.037 | .000 |
| Last purchase | .038 | .009 | .382 | 4.025 | .000 |
| First purchase | .001 | .001 | .048 | .793 | .428 |
| P_Child | -.059 | .011 | -.211 | -5.624 | .000 |
| P_Youth | -.043 | .013 | -.096 | -3.457 | .001 |
| P_Cook | -.061 | .010 | -.229 | -5.918 | .000 |
| P_DIY | -.077 | .012 | -.192 | -6.554 | .000 |
| P_Art | .079 | .013 | .171 | 6.087 | .000 |
Binary logit model for the 1,600-customer estimation sample
Figure 3: Model Summary
| Step | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 1392.159 (a) | .225 | .333 |
a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.
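The two pseudo R-square values in Figure 3 can be cross-checked from the reported -2 log likelihood. The sketch below is only an illustration, not part of the SPSS output: it assumes the null (intercept-only) model predicts the sample purchase rate for everyone, and it takes the class counts (400 buyers out of 1,600 customers) from the classification table in Figure 6. The function names are hypothetical.

```python
import math

def null_deviance(n_buyers, n_total):
    """-2 log likelihood of an intercept-only model (assumed null model)."""
    p = n_buyers / n_total
    n_non_buyers = n_total - n_buyers
    return -2.0 * (n_buyers * math.log(p) + n_non_buyers * math.log(1.0 - p))

def pseudo_r_squares(model_deviance, n_buyers, n_total):
    """Return (Cox & Snell, Nagelkerke) pseudo R-square values."""
    d0 = null_deviance(n_buyers, n_total)
    cox_snell = 1.0 - math.exp(-(d0 - model_deviance) / n_total)
    # Nagelkerke rescales Cox & Snell by its maximum attainable value.
    max_cox_snell = 1.0 - math.exp(-d0 / n_total)
    return cox_snell, cox_snell / max_cox_snell

# Estimation sample: -2LL from Figure 3, class counts from Figure 6.
cs, nk = pseudo_r_squares(1392.159, n_buyers=400, n_total=1600)
print(f"Cox & Snell: {cs:.3f}, Nagelkerke: {nk:.3f}")
```

Under these assumptions the result should agree with the .225 and .333 reported in Figure 3. The same function can be applied to the holdout output by passing its -2 log likelihood and class counts.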
Figure 4: Hosmer and Lemeshow Test
| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 3.061 | 8 | .930 |
Figure 5: Variables in the Equation
| Step 1 (a) | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| Gender | -.863 | .137 | 39.443 | 1 | .000 | .422 |
| Amountpurchased | .002 | .001 | 5.542 | 1 | .019 | 1.002 |
| Frequency | -.076 | .017 | 20.709 | 1 | .000 | .927 |
| Lastpurchase | .612 | .094 | 42.526 | 1 | .000 | 1.844 |
| Firstpurchase | -.015 | .013 | 1.333 | 1 | .248 | .985 |
| P_Child | -.811 | .117 | 48.319 | 1 | .000 | .444 |
| P_Youth | -.637 | .143 | 19.741 | 1 | .000 | .529 |
| P_Cook | -.923 | .119 | 59.677 | 1 | .000 | .397 |
| P_DIY | -.906 | .144 | 39.738 | 1 | .000 | .404 |
| P_Art | .686 | .127 | 29.178 | 1 | .000 | 1.986 |
| Constant | -.352 | .214 | 2.689 | 1 | .101 | .704 |
a. Variable(s) entered on step 1: Gender, Amountpurchased, Frequency, Lastpurchase, Firstpurchase, P_Child, P_Youth, P_Cook, P_DIY, P_Art
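To show how the coefficients in Figure 5 translate into a purchase probability, the sketch below plugs the reported B values into the logistic function and recomputes Exp(B) as the odds ratio. It is only an illustration: the customer profile is made up, the coding of Gender (assumed 1 = male) is not stated in the output, and the function names are hypothetical.

```python
import math

# B values copied from Figure 5 (estimation sample, rounded to 3 decimals).
COEFFICIENTS = {
    "Gender": -0.863, "Amountpurchased": 0.002, "Frequency": -0.076,
    "Lastpurchase": 0.612, "Firstpurchase": -0.015, "P_Child": -0.811,
    "P_Youth": -0.637, "P_Cook": -0.923, "P_DIY": -0.906, "P_Art": 0.686,
}
CONSTANT = -0.352

def purchase_probability(customer):
    """Logistic transform of the linear predictor; missing fields count as 0."""
    z = CONSTANT + sum(b * customer.get(name, 0.0)
                       for name, b in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(name):
    """Exp(B): multiplicative change in the odds per one-unit increase."""
    return math.exp(COEFFICIENTS[name])

# Hypothetical customer, for illustration only (not taken from the case data).
example = {"Gender": 1, "Amountpurchased": 200, "Frequency": 10,
           "Lastpurchase": 2, "Firstpurchase": 20, "P_Art": 1}
print(f"P(buy) = {purchase_probability(example):.3f}")
print(f"Odds ratio for P_Art = {odds_ratio('P_Art'):.3f}")
```

Because the published coefficients are rounded, recomputed odds ratios can differ from the Exp(B) column in the last digit.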
Figure 6: Classification Table (a)
| Observed | Predicted Choice (0/1) = 0 | Predicted Choice (0/1) = 1 | Percentage Correct |
|---|---|---|---|
| Step 1, Choice (0/1) = 0 | 1120 | 80 | 93.3 |
| Step 1, Choice (0/1) = 1 | 240 | 160 | 40.0 |
| Overall percentage | | | 80.0 |
a. The cut value is .500
Binary logit model for the holdout sample
Figure 7: Model Summary
| Step | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 1111.384 (a) | .109 | .243 |
a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.
Figure 8: Hosmer and Lemeshow Test
| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 7.851 | 8 | .448 |
Figure 9: Variables in the Equation
| Step 1 (a) | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| Gender | -.942 | .164 | 33.164 | 1 | .000 | .390 |
| Amountpurchased | .001 | .001 | 2.205 | 1 | .138 | 1.001 |
| Frequency | -.112 | .020 | 32.302 | 1 | .000 | .894 |
| Lastpurchase | .412 | .115 | 12.759 | 1 | .000 | 1.510 |
| Firstpurchase | .007 | .014 | .216 | 1 | .642 | 1.007 |
| P_Child | -.679 | .140 | 23.519 | 1 | .000 | .507 |
| P_Youth | -.479 | .173 | 7.639 | 1 | .006 | .619 |
| P_Cook | -.724 | .142 | 25.833 | 1 | .000 | .485 |
| P_DIY | -.953 | .167 | 32.401 | 1 | .000 | .386 |
| P_Art | .804 | .150 | 28.762 | 1 | .000 | 2.234 |
| Constant | -1.150 | .248 | 21.443 | 1 | .000 | .317 |
a. Variable(s) entered on step 1: Gender, Amountpurchased, Frequency, Lastpurchase, Firstpurchase, P_Child, P_Youth, P_Cook, P_DIY, P_Art
Figure 10: Classification Table (a)
| Observed | Predicted Choice (0/1) = 0 | Predicted Choice (0/1) = 1 | Percentage Correct |
|---|---|---|---|
| Step 1, Choice (0/1) = 0 | 2081 | 15 | 99.3 |
| Step 1, Choice (0/1) = 1 | 177 | 27 | 13.2 |
| Overall percentage | | | 91.7 |
a. The cut value is .500
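The percentages in the two classification tables (Figures 6 and 10) follow directly from the cell counts. The sketch below recomputes them and adds the share of customers who actually bought, which is useful context when reading the overall accuracy. The function and key names are hypothetical.

```python
def classification_summary(true_neg, false_pos, false_neg, true_pos):
    """Percentages derived from a 2x2 classification table (cutoff .500)."""
    total = true_neg + false_pos + false_neg + true_pos
    return {
        # Share of non-buyers correctly predicted as non-buyers.
        "pct_correct_non_buyers": 100.0 * true_neg / (true_neg + false_pos),
        # Share of buyers correctly predicted as buyers.
        "pct_correct_buyers": 100.0 * true_pos / (false_neg + true_pos),
        "pct_correct_overall": 100.0 * (true_neg + true_pos) / total,
        # Accuracy of the naive rule "predict that nobody buys".
        "pct_non_buyers_in_sample": 100.0 * (true_neg + false_pos) / total,
    }

estimation = classification_summary(1120, 80, 240, 160)   # Figure 6
holdout = classification_summary(2081, 15, 177, 27)       # Figure 10

for label, summary in (("estimation", estimation), ("holdout", holdout)):
    print(label, {k: round(v, 1) for k, v in summary.items()})
```

In the holdout sample about 91.1 percent of customers did not buy, so the 91.7 percent overall accuracy at a .500 cutoff is only slightly above what predicting "no purchase" for everyone would achieve. This suggests that a lower cutoff, chosen on economic grounds, may be more informative for targeting than the default.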
2. Interpret the results of these models. In particular, highlight which factors most influenced the customers' decision to buy or not to buy the book.
3. Bookbinders is considering a similar mail campaign in the Midwest where it has data for 50,000 customers. Such mailings typically promote several books. The allocated cost of the mailing is $0.65/addressee (including postage) for the art book, and the book costs Bookbinders $15 to purchase and mail. The company allocates overhead to each book at 45 percent of cost. The selling price of the book is $31.95. Based on the model, which customers should Bookbinders target? How much more profit would you expect the company to generate using these models as compared to sending the mail offer to the entire list?
4. Based on the insights you gained from this modeling exercise, summarize the advantages and limitations of each of the modeling approaches. Look at both similar and dissimilar results.
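The economics in question 3 can be framed as a break-even response rate. The sketch below is one possible setup, not a model answer. It assumes the 45 percent overhead applies to the $15 book cost, that a customer is worth mailing only when the predicted purchase probability exceeds the break-even rate, and (for the mass-mailing baseline) that the Midwest list responds at the holdout sample's rate of 204 buyers out of 2,300. All function names are hypothetical.

```python
PRICE = 31.95          # selling price of the book
BOOK_COST = 15.00      # cost to purchase and mail the book
OVERHEAD_RATE = 0.45   # overhead, assumed to apply to BOOK_COST
MAIL_COST = 0.65       # allocated mailing cost per addressee

def margin_per_sale():
    """Contribution per book sold, before the mailing cost."""
    return PRICE - BOOK_COST * (1.0 + OVERHEAD_RATE)

def breakeven_response_rate():
    """Purchase probability at which mailing one customer breaks even."""
    return MAIL_COST / margin_per_sale()

def mass_mailing_profit(n_customers, response_rate):
    """Expected profit from mailing everyone on the list."""
    return n_customers * (response_rate * margin_per_sale() - MAIL_COST)

def targeted_mailing_profit(probabilities):
    """Expected profit from mailing only customers above break-even."""
    cutoff = breakeven_response_rate()
    return sum(p * margin_per_sale() - MAIL_COST
               for p in probabilities if p > cutoff)

print(f"Margin per sale: ${margin_per_sale():.2f}")
print(f"Break-even response rate: {breakeven_response_rate():.4f}")
# Assumption: the Midwest list responds like the holdout sample (204 / 2300).
print(f"Mass mailing profit: ${mass_mailing_profit(50_000, 204 / 2300):,.2f}")
```

To compare the models, `targeted_mailing_profit` would be fed the predicted probabilities for the holdout customers from each model, with the result scaled from 2,300 customers up to 50,000.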
Attachment: Case Assignment.rar