Tuscan Lifestyles Customer Strategy

Assignment #1 Tuscan Lifestyles

Report

Siyi Liu sl888

Part 1

1. Logistic regression model

Block 1: Method = Enter

Omnibus Tests of Model Coefficients

                  Chi-square   df   Sig.
Step 1   Step     6233.253     10   .000
         Block    6233.253     10   .000
         Model    6233.253     10   .000

Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      24122.211 (a)       .117                   .258

a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.
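As a cross-check, the two R Square values can be recomputed from the model chi-square and the -2 log likelihood. This is a minimal sketch using the standard Cox & Snell and Nagelkerke definitions; it assumes n = 50,000 cases (the total of the four cells in the classification table), and the variable names are illustrative only.

```python
import math

n = 50_000                     # assumed case count: 45126 + 352 + 3838 + 684
model_chi_square = 6233.253    # Omnibus Tests, Model row
neg2ll_model = 24122.211       # Model Summary, -2 Log likelihood

# -2LL of the intercept-only model = fitted -2LL + model chi-square
neg2ll_null = neg2ll_model + model_chi_square

# Cox & Snell: 1 - (L0 / L1)^(2/n), written in terms of the chi-square
cox_snell = 1 - math.exp(-model_chi_square / n)

# Nagelkerke rescales Cox & Snell by its maximum attainable value
max_cox_snell = 1 - math.exp(-neg2ll_null / n)
nagelkerke = cox_snell / max_cox_snell

print(f"Cox & Snell R Square: {cox_snell:.3f}")
print(f"Nagelkerke R Square:  {nagelkerke:.3f}")
```

Both values should agree with the Model Summary table to three decimals, up to rounding in the reported inputs.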

Classification Table (a)

                                                     Predicted
                                                     Bought "Art History of Florence?"
Observed                                             No       Yes     Percentage Correct
Step 1   Bought "Art History of Florence?"   No      45126    352     99.2
                                             Yes     3838     684     15.1
         Overall Percentage                                           91.6

a. The cut value is .500
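The three percentages in the classification table follow directly from its four cell counts. The sketch below recomputes them; the variable names are illustrative and not part of the SPSS output.

```python
# Cell counts from the classification table (cut value .500)
true_no = 45126    # observed No,  predicted No
false_yes = 352    # observed No,  predicted Yes
false_no = 3838    # observed Yes, predicted No
true_yes = 684     # observed Yes, predicted Yes

total = true_no + false_yes + false_no + true_yes

# Share of non-buyers classified correctly
pct_no_correct = 100 * true_no / (true_no + false_yes)
# Share of buyers classified correctly
pct_yes_correct = 100 * true_yes / (true_yes + false_no)
# Share of all customers classified correctly
pct_overall = 100 * (true_no + true_yes) / total

print(f"Non-buyers correct: {pct_no_correct:.1f}%")
print(f"Buyers correct:     {pct_yes_correct:.1f}%")
print(f"Overall correct:    {pct_overall:.1f}%")
```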

Variables in the Equation

Step 1 (a)   B        S.E.   Wald       df   Sig.   Exp(B)
last         -.095    .003   1150.401   1    .000   .910
total$       .001     .000   31.701     1    .000   1.001
gender(1)    -.761    .036   452.515    1    .000   .467
child        -.186    .017   116.097    1    .000   .830
youth        -.113    .026   18.724     1    .000   .893
cook         -.270    .017   249.075    1    .000   .763
do_it        -.539    .027   399.777    1    .000   .583
refernce     .235     .027   78.087     1    .000   1.265
art          1.156    .022   2723.273   1    .000   3.176
geog         .574     .019   950.087    1    .000   1.776
Constant     -1.600   .052   943.311    1    .000   .202

a. Variable(s) entered on step 1: last, total$, gender, child, youth, cook, do_it, refernce, art, geog.

2. Summary and interpretation
1. Significance of the model: in the Omnibus Tests table, the model chi-square is 6233.253 with 10 degrees of freedom and Sig. = .000, well below the 0.05 cutoff. The model as a whole is therefore statistically significant.
2. Model Summary: the Nagelkerke R Square is 0.258 (Cox & Snell R Square 0.117). These are pseudo R Square measures and are not directly comparable to the R Square of a linear regression, but 0.258 is a reasonably good fit for a logistic regression.
3. Percent Correct Predictions: the Classification table compares actual and predicted purchases at a cut value of 0.5. Overall, the model correctly classifies 91.6% of customers. It correctly classifies 99.2% of non-buyers (45,126 of 45,478) but only 15.1% of buyers (684 of 4,522). Because about 91% of customers are non-buyers, the high overall percentage mainly reflects the correct classification of non-buyers.
4. Variables: in the Variables in the Equation table, all predictors are significant because every Sig. value is below the 0.05 significance level. The Exp(B) column gives the odds ratio, which measures how the odds of purchase change for a one-unit increase in a predictor. Positive coefficients have odds ratios greater than 1 and negative coefficients have odds ratios less than 1. The variables total$, refernce, art and geog have positive coefficients, so higher values increase the odds of buying; art has the largest effect (odds ratio 3.176), followed by geog (1.776) and refernce (1.265). The odds ratio of total$ is 1.001, very close to 1, because its coefficient per dollar spent is near 0. The remaining variables (last, gender(1), child, youth, cook and do_it) have negative coefficients and odds ratios less than 1, so higher values decrease the odds of buying.
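The link between the B and Exp(B) columns, and the way the coefficients combine into a predicted purchase probability, can be sketched as follows. The coefficients are copied from the Variables in the Equation table; the example customer is hypothetical, and the function name is illustrative only. The value of gender(1) is the indicator as coded by SPSS, whose coding is not stated in this report.

```python
import math

# B coefficients from the Variables in the Equation table
coef = {
    "last": -0.095, "total$": 0.001, "gender(1)": -0.761,
    "child": -0.186, "youth": -0.113, "cook": -0.270,
    "do_it": -0.539, "refernce": 0.235, "art": 1.156, "geog": 0.574,
}
constant = -1.600

# Exp(B) as reported by SPSS, for comparison
reported_exp_b = {
    "last": 0.910, "total$": 1.001, "gender(1)": 0.467,
    "child": 0.830, "youth": 0.893, "cook": 0.763,
    "do_it": 0.583, "refernce": 1.265, "art": 3.176, "geog": 1.776,
}

# The odds ratio is e raised to the coefficient
odds_ratio = {name: math.exp(b) for name, b in coef.items()}

def purchase_probability(customer):
    """Logistic model: p = 1 / (1 + exp(-(constant + sum of B * x)))."""
    z = constant + sum(coef[name] * value for name, value in customer.items())
    return 1 / (1 + math.exp(-z))

# Hypothetical customer, invented for illustration (not from the data set)
customer = {
    "last": 6, "total$": 250, "gender(1)": 0, "child": 0, "youth": 0,
    "cook": 1, "do_it": 0, "refernce": 1, "art": 2, "geog": 1,
}

for name in coef:
    print(f"{name:10s} exp(B) = {odds_ratio[name]:.3f}")
print(f"Predicted probability: {purchase_probability(customer):.3f}")
```

Small differences in the third decimal between the recomputed and reported odds ratios are expected, because the B values in the table are themselves rounded.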

Part 2 Decile Analysis of Logistic Regression Results

1,2

[pic 1]

3.

Case Summaries
Bought "Art History of Florence?"

Percentile Group of PRE_1   N       Mean   Sum
1                           5000    .39    1935
2                           5000    .17    836
3                           5000    .10    511
4                           5000    .07    368
5                           5000    .06    284
6                           5000    .04    196
7                           5001    .03    139
8                           4999    .02    121
9                           5000    .02    90
10                          5000    .01    42
Total                       50000   .09    4522

The table above shows 50,000 customers in total, of whom 4,522 are buyers, for an overall response rate of about 9%. The Mean column gives the response rate of each decile, which falls from 39% in decile 1 to 1% in decile 10.
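The response rates can be recomputed from the N and Sum columns, and the same figures give a lift index per decile and the cumulative share of buyers captured. This is a minimal sketch; the lift and cumulative calculations are a common way to read a decile table and are added here for illustration, not taken from the SPSS output.

```python
# N and Sum (number of buyers) per decile, from the Case Summaries table
customers = [5000, 5000, 5000, 5000, 5000, 5000, 5001, 4999, 5000, 5000]
buyers = [1935, 836, 511, 368, 284, 196, 139, 121, 90, 42]

total_customers = sum(customers)
total_buyers = sum(buyers)
overall_rate = total_buyers / total_customers

# Response rate per decile (the Mean column)
response_rate = [b / n for b, n in zip(buyers, customers)]

# Lift index: decile response rate relative to the overall rate, times 100
lift = [100 * r / overall_rate for r in response_rate]

# Cumulative share of all buyers captured through each decile
cumulative_share = []
running = 0
for b in buyers:
    running += b
    cumulative_share.append(running / total_buyers)

for d, (r, l, c) in enumerate(zip(response_rate, lift, cumulative_share), start=1):
    print(f"Decile {d:2d}: rate {r:.2%}, lift {l:.0f}, cumulative buyers {c:.1%}")
```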

4.

Case Summaries
Mean

Percentile    Total $     Months       # purchases,  # purchases,  # purchases,  # purchases,    # purchases,  # purchases,  # purchases,
Group of      spent       since last   Children's    Youth         Cookbooks     Do-it-yourself  Reference     Art           Geography
PRE_1                     purchase     books         books                       books           books         books         books
1             257.3526    7.19         1.06          .51           1.07          .47             .56           1.50          1.33
2             224.8692    7.96         .84           .39           .85           .39             .40           .75           .89
3             214.2284    8.62         .79           .37           .80           .37             .38           .48           .70
4             207.6430    8.78         .75           .36           .80           .34             .31           .30           .54
5             199.1118    9.57         .76           .33           .82           .37             .27           .22           .46
6             199.1302    10.94        .75           .36           .86           .39             .26           .16           .39
7             191.3457    12.37        .76           .35           .84           .42             .23           .13           .29
8             191.5499    14.42        .81           .36           .91           .45             .21           .11           .25
9             193.6108    17.86        .96           .41           1.12          .65             .25           .13           .32
10            204.3416    25.87        1.07          .46           1.31          .77             .25           .07           .29
Total         208.3183    12.36        .85           .39           .94           .46             .31           .39           .55

