Business Analytics II - Winter 2016 - Assignment 4

Managerial Economics &
Decision Sciences Department
Developed for
business analytics II
week 4
▌assignment four - solutions
mba for yourself 
mba for your employer 
© 2016 kellogg school of management | managerial economics and decision sciences department | business analytics II
dummy variables
learning objectives
► statistics & econometrics
  ▪ define a dummy variable
  ▪ interpret a regression with dummy variables
  ▪ understand and interpret slope dummies
► run dummy regressions
readings
► (MSN)
  ▪ Chapter 5
► (CS)
  ▪ MBA (I)
  ▪ MBA (II)
Valuing a MBA: For Yourself
Regression 1: coefficients interpretation. The estimated regression and STATA results are shown below:
$$\text{Est. } E[\text{postMBA}] = \underbrace{24.659}_{b_0} + \underbrace{1.83628}_{b_1}\cdot\text{preMBA} + \underbrace{1.732}_{b_2}\cdot\text{school}$$
Here preMBA is a continuous variable and school is a dummy variable.
Figure 1. Results for regression of postMBA on preMBA and school
 postMBA |     Coef.   Std. Err.        t    P>|t|
---------+------------------------------------------
  preMBA |   1.83628      .04178    43.96    0.000
  school |     1.732       1.136     1.52    0.128
   _cons |    24.659       1.868    13.20    0.000
► Coefficient interpretation (b0, b1 and b2 are estimates of the true parameters β0, β1 and β2)
  ▪ b0 (preMBA = 0, school = 0): the expected postMBA income if your income prior to the MBA was zero and you completed the MBA at school A
  ▪ b0 + b2 (preMBA = 0, school = 1): the expected postMBA income if your income prior to the MBA was zero and you completed the MBA at school B
  ▪ b2 (school: 0 → 1, any preMBA): the expected difference in postMBA income if you complete the MBA at school B (school = 1) rather than at school A (school = 0)
  ▪ b1 (ΔpreMBA = 1, any school): the change in expected postMBA income if your income prior to the MBA changes by $1, regardless of which school you attend
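The interpretations of the Regression 1 coefficients can be checked numerically. The sketch below is only an illustration: the function name `predict_post_mba` is hypothetical, the coefficients are copied from the regression output, and the preMBA levels in the loop are arbitrary illustrative values.

```python
# Coefficients as reported in the Regression 1 output (b0, b1, b2).
B0, B1, B2 = 24.659, 1.83628, 1.732


def predict_post_mba(pre_mba: float, school: int) -> float:
    """Estimated E[postMBA]; school is the dummy (0 = school A, 1 = school B)."""
    return B0 + B1 * pre_mba + B2 * school


# Intercepts of the two lines (preMBA = 0).
intercept_a = predict_post_mba(0, 0)  # b0
intercept_b = predict_post_mba(0, 1)  # b0 + b2

# The gap between schools equals b2 at every preMBA level.
for pre in (0, 15, 65):
    gap = predict_post_mba(pre, 1) - predict_post_mba(pre, 0)
    print(f"preMBA = {pre:>2}: school B minus school A = {gap:.3f}")

# A one-unit change in preMBA moves the prediction by b1 for either school.
slope_a = predict_post_mba(41, 0) - predict_post_mba(40, 0)
slope_b = predict_post_mba(41, 1) - predict_post_mba(40, 1)
print(f"slope at school A = {slope_a:.5f}, slope at school B = {slope_b:.5f}")
```

Because there is no interaction term, the gap printed in the loop does not depend on preMBA, and both slopes coincide with b1.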
Regression 1: assumptions. The estimated regression and STATA results are shown below:
$$\text{Est. } E[\text{postMBA}] = \underbrace{24.659}_{b_0} + \underbrace{1.83628}_{b_1}\cdot\text{preMBA} + \underbrace{1.732}_{b_2}\cdot\text{school}$$
Remark. The underlying modelling (model specification) assumption is that there is no interaction between school and preMBA: for a $1 change in preMBA income, postMBA changes by the same amount whether you attended school A (school = 0) or school B (school = 1).
Notice how this assumption has two implications:
  ▪ the differential effect of school on postMBA is the same (given by b2) at each level of preMBA (graphically: the distance between the two lines is the same for all preMBA levels)
  ▪ the differential effect of preMBA on postMBA is the same for both schools (graphically: the two lines have the same slope)
Figure 2. Graphical representation of the estimated regression
[Figure: postMBA (vertical axis) against preMBA (horizontal axis). Two parallel lines with slope b1 = 1.83628: the line for school = 0 has intercept b0 = 24.659, the line for school = 1 has intercept b0 + b2 = 26.391. The distance between the lines is the differential effect of school, b2 = 1.732.]
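Both implications follow from plugging each value of the dummy into the estimated equation. A short derivation, using only the coefficients reported for Regression 1:

```latex
\begin{aligned}
\text{school}=0:\quad & \text{Est. } E[\text{postMBA}] = b_0 + b_1\cdot\text{preMBA} = 24.659 + 1.83628\cdot\text{preMBA} \\
\text{school}=1:\quad & \text{Est. } E[\text{postMBA}] = (b_0 + b_2) + b_1\cdot\text{preMBA} = 26.391 + 1.83628\cdot\text{preMBA} \\
\text{difference}:\quad & (b_0 + b_2 + b_1\cdot\text{preMBA}) - (b_0 + b_1\cdot\text{preMBA}) = b_2 = 1.732
\end{aligned}
```

The preMBA term cancels in the difference, which is why the two lines are parallel and the gap between them is b2 at every preMBA level.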
Regression 2: coefficients interpretation. The estimated regression and STATA results are shown below:
$$\text{Est. } E[\text{postMBA}] = \underbrace{30.000}_{b_0} + \underbrace{1.70426}_{b_1}\cdot\text{preMBA} + \underbrace{(-7.314)}_{b_2}\cdot\text{school} + \underbrace{0.23227}_{b_3}\cdot\text{schoolpreMBA}$$
Here preMBA is a continuous variable, school is a dummy variable, and schoolpreMBA (the product school × preMBA) is a slope dummy variable.
Figure 3. Results for regression of postMBA on preMBA, school and schoolpreMBA
     postMBA |     Coef.   Std. Err.        t    P>|t|
-------------+------------------------------------------
      preMBA |   1.70426      .06306    27.03    0.000
      school |    -7.314       3.447    -2.12    0.034
schoolpreMBA |    .23227      .08364     2.78    0.006
       _cons |        30       2.670    11.23    0.000
Remark. Since this is the “complete” slope dummy regression:
  ▪ there are four coefficients coming straight from the regression, namely b0, b1, b2 and b3
  ▪ there are two combinations of coefficients, b0 + b2 and b1 + b3, that are meaningful.
The coefficients fall into two groups: b0 and b2 describe levels (b2 is the difference in levels), while b1 and b3 describe slopes (b3 is the difference in slopes).
  ▪ constant b0: the expected postMBA level in the first year after graduation if your income prior to the MBA was zero (preMBA = 0) and you completed the MBA at school A (school = 0)
  ▪ coefficient b0 + b2: the expected postMBA level in the first year after graduation if your income prior to the MBA was zero (preMBA = 0) and you completed the MBA at school B (school = 1)
  ▪ coefficient b2: the expected difference in postMBA level in the first year after graduation, if your income prior to the MBA was zero (preMBA = 0), between completing the MBA at school B (school = 1) and at school A (school = 0)
  ▪ coefficient b1: the expected change in postMBA in the first year after graduation for a $1 change in preMBA, if you completed the MBA at school A (school = 0)
  ▪ coefficient b1 + b3: the expected change in postMBA in the first year after graduation for a $1 change in preMBA, if you completed the MBA at school B (school = 1)
  ▪ coefficient b3: the expected difference in that change (the difference in slopes) between completing the MBA at school B (school = 1) and at school A (school = 0)
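The meaningful combinations b0 + b2 and b1 + b3 appear once each value of the dummy is plugged into the estimated equation. A short derivation, using only the Regression 2 coefficients and the fact that schoolpreMBA is the product of school and preMBA:

```latex
\begin{aligned}
\text{school}=0:\quad & \text{Est. } E[\text{postMBA}] = b_0 + b_1\cdot\text{preMBA} = 30.000 + 1.70426\cdot\text{preMBA} \\
\text{school}=1:\quad & \text{Est. } E[\text{postMBA}] = (b_0 + b_2) + (b_1 + b_3)\cdot\text{preMBA} = 22.686 + 1.93653\cdot\text{preMBA} \\
\text{difference}:\quad & b_2 + b_3\cdot\text{preMBA} = -7.314 + 0.23227\cdot\text{preMBA}
\end{aligned}
```

Unlike in Regression 1, the difference between the two schools now depends on preMBA.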
Regression 2: graphical representation. The estimated regression and STATA results are shown below:
$$\text{Est. } E[\text{postMBA}] = \underbrace{30.000}_{b_0} + \underbrace{1.70426}_{b_1}\cdot\text{preMBA} + \underbrace{(-7.314)}_{b_2}\cdot\text{school} + \underbrace{0.23227}_{b_3}\cdot\text{schoolpreMBA}$$
Remark. How do you get the graph?
  ▪ The blue line is obtained by plugging school = 0 into the estimated regression. The red line is obtained by plugging school = 1 into the estimated regression.
  ▪ To get the difference between the two lines just subtract the two resulting equations: the differential effect of school is b2 + b3·preMBA.
Figure 4. Graphical representation of the estimated regression
[Figure: postMBA (vertical axis) against preMBA (horizontal axis). Blue line (school = 0): intercept b0 = 30.000, slope b1 = 1.704. Red line (school = 1): intercept b0 + b2 = 22.686, slope b1 + b3 = 1.936. Here b2 = −7.314, and the distance between the lines at a given preMBA is b2 + b3·preMBA.]
Regression 2: school choice. The estimated regression and STATA results are shown below:
$$\text{Est. } E[\text{postMBA}] = \underbrace{30.000}_{b_0} + \underbrace{1.70426}_{b_1}\cdot\text{preMBA} + \underbrace{(-7.314)}_{b_2}\cdot\text{school} + \underbrace{0.23227}_{b_3}\cdot\text{schoolpreMBA}$$
► If preMBA income is $15 (thousands) and you complete the MBA:
  ▪ at school A (school = 0): estimated postMBA = 30.000 + 1.70426·15 − 7.314·0 + 0.23227·0·15 = 55.564
  ▪ at school B (school = 1): estimated postMBA = 30.000 + 1.70426·15 − 7.314·1 + 0.23227·1·15 = 51.734
► If preMBA income is $65 (thousands) and you complete the MBA:
  ▪ at school A (school = 0): estimated postMBA = 30.000 + 1.70426·65 − 7.314·0 + 0.23227·0·65 = 140.777
  ▪ at school B (school = 1): estimated postMBA = 30.000 + 1.70426·65 − 7.314·1 + 0.23227·1·65 = 148.561
► Based on these figures you’d choose school A if your preMBA is $15 but choose school B if your preMBA is $65.
Remark. Two issues here:
  ▪ Your choice seems to change depending on your preMBA income. How do we explain this feature?
  ▪ Based on Regression 1 you would choose school B for any preMBA income… What explains the difference?
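One way to see why the choice flips is to compute the school gap b2 + b3·preMBA and the preMBA level at which it changes sign. The sketch below is an illustration only: the function names are hypothetical and the coefficients are the rounded values from the Regression 2 output, so the last digit can differ slightly from the figures above.

```python
# Coefficients as reported in the Regression 2 output (b0, b1, b2, b3).
B0, B1, B2, B3 = 30.000, 1.70426, -7.314, 0.23227


def predict_post_mba(pre_mba: float, school: int) -> float:
    """Estimated E[postMBA]; the slope dummy is school * preMBA."""
    return B0 + B1 * pre_mba + B2 * school + B3 * school * pre_mba


def school_gap(pre_mba: float) -> float:
    """School B minus school A at a given preMBA: b2 + b3 * preMBA."""
    return B2 + B3 * pre_mba


# preMBA level at which the two lines cross (gap equals zero).
break_even = -B2 / B3

for pre in (15, 65):
    a = predict_post_mba(pre, 0)
    b = predict_post_mba(pre, 1)
    better = "B" if school_gap(pre) > 0 else "A"
    print(f"preMBA = {pre}: school A = {a:.3f}, school B = {b:.3f}, choose {better}")

print(f"the lines cross at preMBA of about {break_even:.2f}")
```

Below the crossing point the negative level difference b2 dominates and school A has the higher estimate. Above it the steeper slope of school B more than compensates.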
School choice – a comparison:
simple dummy estimated regression:
$$\text{Est. } E[\text{postMBA}] = 24.659 + 1.83628\cdot\text{preMBA} + 1.732\cdot\text{school}$$
slope dummy estimated regression:
$$\text{Est. } E[\text{postMBA}] = 30.000 + 1.70426\cdot\text{preMBA} - 7.314\cdot\text{school} + 0.23227\cdot\text{schoolpreMBA}$$
Figure 5. Graphical comparison of simple dummy and slope dummy estimated regressions
[Figure: two panels, each plotting postMBA (vertical axis) against preMBA (horizontal axis).
Simple dummy panel: school = 0: Est. E[postMBA] = 24.659 + 1.836·preMBA; school = 1: Est. E[postMBA] = 26.391 + 1.836·preMBA.
Slope dummy panel: school = 0: Est. E[postMBA] = 30.000 + 1.704·preMBA; school = 1: Est. E[postMBA] = 22.686 + 1.936·preMBA.]
► In Regression 1 we are “forcing” the two lines to have the same slope, and we are trying to explain the difference in postMBA income only as a shift in level due to the school choice (this difference is equal to the coefficient of the dummy).
► In Regression 2 we are “allowing” the two lines to have different slopes, and the difference in postMBA income is explained as a compound effect: the dummy variable picks up the difference in level due to the school choice, while the slope dummy picks up the difference in slope due to the interaction of school and preMBA income.
Regression 2: intervals. Applying the klincom and kpredint commands to obtain confidence and prediction intervals at the 90% confidence level for postMBA income, when the MBA was completed at school A (school = 0) and preMBA income was $40 (thousands), we get the following output:
Figure 6. Results for regression of postMBA on preMBA, school and schoolpreMBA
predicted | std.err. of est. mean     CILow    CIHigh     PILow    PIHigh
----------+----------------------------------------------------------------
   98.171 |               1.70426    96.868    99.474     79.71    116.63
CILow and CIHigh come from klincom; PILow and PIHigh come from kpredint.
► Can we infer what the exact klincom and kpredint commands were?
  ▪ The confidence interval (CI) comes from klincom and the prediction interval (PI) comes from kpredint.
  ▪ The klincom command is:
    klincom _b[_cons] + _b[preMBA]*40 + _b[school]*0 + _b[schoolpreMBA]*0, level(90)
  ▪ The kpredint command is:
    kpredint _b[_cons] + _b[preMBA]*40 + _b[school]*0 + _b[schoolpreMBA]*0, level(90)
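A quick consistency check on this output: both intervals should be centered on the same point estimate, the linear combination evaluated at preMBA = 40 and school = 0. The sketch below only re-uses the numbers reported above. It does not reproduce klincom or kpredint, and the variable names are hypothetical.

```python
# Coefficients as reported in the Regression 2 output (b0, b1, b2, b3).
B0, B1, B2, B3 = 30.000, 1.70426, -7.314, 0.23227

pre_mba, school = 40, 0

# The linear combination passed to both commands.
predicted = B0 + B1 * pre_mba + B2 * school + B3 * school * pre_mba

# Interval bounds copied from the output table.
ci_low, ci_high = 96.868, 99.474   # confidence interval (klincom)
pi_low, pi_high = 79.71, 116.63    # prediction interval (kpredint)

ci_mid = (ci_low + ci_high) / 2
pi_mid = (pi_low + pi_high) / 2
ci_half_width = (ci_high - ci_low) / 2
pi_half_width = (pi_high - pi_low) / 2

print(f"predicted = {predicted:.3f}, CI midpoint = {ci_mid:.3f}, PI midpoint = {pi_mid:.3f}")
print(f"CI half-width = {ci_half_width:.3f}, PI half-width = {pi_half_width:.3f}")
```

Both midpoints agree with the predicted value. The prediction interval is much wider because it covers individual outcomes rather than the estimated mean.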
Figure 7. Results for regression of postMBA on preMBA, school and schoolpreMBA
predicted | std.err. of est. mean     CILow    CIHigh     PILow    PIHigh
----------+----------------------------------------------------------------
   98.171 |               1.70426    96.868    99.474     79.71    116.63
CILow and CIHigh come from klincom; PILow and PIHigh come from kpredint.
► The question asks about the distribution of postMBA income for individuals, not for the average of the 60 students, so we use the kpredint interval.
Remark. How would you phrase the question such that, in answering it, you’d choose the confidence interval rather than the prediction interval? Answer: For how many cohorts of 60 students in the past 20 years do you think the average postMBA income (the average being taken over the 60 students in the cohort) is below $96?
► Finally, the interpretation of the kpredint interval: 90% of the observations on postMBA income are expected to fall within the kpredint interval, 5% to the left of it (below the lower bound) and the remaining 5% to the right of it (above the upper bound). For the cohort of 60 students:
  ▪ 90% of 60, that is 54, are expected to have postMBA income within the interval, from $79.71 to $116.63
  ▪ 5% of 60, that is 3, are expected to have postMBA income below $79.71
  ▪ 5% of 60, that is 3, are expected to have postMBA income above $116.63.
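The counts follow from applying the 90% / 5% / 5% split to the number of units the interval is about. A minimal sketch, where the helper name `split_counts` is hypothetical and the results are expected counts rather than guarantees:

```python
def split_counts(n_units: int, level_pct: int = 90) -> tuple[int, int, int]:
    """Expected number of units below, inside and above a two-sided interval."""
    tail_pct = (100 - level_pct) // 2          # 5% in each tail for a 90% interval
    below = n_units * tail_pct // 100
    above = n_units * tail_pct // 100
    inside = n_units - below - above
    return below, inside, above


# Prediction interval (kpredint): the units are the 60 individuals in a cohort.
students = split_counts(60)

# Confidence interval (klincom): the units are cohorts, 20 in the Remark's example.
cohorts = split_counts(20)

print("students below / inside / above:", students)
print("cohorts  below / inside / above:", cohorts)
```

The same split applies in both cases. What changes is whether the interval describes individuals (prediction interval) or cohort averages (confidence interval).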
Figure 8. Results for regression of postMBA on preMBA, school and schoolpreMBA
predicted | std.err. of est. mean     CILow    CIHigh     PILow    PIHigh
----------+----------------------------------------------------------------
   98.171 |               1.70426    96.868    99.474     79.71    116.63
[Figure: two panels comparing the intervals.
klincom interval (96.868 to 99.474), about averages of samples: 90% of cohort averages fall inside the interval, that is 18 cohorts, with 1 cohort below and 1 cohort above. The units are cohorts of 60 students each.
kpredint interval (79.71 to 116.63), about individuals: 90% of observations fall inside the interval, that is 54 individuals, with 3 individuals below and 3 individuals above. The units are the 60 individuals in a cohort.]
Valuing a MBA: For Your Employer
Regression 3: regression estimation. The estimated regression and STATA results are shown below:
$$\text{Est. } E[\text{billing}] = \underbrace{44.1300}_{b_0} + \underbrace{9.0681}_{b_1}\cdot\text{experience} + \underbrace{68.43}_{b_2}\cdot\text{MBA} + \underbrace{(-1.4317)}_{b_3}\cdot\text{experienceMBA}$$
Figure 9. Regression results
      billing |     Coef.   Std. Err.        t    P>|t|
--------------+------------------------------------------
   experience |    9.0681       .4516    20.08    0.000
          MBA |     68.43       22.73     3.01    0.003
experienceMBA |   -1.4317       .6167    -2.32    0.022
        _cons |     44.13       15.43     2.86    0.005
► Estimation based on the regression: for two years of experience (experience = 24 months) we get
  ▪ with MBA: Est. E[billing] = 44.1300 + 9.0681·24 + 68.43·1 − 1.4317·24·1 = 295.8336
  ▪ with no MBA: Est. E[billing] = 44.1300 + 9.0681·24 + 68.43·0 − 1.4317·24·0 = 261.7644
► The extra value, i.e. the change in billing for a change in experience, is the slope of the regression line, once for MBA = 1 and once for MBA = 0. If the slopes are different, then the extra value is indeed different between MBAs and non-MBAs. The slope for MBA = 1 is β1 + β3, while for MBA = 0 it is β1, so the difference in slopes is exactly β3. The test is therefore of the null β3 = 0 against the alternative β3 ≠ 0. The p-value of 0.022 in the regression table means that the null is rejected at the 5% level but cannot be rejected at the 1% level.
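The two estimates and the slope comparison can be reproduced from the reported coefficients. This is a sketch under the assumption that experienceMBA is the product of experience and MBA, as the substitutions above imply. The function name `predict_billing` and the significance levels in the loop are illustrative choices.

```python
# Coefficients as reported in the Regression 3 output (b0, b1, b2, b3).
B0, B1, B2, B3 = 44.13, 9.0681, 68.43, -1.4317


def predict_billing(experience_months: float, mba: int) -> float:
    """Estimated E[billing]; the slope dummy is experience * MBA."""
    return B0 + B1 * experience_months + B2 * mba + B3 * experience_months * mba


with_mba = predict_billing(24, 1)
without_mba = predict_billing(24, 0)

# Extra billing per additional month of experience, by group.
slope_mba = B1 + B3      # MBA = 1
slope_no_mba = B1        # MBA = 0

# Two-sided p-value for b3, copied from the regression table.
p_value_b3 = 0.022
decisions = {alpha: p_value_b3 < alpha for alpha in (0.10, 0.05, 0.01)}

print(f"24 months: with MBA = {with_mba:.4f}, no MBA = {without_mba:.4f}")
print(f"slopes: MBA = {slope_mba:.4f}, no MBA = {slope_no_mba:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: {'reject' if reject else 'cannot reject'} the null b3 = 0")
```

The rejection decision depends only on comparing the p-value with the chosen significance level.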
► For ten years of experience (experience = 120) the estimated billing is
  ▪ with MBA: Est. E[billing] = 44.1300 + 9.0681·120 + 68.43·1 − 1.4317·120·1 = 1028.928
  ▪ with no MBA: Est. E[billing] = 44.1300 + 9.0681·120 + 68.43·0 − 1.4317·120·0 = 1132.302
Thus the difference is 103.374 in favor of the consultant without an MBA.
► The prediction is not that reliable as it is really out-of-sample: we are told that all the observations used to estimate the regression come from consultants with experience of up to 5 years (experience ≤ 60). To assert that the result is reliable you must assume, or argue, that the same relation between experience and billing continues to hold after the first 5 years.
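The ten-year comparison, and the experience level at which the two estimated lines cross, can be sketched from the reported coefficients. The function name `predict_billing` is hypothetical, and the sample limit of 60 months is taken from the statement that all observations have at most 5 years of experience.

```python
# Coefficients as reported in the Regression 3 output (b0, b1, b2, b3).
B0, B1, B2, B3 = 44.13, 9.0681, 68.43, -1.4317

# Largest experience (in months) observed in the estimation sample.
MAX_SAMPLE_EXPERIENCE = 60


def predict_billing(experience_months: float, mba: int) -> float:
    """Estimated E[billing]; the slope dummy is experience * MBA."""
    return B0 + B1 * experience_months + B2 * mba + B3 * experience_months * mba


experience = 120
with_mba = predict_billing(experience, 1)
without_mba = predict_billing(experience, 0)
difference = without_mba - with_mba          # positive: non-MBA estimate is higher

# MBA minus non-MBA is b2 + b3 * experience, which is zero at -b2 / b3.
break_even_months = -B2 / B3

is_extrapolation = experience > MAX_SAMPLE_EXPERIENCE

print(f"with MBA = {with_mba:.3f}, no MBA = {without_mba:.3f}, difference = {difference:.3f}")
print(f"estimated lines cross at about {break_even_months:.1f} months of experience")
print(f"out-of-sample prediction: {is_extrapolation}")
```

The MBA level advantage b2 is eroded by the lower slope b1 + b3, so the estimated non-MBA billing overtakes the MBA billing once experience passes the crossing point. The 120-month figure still rests on extrapolating the fitted lines beyond the observed range.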