Model Answers
Assignment 3
STATS 747, 2nd Semester 2002
Question 1.
Your client is a facial tissues brand manager who is wondering what mix of advertising and
promotional activity would be appropriate, and wants to have a better understanding of the research
methods available for measuring the effectiveness of these marketing activities. Explain the research
methods typically used to evaluate these activities, comparing and contrasting the techniques for
evaluating advertising effectiveness with those for evaluating promotional effectiveness.
Your answer should cover at least the following aspects of these methods.
i) the data sources or data collection methods generally used;
ii) the statistical techniques that are typically employed, including their inputs, outputs and general
interpretation; and
iii) the time usually needed for these marketing variables to show clear effects.
Assume that your client has a basic understanding of statistics (e.g. they know what regression analysis
means) and a good background in market research, but has not been exposed to these techniques before.
Answer:
Introduction
To get the best return on your marketing budget, it is crucial to understand how advertising and
promotional activities affect the success of your brand. (Promotions include all in-store activity, such as
price discounts and end-aisle displays, and closely related activities such as samples and coupons, while
advertising refers to communications in mass media such as newspapers and television.) These
activities rarely operate in isolation and are usually affected by outside influences such as competitor
activity, which makes it difficult to establish exactly what effect they are having. However several
research methods have been developed to measure the effectiveness of advertising and promotional
activities; these are described and compared below, including the data sources required.
Advertising Effectiveness
Advertising effectiveness has been approached from a wide variety of perspectives. Which is
appropriate depends primarily on the communication objectives of the advertising campaign. The most
obvious approach is to look for sales increases due to an advertising campaign. However, due to
competitive pressures and the difficulty of changing people’s consumption habits, direct effects on sales
are inconsistent and usually small, at least in the short-term. Communication objectives are often set
regarding other outcomes, including ad recall, brand awareness and brand image.
Perhaps the most common method of measuring advertising effectiveness is modelling ad awareness
over time through adstock modelling. Data on ad awareness (or other criteria such as brand awareness)
is gathered through weekly tracking surveys. Typically these have a fairly low sample size, perhaps 50
interviews per week, so awareness is not measured very precisely at a weekly level. However trends in
awareness (such as those due to advertising) can become apparent over a number of weeks. Adstock
modelling involves relating the level of ad recall to the amount of recent advertising exposures, known
as the adstock. Adstock is calculated as an exponentially weighted moving average of Target
Audience Rating Points (or TARPs). Typically ad awareness will relate much more closely to adstock
than to the current level of TARPs. By looking at the strength of this relationship, the effectiveness of
the advertising in increasing ad awareness can be measured. Variation in the strength of the relationship
can indicate whether an advertisement is wearing out, or (in combination with some media
experimentation) whether some media are more effective for the current campaign than others.
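The adstock calculation described above can be sketched in R as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the weekly TARPs and awareness figures are made up, the retention rate lambda = 0.8 is an arbitrary choice rather than an estimate, and adstock() is a hypothetical helper name. The recursion A[t] = lambda * A[t-1] + (1 - lambda) * TARP[t] is one common way to write an exponentially weighted moving average.

```r
# Hypothetical weekly TARPs for a 12-week period (illustrative values only)
tarps <- c(0, 150, 200, 180, 0, 0, 0, 120, 160, 0, 0, 0)

# Adstock as an exponentially weighted moving average of TARPs.
# lambda is the assumed weekly retention rate (0 < lambda < 1).
adstock <- function(tarps, lambda) {
  out <- numeric(length(tarps))
  carry <- 0
  for (t in seq_along(tarps)) {
    carry <- lambda * carry + (1 - lambda) * tarps[t]
    out[t] <- carry
  }
  out
}

ads <- adstock(tarps, lambda = 0.8)

# Hypothetical weekly ad awareness (%) from a tracking survey
awareness <- c(10, 14, 21, 26, 24, 20, 17, 19, 24, 22, 18, 16)

# Relate awareness to adstock; the slope measures how strongly
# awareness responds to accumulated advertising weight
fit <- lm(awareness ~ ads)
coef(fit)
```

In practice the retention rate would itself be estimated, for example by refitting the regression over a grid of lambda values and keeping the best-fitting one.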
Sometimes a particular advertisement will be studied in detail, through pre-testing or ongoing
monitoring of consumers’ response to the ad. For example, surveys can track the likeability or
familiarity of an ad over time, to provide an indication of when wear-out begins to set in. Similarly,
consumers’ attitudes to the brand can be examined before and after exposure to the ad. [Note: This was
not covered during lectures.] A large amount of effort also typically goes into media planning, using
reach and frequency models to ensure the advertising is placed so as to reach the target audience in the
most cost-effective way. However this is less directly related to the measurement of advertising
effectiveness in terms of brand communication objectives than the other techniques discussed above.
Promotional Effectiveness
There are also many methods for evaluating promotional effectiveness, of varying complexity. Perhaps
the simplest method is to try out a particular promotional activity, and compare weekly sales before the
activity is trialled with sales during the activity (and afterwards). However this approach requires a
large commitment, affecting the marketing of the brand for a significant period, with associated
potential sales losses. It can also be difficult to distinguish the effects of this promotional activity from
other factors affecting sales, including competitors’ marketing efforts and seasonal sales trends. Some
of these problems can be overcome by trialling the activity in a test market only, and comparing sales
with those in a similar control market. However this also requires a substantial marketing commitment. A
similar approach on a smaller scale is to identify a test group of stores and a matched control group, and
trial the promotion on just the test stores. Naturally this requires the ability to control promotional
activity at the store level, as well as access to store-level data to evaluate the results.
Due to these problems with the experimental approaches described above, observational methods have
found favour. These methods include econometric modelling of sales figures and choice modelling of
household scanner panel data.
Econometric modelling (also known as time series analysis) utilises weekly sales figures, and combines
these with details of relevant promotional activity to measure promotional effectiveness. The
promotional data includes data on relevant price changes, displays, samples and other specials (e.g. buy
three, get one free). Both these types of data are regularly collected by market research companies such
as ACNielsen. A seasonal ARIMA model is typically used (after removing any trend in sales) to
estimate promotional price elasticity, cross-elasticities between brands, and the effects of other
promotional variables such as the presence of displays. Market-level sales data is often used, although
store-level data usually gives more accurate results.
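As a rough illustration of how a price elasticity comes out of such a model, the R sketch below fits a regression with autocorrelated errors using arima() and its xreg argument. Everything here is an assumption made for illustration: the data are simulated, the "true" elasticity of -2.5 and display effect of 0.3 are arbitrary, and the seasonal part of the model is left out to keep the example short (a seasonal ARIMA would add a seasonal = list(...) argument).

```r
set.seed(1)
n_weeks <- 104

# Simulated (hypothetical) weekly data: shelf price and display flag
price   <- ifelse(runif(n_weeks) < 0.2, 2.49, 2.99)  # occasional price cuts
display <- rbinom(n_weeks, 1, 0.15)                  # end-aisle display weeks

# Simulated log sales with AR(1) noise; the coefficients are assumptions
log_sales <- 8 - 2.5 * log(price) + 0.3 * display +
  as.numeric(arima.sim(list(ar = 0.5), n = n_weeks, sd = 0.05))

# Regression with AR(1) errors. In a log-log model the coefficient on
# log(price) is interpreted as the promotional price elasticity.
xreg <- cbind(log_price = log(price), display = display)
fit <- arima(log_sales, order = c(1, 0, 0), xreg = xreg)
coef(fit)
```

Cross-elasticities could be estimated in the same way by adding competitors' log prices as further columns of xreg.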
Market research companies also collect purchasing data through household scanner panels, where each
selected household records their purchases using a bar-code scanner. This provides a much richer
source of information, allowing us to look at the purchasing behaviour of individual households and
how this changes over time. Promotional effects can be estimated by fitting a multinomial logit model
to this data.
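The idea behind the multinomial logit model can be sketched in R with simulated data. This is a hedged illustration only: the purchase occasions, prices, display flags and "true" parameter values are all invented, choice_prob() and negloglik() are hypothetical helper names, and a real analysis would usually rely on a dedicated package and on actual panel data. Each brand's utility depends on a brand intercept, its price and whether it is on display, and choice probabilities are the exponentiated utilities normalised across brands.

```r
set.seed(2)
n_obs    <- 500   # purchase occasions (hypothetical)
n_brands <- 3

# Simulated shelf prices and display flags for each brand on each occasion
price   <- matrix(runif(n_obs * n_brands, 2, 4), n_obs, n_brands)
display <- matrix(rbinom(n_obs * n_brands, 1, 0.2), n_obs, n_brands)

# Choice probabilities: exp(utility), normalised across brands.
# par = (intercept brand 2, intercept brand 3, price coef, display coef)
choice_prob <- function(par) {
  intercepts <- c(0, par[1:2])   # brand 1 is the reference brand
  util <- sweep(par[3] * price + par[4] * display, 2, intercepts, "+")
  exp(util) / rowSums(exp(util))
}

# Simulate brand choices from assumed "true" parameters
true_par <- c(0.5, -0.3, -1.5, 0.8)
p_true <- choice_prob(true_par)
choice <- apply(p_true, 1, function(p) sample(n_brands, 1, prob = p))

# Negative log-likelihood of the observed choices
negloglik <- function(par) {
  p <- choice_prob(par)
  -sum(log(p[cbind(seq_len(n_obs), choice)]))
}

# Maximum likelihood estimates of the four parameters
fit <- optim(rep(0, 4), negloglik, method = "BFGS")
fit$par
```

The fitted price and display coefficients play the same role as the promotional effects estimated from aggregate sales, but are identified from individual purchase decisions.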
Comparison
Promotions primarily have a short-term effect, changing purchasing behaviour while the promotion is in
place, but not far beyond this period (although there is some evidence that price elasticities can be
changed by long-term patterns of promotional activity, and promotions can increase market share by
encouraging consumers to trial a new product). Dramatic promotional effects can be seen to occur very
quickly, often in the first week of the promotion.
In contrast, advertising typically does not cause dramatic changes in the short-term (i.e. within a couple
of weeks), and often does not have a discernible impact on sales. However advertising does appear to
have a sustained effect on brand awareness and image, which can sometimes be maintained through
relatively low levels of advertising.
Summary
The techniques discussed above provide a variety of methods for measuring the effectiveness of the
promotional activity and advertising for a brand, and thus ensuring that the marketing budget is spent
wisely. Promotions and advertisements have very different effects, and require different methods to
measure their effectiveness. But both play an important role in maximising the profitability of a brand,
by stimulating purchases and communicating the essence of the brand.
Question 2.
The table below shows the observed exposure distribution across four issues of the New Zealand
Womans Weekly.
Number of issues read       0      1      2      3      4
Number of people         1487    722    241     96    454
i) Fit a beta-binomial model to this data using maximum likelihood. Plot and interpret the resulting beta
distribution for the individual probabilities.
Answer:
The beta-binomial model assumes that each person has a probability p of reading each issue, and
that their exposure to each issue is independent, so the number of issues read for each person has a
binomial distribution.
Also individuals’ probabilities of reading p are assumed to follow a beta distribution:
$$g_1(p) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}$$
So the likelihood of an individual reading x issues out of n is:
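$$L_i(x \mid \alpha,\beta,n) = \int_0^1 \binom{n}{x} p^x (1-p)^{n-x} g_1(p)\,dp$$
$$= \binom{n}{x} \frac{1}{B(\alpha,\beta)} \int_0^1 p^{\alpha+x-1} (1-p)^{n+\beta-x-1}\,dp$$
$$= \binom{n}{x} \frac{B(\alpha+x,\, n+\beta-x)}{B(\alpha,\beta)}$$
where $B(\cdot,\cdot)$ denotes the beta function.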
The log-likelihood of observing this particular exposure distribution f = (f0, f1, f2, f3, f4) = (1487, 722,
241, 96, 454) when n = 4 is:
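$$l(f \mid \alpha,\beta) = \sum_{x=0}^{4} f_x \log L_i(x \mid \alpha,\beta,n=4)$$
$$= \sum_{x=0}^{4} f_x \log\left[\binom{4}{x} \frac{1}{B(\alpha,\beta)} \int_0^1 p^{\alpha+x-1} (1-p)^{4+\beta-x-1}\,dp\right]$$
$$= \sum_{x=0}^{4} f_x \log\left[\binom{4}{x} \frac{B(\alpha+x,\, 4+\beta-x)}{B(\alpha,\beta)}\right]$$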
We fit the beta-binomial model by maximising this function of the parameters $\alpha$ and $\beta$, yielding
parameter estimates $\hat{\alpha} = 0.277$ and $\hat{\beta} = 0.676$. The corresponding beta distribution for individuals'
issue reading probabilities is plotted below.
[Figure: Beta Distribution for Probabilities of Reading Each Issue, Initial Beta-Binomial Model. Horizontal axis: Chance of Reading Each Issue (0.0 to 1.0); vertical axis: Value of Density Function.]
This shows that most people have a low chance of reading each issue. For example, 45%
of people read fewer than 1 issue out of 10 on average. The distribution is highest near zero, then
decreases sharply and flattens off between about 0.2 and 0.9. It then increases fairly sharply near 1,
showing that there is a noticeable group of people who read almost every issue. Specifically, 7.6% of
people appear to read at least 9 out of 10 issues on average.
R code:
# Observed NZWW exposure distribution
nzww <- c(1487, 722, 241, 96, 454)
names(nzww) <- 0:4
nzww
# Beta-binomial probability distribution
# Note that beta refers to either the parameter or the beta function,
# depending on context
dbb <- function(x, alpha, beta, n) {
  choose(n, x) * beta(alpha + x, n + beta - x) / beta(alpha, beta)
}
# Log-likelihood
lbb <- function(alpha, beta, frequencies) {
  x <- 0:4
  sum(frequencies * log(dbb(x, alpha, beta, 4)))
}
# Maximise log-likelihood (optim minimises, so negate)
optim(c(1, 1), function(param) {-lbb(param[1], param[2], nzww)})
# Maximum achieved at alpha=0.2772256 and beta=0.6762050
# Plot issue reading distribution
plot(0:100/100, dbeta(0:100/100, 0.2772256, 0.6762050), type="l",
     main="Beta Distribution for Probabilities of Reading Each Issue \nInitial Beta-Binomial Model",
     xlab="Chance of Reading Each Issue", ylab="Value of Density Function")
# Calculate tail probabilities
pbeta(0.1, 0.2772256, 0.6762050)
pbeta(0.9, 0.2772256, 0.6762050, lower.tail=FALSE)
ii) Plot the observed and expected frequency counts, and describe any deviations from the fitted model.
Is there any indication of systematic model failure?
Answer:
The expected frequency counts are compared against the observed counts in the following plot:
[Figure: Expected vs Observed Counts for Beta-Binomial Model. Bar chart with Observed and Expected series; horizontal axis: Number of Issues Read (out of 4); vertical axis: Number of People.]
Although none of the counts are predicted precisely, the largest discrepancies are for the number of
people reading 1 and 3 issues. Too many people are expected to read 3 issues out of 4, while the
model’s prediction of the number of people reading 1 issue falls short by roughly the same amount.
In general, the beta-binomial has produced a flatter distribution from 1 up to 4 issues read than the
observed pattern. The observed counts decline steadily from 1 up to 3 issues, then increase substantially
for reading all 4 issues. The beta-binomial model shows systematic failure, since it does not reflect this
pattern of declining numbers of people reading 1 up to 3 issues (out of four issues) of the New Zealand
Womans Weekly.
R code:
# Plot observed vs expected counts
barplot(rbind(nzww, dbb(0:4, 0.2772256, 0.6762050, 4) * sum(nzww)), beside=TRUE,
        main="Expected vs Observed Counts for Beta-Binomial Model",
        xlab="Number of Issues Read (out of 4)", ylab="Number of People",
        legend.text=c("Observed", "Expected"))
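One conventional way to quantify these deviations, not used in the answer above, is to look at Pearson residuals and the Pearson chi-square statistic. The R sketch below is a minimal illustration; it repeats the dbb() definition so that it runs on its own, and the choice of 2 degrees of freedom (5 cells, minus 1, minus 2 fitted parameters) is the usual textbook convention rather than something stated in the original.

```r
# Observed counts for 0..4 issues read
nzww <- c(1487, 722, 241, 96, 454)

# Beta-binomial probabilities (same form as dbb above)
dbb <- function(x, alpha, beta, n) {
  choose(n, x) * beta(alpha + x, n + beta - x) / beta(alpha, beta)
}

# Expected counts at the fitted parameter values
expected <- dbb(0:4, 0.2772256, 0.6762050, 4) * sum(nzww)
round(expected)

# Pearson residuals: (observed - expected) / sqrt(expected).
# A run of residuals with a clear pattern suggests systematic failure.
pearson <- (nzww - expected) / sqrt(expected)
round(pearson, 1)

# Pearson chi-square statistic and p-value on 5 - 1 - 2 = 2 df
chisq <- sum(pearson^2)
pchisq(chisq, df = 2, lower.tail = FALSE)
```

A positive residual for 1 issue followed by negative residuals for 2 and 3 issues is the numerical counterpart of the pattern described in the text.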
iii) Fit a modified probability model to the same data, again using maximum likelihood, assuming that
each person's number of exposures is a binomial random variable but that the exposure probabilities
have a beta distribution mixed with a point mass at 1. That is, assume a proportion w of people read
every issue of the New Zealand Womans Weekly, and the remainder read each issue independently with
probabilities following a beta distribution. Interpret the resulting model parameters, comparing this beta
distribution with that from part (i), and plot the observed and expected frequency counts.
Answer:
In this modified model, a proportion w of people read every issue. Other individuals’ probabilities of
reading p are assumed to follow a beta distribution. So the combined density function can be written as:
$$g_2(p) = w\,\delta_1(p) + (1-w)\,\frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)} = w\,\delta_1(p) + (1-w)\,g_1(p)$$
where $\delta_1(p)$ stands for a unit point mass at 1.
The likelihood of an individual reading x issues out of n is:
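$$L_i(x \mid w,\alpha,\beta,n) = \int_0^1 \binom{n}{x} p^x (1-p)^{n-x} g_2(p)\,dp$$
$$= \int_0^1 \binom{n}{x} p^x (1-p)^{n-x} \left[w\,\delta_1(p) + (1-w)\,g_1(p)\right] dp$$
$$= w \int_0^1 \binom{n}{x} p^x (1-p)^{n-x} \delta_1(p)\,dp + (1-w) \int_0^1 \binom{n}{x} p^x (1-p)^{n-x} g_1(p)\,dp$$
$$= w\,I_4(x) + (1-w) \binom{n}{x} \frac{B(\alpha+x,\, n+\beta-x)}{B(\alpha,\beta)}$$
where $B(\cdot,\cdot)$ denotes the beta function and $I_4(x)$ is an indicator function taking the value 1
only when x = 4 (and being zero otherwise), since n = 4 here.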
The log-likelihood of observing this particular exposure distribution f = (f0, f1, f2, f3, f4) = (1487, 722,
241, 96, 454) when n = 4 is:
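$$l(f \mid w,\alpha,\beta) = \sum_{x=0}^{4} f_x \log L_i(x \mid w,\alpha,\beta,n=4)$$
$$= \sum_{x=0}^{3} f_x \log\left[(1-w) \binom{4}{x} \frac{B(\alpha+x,\, 4+\beta-x)}{B(\alpha,\beta)}\right] + f_4 \log\left[w + (1-w) \binom{4}{4} \frac{B(\alpha+4,\, 4+\beta-4)}{B(\alpha,\beta)}\right]$$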
Using maximum likelihood to fit this model gives parameter estimates of $\hat{w} = 0.146$, $\hat{\alpha} = 1.007$ and
$\hat{\beta} = 5.636$. This means that almost 15% of people read every issue, while the remaining people have
issue reading probabilities that follow the beta distribution shown in the following plot.
[Figure: Beta Distribution for Probabilities of Reading Each Issue, Modified Beta-Binomial Model. Horizontal axis: Chance of Reading Each Issue (0.0 to 1.0); vertical axis: Value of Density Function.]
This beta distribution has quite a different shape to that underlying the simple beta-binomial model. It
rises very sharply, peaking when p=0.0015, and then fading away to near zero before p=0.8. This means
that practically the only people who read more than 9 out of 10 issues (on average) are the 15% of
people who read every issue. This contrasts with 7.6% of people under the simple beta-binomial model.
Based on the shape of this beta distribution, we might expect that the modified model will reflect the declining
numbers of people reading 1 up to 3 issues better than the simple beta-binomial model. The following
plot compares the observed and expected frequency counts.
[Figure: Expected vs Observed Counts for Modified Beta-Binomial Model. Bar chart with Observed and Expected series; horizontal axis: Number of Issues Read (out of 4); vertical axis: Number of People.]
This plot shows that the modified model exhibits a much better fit than the simple beta-binomial model.
In particular, it does not suffer from the same systematic failure from 1 to 3 issues read.
R code:
# Log-likelihood for modified beta-binomial
lmbb <- function(w, alpha, beta, frequencies) {
  x <- 0:4
  if (w < 0) {
    return(-Inf)
  } else {
    sum(frequencies[1:4] * log((1 - w) * dbb(x[1:4], alpha, beta, 4))) +
      frequencies[5] * log(w + (1 - w) * dbb(x[5], alpha, beta, 4))
  }
}
# Maximise log-likelihood (optim minimises, so negate)
optim(c(0.5, 1, 1), function(param) {-lmbb(param[1], param[2], param[3], nzww)})
# Maximum achieved at w=0.1463773, alpha=1.0069125 and beta=5.6363898
# Plot issue reading distribution
plot(0:1000/1000, dbeta(0:1000/1000, 1.0069125, 5.6363898), type="l",
     main="Beta Distribution for Probabilities of Reading Each Issue \nModified Beta-Binomial Model",
     xlab="Chance of Reading Each Issue", ylab="Value of Density Function")
# Calculate tail probabilities
(1 - 0.146) * pbeta(0.1, 1.0069125, 5.6363898)
0.146 + (1 - 0.146) * pbeta(0.9, 1.0069125, 5.6363898, lower.tail=FALSE)
# Plot observed vs expected counts
barplot(rbind(nzww,
              ((1 - 0.1463773) * dbb(0:4, 1.0069125, 5.6363898, 4) + c(0, 0, 0, 0, 0.1463773))
              * sum(nzww)),
        beside=TRUE, main="Expected vs Observed Counts for Modified Beta-Binomial Model",
        xlab="Number of Issues Read (out of 4)", ylab="Number of People",
        legend.text=c("Observed", "Expected"))
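The improvement in fit can also be checked numerically. The R sketch below compares the two fitted models on their log-likelihoods and Pearson chi-square statistics; it is an illustrative addition rather than part of the original answer, it repeats dbb() so that it runs on its own, and dmbb() is a hypothetical helper name for the modified model's cell probabilities.

```r
# Observed counts for 0..4 issues read
nzww <- c(1487, 722, 241, 96, 454)
total <- sum(nzww)

# Beta-binomial probabilities (same form as dbb above)
dbb <- function(x, alpha, beta, n) {
  choose(n, x) * beta(alpha + x, n + beta - x) / beta(alpha, beta)
}

# Modified model: point mass w at p = 1 plus (1 - w) times a beta-binomial
dmbb <- function(x, w, alpha, beta, n) {
  w * (x == n) + (1 - w) * dbb(x, alpha, beta, n)
}

# Cell probabilities at the fitted parameter values reported above
p_simple   <- dbb(0:4, 0.2772256, 0.6762050, 4)
p_modified <- dmbb(0:4, 0.1463773, 1.0069125, 5.6363898, 4)

# Log-likelihoods (omitting the multinomial constant common to both models)
loglik <- c(simple   = sum(nzww * log(p_simple)),
            modified = sum(nzww * log(p_modified)))

# Pearson chi-square statistics
pearson <- c(simple   = sum((nzww - total * p_simple)^2 / (total * p_simple)),
             modified = sum((nzww - total * p_modified)^2 / (total * p_modified)))
loglik
pearson
```

Because the simple model is the special case w = 0 of the modified model, twice the difference in log-likelihoods could be used as a rough likelihood ratio statistic, bearing in mind that w = 0 lies on the boundary of the parameter space.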
iv) An advertiser plans to place an ad in each of the next ten issues of the New Zealand Womans
Weekly. Use your model from part (iii) above to predict the number of people who would read 0, 1, 2,
3, …, 9, and 10 issues of the magazine, during this 10 week advertising campaign.
Answer:
The expected exposure distribution is shown in the following plot and table. This reflects the shape of
the fitted beta distribution shown above, as well as the proportion of people who read every issue.
[Figure: Expected Exposure Distribution for Modified Beta-Binomial Model. Bar chart; horizontal axis: Number of Issues Read (out of 10); vertical axis: Number of People.]

Number of issues read (out of 10)    Expected number of people
 0                                   917
 1                                   631
 2                                   418
 3                                   265
 4                                   160
 5                                    90
 6                                    47
 7                                    22
 8                                     9
 9                                     3
10                                   440
R code:
# Expected exposure distribution over 10 issues under the modified model
prednzww <- ((1 - 0.1463773) * dbb(0:10, 1.0069125, 5.6363898, 10) +
             c(rep(0, 10), 0.1463773)) * sum(nzww)
names(prednzww) <- 0:10
t(prednzww)
barplot(prednzww, main="Expected Exposure Distribution for Modified Beta-Binomial Model",
        xlab="Number of Issues Read (out of 10)", ylab="Number of People")
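For the advertiser, the same fitted model can be summarised as a reach and an average frequency for the campaign. The R sketch below is an illustrative extension, not part of the original answer: it repeats dbb() so that it runs on its own, dmbb() is a hypothetical helper name, and "reach" is taken here to mean the proportion of people reading at least one of the 10 issues.

```r
# Beta-binomial probabilities (same form as dbb above)
dbb <- function(x, alpha, beta, n) {
  choose(n, x) * beta(alpha + x, n + beta - x) / beta(alpha, beta)
}

# Modified model: point mass w at p = 1 plus (1 - w) times a beta-binomial
dmbb <- function(x, w, alpha, beta, n) {
  w * (x == n) + (1 - w) * dbb(x, alpha, beta, n)
}

# Fitted parameter values from part (iii)
w <- 0.1463773
alpha <- 1.0069125
beta_hat <- 5.6363898
n_issues <- 10

# Probability of reading 0, 1, ..., 10 issues
p10 <- dmbb(0:n_issues, w, alpha, beta_hat, n_issues)

reach          <- 1 - p10[1]              # read at least one issue
mean_exposures <- sum(0:n_issues * p10)   # average issues read per person
avg_frequency  <- mean_exposures / reach  # average among those reached
c(reach = reach, mean_exposures = mean_exposures, avg_frequency = avg_frequency)
```

The mean number of exposures can be cross-checked against the closed form n * (w + (1 - w) * alpha / (alpha + beta)), which follows from the mean of the beta distribution.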