Summary SKO Data Integration - Calibration

MEMO
TO: Interested parties
FROM: SKO
SUBJECT: Summary Online calibration
DATE: June 2016
GOAL: Summary data integration for online ratings of programs and commercials - calibration
GOAL
SKO aims to draw conclusions about online video behaviour. In order to do this, we measure viewing behaviour
regarding online video content and commercials on every kind of connected device.
This online viewing behaviour is measured through an online panel, which yields information about user profiles
and reach. On top of that, we measure online viewing behaviour in the players/devices themselves. The latter
measurement yields total census volumes about the number of starts and the amount of viewing time (in
minutes) that has been realised. The data from the online panel and the census are then combined through data
integration (or ‘calibration’), resulting in new ratings for online programs and online commercials. This type of
data integration is represented in figure 1; below, we discuss it in more detail. Kantar performs this data
integration on behalf of SKO.
Figure 1. Data integration of online panel and online census data
DATA INTEGRATION FOR RATINGS ONLINE PROGRAMS AND COMMERCIALS - CALIBRATION
The online panel is a sample. Because of this, the results will differ from the actually realised viewing volumes
measured through the devices themselves. To correct this, the panel results are calibrated with the actual census
volumes, a process that consists of a number of steps:

Modelling of reach
Online programs:
For online programs, both the reach and the viewing time are modelled. To start with, the viewing time registered
through the panel is compared with the actually realised viewing volume in the devices and adjusted accordingly.
The same is done for reach, based upon a more complex variant of the negative binomial reach model. This
method is used for every combination of program, device and time frame.
Example: the online panel yields a laptop viewing time of 80,000 minutes for the program Boer zoekt Vrouw on
Sunday evening. In the census, a total of 100,000 individual-level minutes are found for this program-device
combination. To account for this, the panel volume is corrected to 100,000 minutes, i.e. the actual viewing time or census
volume.
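The viewing-time correction in this example can be sketched as a simple scaling factor. This is a minimal illustration only: the function name `calibration_factor` is hypothetical, and the real procedure is applied per combination of program, device and time frame.

```python
def calibration_factor(panel_minutes: float, census_minutes: float) -> float:
    """Ratio by which the panel viewing time is scaled to match the census."""
    if panel_minutes <= 0:
        raise ValueError("panel_minutes must be positive")
    return census_minutes / panel_minutes


# Figures from the example: 80,000 panel minutes against 100,000 census minutes.
factor = calibration_factor(80_000, 100_000)
calibrated_minutes = 80_000 * factor
print(factor, calibrated_minutes)  # 1.25 100000.0
```

Each panel minute for this program-device combination is thus counted 1.25 times, so that the panel total equals the census volume.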
Online commercials:
For online commercials, the reach, number of starts, average playout frequency and percentages completed or
the degree to which online commercials are played out (in quartiles: 25%, 50%, 75%, 100%) are modelled. For the
number of starts, the realised starts in census are used to correct the panel. The same negative binomial reach
model is applied to determine the daily reach and playout frequency of online commercials.
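The text mentions a "more complex variant" of the negative binomial reach model without specifying it. As a point of reference only, the sketch below shows the textbook negative binomial form, in which reach is one minus the probability of zero contacts. The function names and the parameter values (`mean_contacts`, `k`) are assumptions for illustration, not SKO's actual model or settings.

```python
def nbd_reach(mean_contacts: float, k: float) -> float:
    """Share of the population with at least one contact.

    Textbook negative binomial form: P(0 contacts) = (1 + m/k) ** (-k),
    where m is the mean number of contacts per person and k is the shape
    parameter. Reach is 1 - P(0).
    """
    if mean_contacts < 0 or k <= 0:
        raise ValueError("mean_contacts must be >= 0 and k must be > 0")
    return 1.0 - (1.0 + mean_contacts / k) ** (-k)


def average_frequency(mean_contacts: float, k: float) -> float:
    """Average number of contacts among the people who were reached."""
    reach = nbd_reach(mean_contacts, k)
    return mean_contacts / reach if reach > 0 else 0.0


# Hypothetical values: 0.5 starts per person on average, shape parameter 1.
reach = nbd_reach(0.5, 1.0)
frequency = average_frequency(0.5, 1.0)
print(round(reach, 3), round(frequency, 3))
```

With these hypothetical inputs roughly a third of the population is reached, with an average of about 1.5 starts per reached person.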
In correcting, the following elements are taken into account for both the online programs and commercials:
o Co-viewing distribution
The probability that several persons in the online panel watch the same device at the same time is determined.
The co-viewing distribution derived from the panel is then applied to the total census volume.
Online programs:
For instance, when 2 people watch Goede Tijden, Slechte Tijden (GTST) on a single tablet, the total tablet volume for GTST
is doubled, because twice the amount of individual viewing time in minutes was realised.
Online commercials:
If the panel data do not contain enough information about co-viewing for a particular
commercial on a device, panel data for programs and commercials are used instead, at the level of the
combination of broadcaster(s) and device. For example, for a commercial ABC which was shown on two
broadcasters B1 and B2 on a tablet, it was observed in the panel that the co-viewing distribution was 80% solo-viewing, 10% with two viewers and 10% with three or more viewers. If for that commercial 4500 starts were
measured in the census, the device data are transformed into individual data according to the following:
(4500 starts * 80% solo-viewing * 1 viewer) + (4500 starts * 10% co-viewing * 2 viewers) + (4500 starts * 10% co-viewing * 3 viewers,
counting three or more people as 3) = 3600 + 900 + 1350 = in total 5850 starts corrected for co-viewing
In this example, the 4500 starts measured on the tablet are corrected into 5850 starts, taking into account the
proportions of co-viewing found in the panel.
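The co-viewing correction amounts to weighting each share of starts by the number of viewers behind it. The sketch below reproduces the figures of the example. The function name is hypothetical, and counting "three or more viewers" as exactly 3 is an assumption implied by the total of 5850.

```python
def starts_corrected_for_coviewing(census_starts, distribution):
    """Convert device-level starts into individual-level starts.

    distribution maps the number of viewers per start to the share of
    starts with that many viewers; the shares must sum to 1.
    """
    if abs(sum(distribution.values()) - 1.0) > 1e-9:
        raise ValueError("co-viewing shares must sum to 1")
    return sum(census_starts * share * viewers
               for viewers, share in distribution.items())


# 80% solo-viewing, 10% two viewers, 10% three or more (counted as 3).
coviewing = {1: 0.80, 2: 0.10, 3: 0.10}
corrected = starts_corrected_for_coviewing(4500, coviewing)
print(round(corrected))  # 5850
```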
o Account for demographic distribution of viewers (Search Net algorithm)
In order to allocate census viewing volume to online panel members, the demographic distribution of online
viewers as observed in the online panel is accounted for. An algorithm is used to determine the sample size
needed to draw reliable conclusions about demographic distribution.
Online programs:
Assume that 100,000 minutes were realised by personal computer viewers for the eight o’clock news on Monday.
The online panel tells us that viewers for this program/time frame/device have the following demographic
distribution:
- 6-19 yr male = 10%
- 6-19 yr female = 15%
- 20-55 yr male = 25%
- 20-55 yr female = 15%
- 55+ yr male = 15%
- 55+ yr female = 20%
It is possible that the base for this distribution is too small to correctly represent the eight o’clock news on
Monday, viewed on the personal computer. If this is the case, an algorithm is used to incorporate data about the
eight o’clock news for the past 7 days (or a longer period, if 7 days are insufficient). This process continues until
the sample is big enough to draw reliable conclusions. Once this is the case, the above distribution is
used, for instance 25,000 minutes for males between 20 and 55 years of age (25% of 100,000 minutes).
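The two ideas in this example, widening the observation window until the base is large enough and then splitting the census volume by demographic share, can be sketched as follows. This is an illustration under assumptions: the function names, the day-by-day widening, the data layout and the threshold of 10 observations are all hypothetical, since the text does not specify how the Search Net algorithm decides that a sample is sufficient.

```python
from collections import Counter


def pooled_shares(observations_by_day, min_observations):
    """Pool panel observations, most recent day first, until the base suffices.

    observations_by_day: list of lists of demographic labels (assumed layout).
    min_observations: assumed threshold; if it is never reached, all
    available days are used.
    """
    pooled = Counter()
    for day in observations_by_day:
        pooled.update(day)
        if sum(pooled.values()) >= min_observations:
            break  # base is large enough, stop widening the window
    total = sum(pooled.values())
    if total == 0:
        raise ValueError("no panel observations available")
    return {group: count / total for group, count in pooled.items()}


def allocate_minutes(census_minutes, shares):
    """Split a census volume over demographic groups according to their shares."""
    return {group: census_minutes * share for group, share in shares.items()}


# Distribution from the example in the text.
shares = {
    "6-19 yr male": 0.10, "6-19 yr female": 0.15,
    "20-55 yr male": 0.25, "20-55 yr female": 0.15,
    "55+ yr male": 0.15, "55+ yr female": 0.20,
}
allocation = allocate_minutes(100_000, shares)
print(allocation["20-55 yr male"])  # 25000.0

# Hypothetical panel data: the first day has only 2 observations,
# so a second day is pooled in to reach the threshold of 10.
days = [["20-55 yr male"] * 2,
        ["20-55 yr male"] * 3 + ["55+ yr female"] * 5]
widened = pooled_shares(days, min_observations=10)
print(widened)
```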
Online commercials:
For online commercials, a similar method is used to search for sufficient observations in order to draw reliable
conclusions about the demographic profile of viewers. For commercials, the search for panel observations is
restricted to the same publisher.
o ...and viewing behaviour between programs (Basket Analysis)
The viewing behaviour measured in the panel is used to determine the probability that people who watch a
certain program also watch another particular program. This probability is taken into account when allocating
census data to the panel. As such, the consistency of individual viewing behaviour is kept intact. For instance,
when viewers who watch GTST on a tablet also often watch Danni Lowinski, this relation between programs is
taken into account.
For online commercials, basket analysis is not applied.
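The probability described for programs is a conditional probability: of the panel members who watch program A, which share also watches program B. A minimal sketch, assuming a hypothetical data layout in which each panel member maps to the set of programs watched (the member ids and the program name "Other program" are made up for illustration):

```python
def conditional_viewing_probability(viewing, program_a, program_b):
    """Estimate P(watches program_b | watches program_a) from panel data.

    viewing maps a panel member id to the set of programs that member watched.
    """
    viewers_a = [programs for programs in viewing.values()
                 if program_a in programs]
    if not viewers_a:
        return 0.0  # nobody in the panel watched program_a
    both = sum(1 for programs in viewers_a if program_b in programs)
    return both / len(viewers_a)


# Hypothetical panel of four members.
panel_viewing = {
    "p1": {"GTST", "Other program"},
    "p2": {"GTST", "Other program"},
    "p3": {"GTST"},
    "p4": {"News"},
}
probability = conditional_viewing_probability(panel_viewing, "GTST", "Other program")
print(round(probability, 3))
```

Here two of the three GTST viewers also watch the other program, so the estimated probability is about 0.667.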
o Virtual panel expansion: online panel
In order to be able to allocate the (very granular) census data to the (much smaller) online panel, we need to
virtually expand the online panel. To do this, the panel is made 10 times bigger; on average, 10 copies are made
of each individual panel member. After expansion, the online panel of 5,000 people thus consists of 50,000 people.
Doing so allows a higher level of detail or granularity, so that the steps mentioned above can be taken into
account (e.g. shared viewing, demographic distribution, viewing behaviour consistency between programs, etc.).
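The expansion step can be pictured as copying each panel member and spreading that member's weight over the copies. The sketch below is an illustration under assumptions: the text says 10 copies are made on average, whereas this example uses exactly 10 per member, and dividing the weight equally over the copies is an assumption not stated in the text. The field names `id` and `weight` and the weight value are hypothetical.

```python
def expand_panel(panel, copies=10):
    """Create virtual copies of each panel member, preserving total weight."""
    expanded = []
    for member in panel:
        for index in range(copies):
            expanded.append({
                "id": f"{member['id']}-{index}",      # virtual member id
                "weight": member["weight"] / copies,  # assumed equal split
            })
    return expanded


# Hypothetical panel of 5,000 members, each representing 1,000 people.
panel = [{"id": str(number), "weight": 1000.0} for number in range(5000)]
virtual_panel = expand_panel(panel)
print(len(virtual_panel))  # 50000
```

Because the weights are split rather than duplicated, the expanded panel still represents the same population total as the original panel.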
o Allocation of viewing/Target implementation
Finally, the online panel is enriched with realistic monthly and daily reach per program/commercial and device.
The weighted totals of the online panel are now aligned with the census totals measured on the devices. The
results are the ratings for online programs and online commercials.