
Touch Clarity
Exploration vs Exploitation Challenge
Leonard Newnham
11-04-2006
Introduction
A real world problem
How to participate
The prize
What does Touch Clarity do?
Real-time optimisation of websites
Personalise a site by targeting relevant content at individuals
Maximise general interest by learning and serving the most popular content
Founded 2001
40 staff
Offices in London and the US
Clients include Lloyds, NatWest, HSBC, The AA, Citizens Bank, …
www.touchclarity.com
A hypothetical example...
Commercial application of the bandit problem
What news story is a visitor most likely to be interested in? Typically 5 to 20 options.
There is a clear measure of interest (they click on the link or buy the product).
Statistics vary a lot: some sites are high-traffic, some are not, and purchases are generally much less frequent.
Current solutions were developed in collaboration with PASCAL (John Shawe-Taylor, Peter Auer, Nicolò Cesa-Bianchi) using a gain-loss algorithm.
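The setting above is the classic multi-armed bandit: repeatedly serve one of a handful of options and learn from whether the visitor responds. As a deliberately naive illustration, here is an epsilon-greedy chooser; the class and its parameters are made up for this sketch and are not part of the challenge code or Touch Clarity's algorithm.

```java
import java.util.Random;

// Illustrative epsilon-greedy bandit: mostly serve the option with the best
// observed response rate, but explore a random option a fraction of the time.
public class EpsilonGreedy {
    private final int[] serves;
    private final int[] clicks;
    private final double epsilon;
    private final Random rng;

    public EpsilonGreedy(int numOptions, double epsilon, long seed) {
        this.serves = new int[numOptions];
        this.clicks = new int[numOptions];
        this.epsilon = epsilon;
        this.rng = new Random(seed);
    }

    // Pick an option: explore with probability epsilon, otherwise exploit.
    public int choose() {
        if (rng.nextDouble() < epsilon) {
            return rng.nextInt(serves.length);
        }
        int best = 0;
        for (int i = 1; i < serves.length; i++) {
            if (rate(i) > rate(best)) best = i;
        }
        return best;
    }

    // Record whether the visitor responded to the served option.
    public void update(int option, boolean clicked) {
        serves[option]++;
        if (clicked) clicks[option]++;
    }

    // Observed response rate; unserved options default to an optimistic 1.0
    // so that every option gets tried at least once.
    public double rate(int option) {
        return serves[option] == 0 ? 1.0 : (double) clicks[option] / serves[option];
    }
}
```

As the next slides show, the time dependence of real response rates is exactly what makes this textbook approach insufficient on its own.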
But... nothing is ever straightforward
Response rates vary with time
3 types of variation…
[Figure: response rate vs time — addition of options]
[Figure: response rate vs time — synchronised variation]
The response rates of some options vary in sync; the relative rankings are constant.
[Figure: response rate by day of week (Mon–Sun) — weekly variation]
Some options are more popular at weekends, e.g. football results and weather forecasts.
[Figure: response rate by hour of day (04:00–24:00) — daily variation]
Peaks are often seen at lunchtimes and in the evenings.
[Figure: response rate vs time — unsynchronised variation]
The popularity ranking of two options changes over time.
[Figure: response rate vs time — news stories]
Sudden interest may be generated by a new story.
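The variation types in the figures above can be imitated with a simple parametric model: a periodic modulation shared by all options (synchronised variation) multiplied by a per-option slow drift (unsynchronised variation). This is only a guessed shape for such functions, written for illustration; it is not the engine's actual pdfs.

```java
// Illustrative time-dependent response rate. Time t is measured in serves,
// as in the challenge, rather than in real time.
public class TimeVaryingRate {
    // Shared modulation: a cycle of the given period, e.g. a daily rhythm.
    static double modulation(long t, long period) {
        return 0.75 + 0.25 * Math.sin(2 * Math.PI * t / (double) period);
    }

    // Per-option drift: a base rate shifted linearly over time, clamped to [0, 1].
    static double optionRate(long t, double base, double driftPerServe) {
        double r = base + driftPerServe * t;
        return Math.max(0.0, Math.min(1.0, r));
    }

    // Combined probability that the option gets a response at serve t.
    static double p(long t, double base, double driftPerServe, long period) {
        return optionRate(t, base, driftPerServe) * modulation(t, period);
    }
}
```

Two options with different drifts under the same modulation reproduce the "unsynchronised" picture: their curves rise and fall together day to day, while their ranking slowly swaps.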
[Figure: bandit algorithms — panels showing response rate and percentage of serves vs time]
Challenge
We provide a test engine containing several time-dependent pdfs.
The user chooses an option A, B, C, D or E.
The test engine returns true [with a probability Px(t)] or false.
Time is approximated by the number of serves, not real time.
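As a rough stand-in while wiring up an entrant, this protocol can be mocked in a few lines. The Px(t) below is made up; only the selectOption signature and the serve-counter notion of time come from the challenge description.

```java
import java.util.Random;

// Toy stand-in for the challenge test engine: selectOption(optionNo) returns
// true with a time-dependent probability, and "time" is the serve counter.
public class MockEngine {
    private final Random rng;
    private long serves = 0; // time is approximated by number of serves

    public MockEngine(long seed) {
        this.rng = new Random(seed);
    }

    // Made-up Px(t): each option follows a slow oscillation with its own phase.
    double px(int optionNo, long t) {
        return 0.05 + 0.04 * Math.sin(t / 5000.0 + optionNo);
    }

    // Same signature as the real engine's query method.
    public boolean selectOption(int optionNo) {
        double p = px(optionNo, serves++);
        return rng.nextDouble() < p;
    }

    public long getServes() {
        return serves;
    }
}
```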
Challenge
We provide several configurations of the test engine, to break the problem down into steps:
rates changing relative to each other
overall modulation, with no relative shifts
sudden changes
different periodic behaviours
everything all together
Inside the Test Engine
[Figure: response rate vs time — a mixture of the possible behaviours]
How is the test data provided?
Challenge.jar: stand-alone Java application with an API
Matlab-code.zip: also available as a Matlab routine
Java-code.zip: includes ApplicationTest, an implementation of some simple bandit algorithms including a random baseline – Peter Auer
Entrants should write a program which repeatedly chooses from a range of possible display options and queries the visitor simulation engine through:
public boolean selectOption(int optionNo)
Available at:
http://www.pascal-network.org/Challenges/EEC/Datasets/
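An entrant's main loop then simply alternates choose, query and update against that method. The sketch below assumes a hypothetical Engine interface exposing the quoted selectOption signature; the round-robin policy is only a placeholder for a real bandit algorithm, playing a role similar to the random baseline shipped with the challenge code.

```java
// Sketch of an entrant's serve loop. `Engine` stands in for the challenge
// engine's API; only selectOption(int) is assumed, as quoted on the slide.
public class Entrant {
    interface Engine {
        boolean selectOption(int optionNo);
    }

    // Run n serves with a trivial round-robin policy and report the overall
    // response rate, which is what the judging phase scores.
    static double run(Engine engine, int numOptions, int n) {
        int successes = 0;
        for (int t = 0; t < n; t++) {
            int option = t % numOptions;     // replace with a real policy
            if (engine.selectOption(option)) {
                successes++;                 // visitor clicked / converted
            }
        }
        return successes / (double) n;
    }
}
```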
Some boundary conditions
There is no fixed number of serves – in our real-world problem we face the challenge of continuous optimisation with no fixed end date.
To prevent the engine from repeating the same behaviour every time it is restarted, the response-rate functions are parameterised. On restart, new values for these parameters are randomly generated based on a seed.
During the judging phase of the challenge, the winner will be the entrant who obtains the highest response rate over the tasks.
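With drifting rates and no end date, a lifetime average is a poor estimator, since old observations never stop dominating it. One common remedy is a recency-weighted estimate. The exponentially discounted response rate below is one illustrative option; it is a generic technique, not the gain-loss algorithm mentioned earlier.

```java
// Exponentially discounted response-rate estimate: old observations decay,
// so the estimate tracks a drifting rate. gamma close to 1 means long memory.
public class DiscountedRate {
    private double weightedClicks = 0.0;
    private double weightedServes = 0.0;
    private final double gamma; // discount factor in (0, 1)

    public DiscountedRate(double gamma) {
        this.gamma = gamma;
    }

    // Decay the running sums, then add the newest observation at full weight.
    public void update(boolean clicked) {
        weightedClicks = gamma * weightedClicks + (clicked ? 1.0 : 0.0);
        weightedServes = gamma * weightedServes + 1.0;
    }

    public double estimate() {
        return weightedServes == 0.0 ? 0.0 : weightedClicks / weightedServes;
    }
}
```

A smaller gamma reacts faster to sudden changes (such as a breaking news story) at the cost of noisier estimates on low-traffic sites.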
First three tasks
The challenge is broken into a number of tasks, each relating to a particular problem that is frequently encountered in real website data.
Task 1: rates varying coherently, i.e. an overall change in the response rate of all options, with their relative performance unchanged (SimulatedVisitorCoherentVariation)
Task 2: response rates varying independently (SimulatedVisitorSimple)
Task 3: different time-dependent functions, including periodic and non-periodic (SimulatedVisitorMixture)
The Prize
There is a prize of £1000 (€1400) for the winning algorithm.