An Adaptive Algorithm for Selecting Profitable Keywords for Search-Based Advertising Services

Paat Rusmevichientong, Cornell University, [email protected]
David P. Williamson, Cornell University, [email protected]

ABSTRACT

Increases in online search activities have spurred the growth of search-based advertising services offered by search engines. These services enable companies to promote their products to consumers based on their search queries. In most search-based advertising services, a company selects a set of keywords, determines a bid price for each keyword, and designates an ad associated with each selected keyword. When a consumer searches for one of the selected keywords, search engines then display the ads associated with the highest bids for that keyword on the search result page. A company whose ad is displayed pays the search engine only when the consumer clicks on the ad. With millions of available keywords and a highly uncertain clickthru rate associated with the ad for each keyword, identifying the most effective set of keywords and determining the corresponding bid prices becomes challenging for companies wishing to promote their goods and services via search-based advertising. Motivated by these challenges, we formulate a model of keyword selection in search-based advertising services. We develop an algorithm that adaptively identifies the set of keywords to bid on based on historical performance. The algorithm prioritizes keywords based on a prefix ordering – a sorting of keywords in descending order of profit-to-cost ratio. We show that the average expected profit generated by the algorithm converges to near-optimal profits. Furthermore, the convergence rate is independent of the number of keywords and scales gracefully with the problem's parameters. Extensive numerical simulations show that our algorithm outperforms existing methods, increasing profits by about 7%.

1. INTRODUCTION

Search-based advertising services offered by search engines enable companies to promote their products to targeted groups of consumers based on their search queries. Examples of search-based advertising services include Google Adwords (http://www.google.com/adwords), Yahoo! Sponsored Search (http://searchmarketing.yahoo.com/srch/index.php), and Microsoft's forthcoming MSN AdCenter (http://advertising.msn.com/searchadv/). In most search-based advertising services, a company wishing to advertise sets a daily budget, selects a set of keywords, and determines the bid price for each selected keyword. When consumers search for one of the selected keywords, search engines then display the ads associated with the highest bids for that keyword on the search result page. A company whose ad is displayed pays the search engine only when the consumer clicks on the ad. Note that the ad is not displayed if the company's spending that day has exceeded its budget. Figure 3 in Appendix A shows examples of the ads shown to a user under the search-based advertising programs offered by Google and Yahoo!. Details of the actual cost of each keyword, of how a search query matches a keyword, and of how different bid amounts translate to the position on the search result page vary from one search engine to another. Under the Google Adwords program, for instance, each click costs the ad's owner an amount equal to the next highest bid plus one cent.
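The next-highest-bid-plus-one-cent pricing rule just described can be sketched as follows. The function name and the treatment of the case with no lower competing bid are our own illustrative assumptions, not part of any search engine's actual interface.

```python
def next_bid_plus_penny_cpc(competing_bids, my_bid):
    """Sketch of a next-bid-plus-one-cent pricing rule: a click costs
    the highest competing bid strictly below ours, plus $0.01.
    Assumption (ours): with no lower competing bid, the advertiser
    pays only the minimum one-cent increment."""
    lower = [b for b in competing_bids if b < my_bid]
    if not lower:
        return 0.01
    return round(max(lower) + 0.01, 2)
```

For example, with competing bids of $0.40, $0.25, and $0.10 and our own bid of $0.50, a click would cost $0.41 under this rule.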
Moreover, when determining the ad's placement on the search result page, the Google Adwords program takes into account not only the bid amount but also the ad's popularity and clickthrus. Under the Yahoo! Sponsored Search program, on the other hand, the highest bidder receives the top spot. With millions of available keywords, a user of search-based advertising services faces challenging decisions. The user not only has to identify an effective set of keywords; she also needs to develop ads that will draw consumers and to determine a bid price for each keyword that balances the tradeoffs between the costs and profits that may result from each click on the ad associated with the keyword. Furthermore, much of the information that is required to compute these tradeoffs is not known in advance and is difficult to estimate a priori. For instance, the clickthru probability associated with the ad for each keyword – the probability that the consumer will click on the ad once it appears on the search result page – varies dramatically depending on the keyword, the ad itself, and the position of the ad on the search result page. In this paper, we consider one of the challenges faced by users of search-based advertising services: given a fixed daily budget and unknown clickthru probabilities, develop a policy that adaptively selects a set of keywords in order to maximize total expected profits. We assume that we have N keywords indexed by 1, 2, . . . , N. For any A ⊆ {1, . . . , N}, let the random variable Z_A denote the profit from selecting the keywords in the set A. A policy φ = (A_1, A_2, . . .) is a sequence of random variables, where for all t, A_t denotes the subset of keywords that will be selected in period t; A_t is allowed to depend on the observed outcomes in the previous t − 1 periods. For any T ≥ 1, we aim to find a policy φ = (A_1, A_2, . . .) that maximizes the total expected profit over T periods, i.e., Σ_{t=1}^{T} E[Z_{A_t}].
Identifying profitable keywords requires knowledge of the clickthru probability of the ad associated with each keyword. Unfortunately, the clickthru rate is generally not known in advance, and we can obtain an estimate of its value only by selecting the keyword and observing the resulting displays of the ad – commonly referred to as its impressions – and the number of clicks on it. Since we pay for each click on the ad, this process can potentially result in significant costs, yet it may offer an opportunity to discover potentially profitable keywords. We thus must balance the tradeoffs between selecting keywords that seem to yield high average profits based on past performance and selecting previously unused keywords in order to learn about their clickthru probabilities. This is usually cast in the machine learning literature as balancing "exploitation" (of known good options) and "exploration" (of unknown options that might be better than known options). We model the problem as follows: we assume that in every time period some number of queries arrives, where the numbers of queries in different periods are independent and identically distributed. At the beginning of each time period, we must specify a subset of keywords on which we will bid. Then queries arrive sequentially; keyword i is queried with probability λ_i. If we have enough money remaining in our budget, our ad is displayed. With some probability p_i, the user clicks on our ad; we receive profit π_i and must pay cost c_i. We assume that the probabilities λ_i, costs c_i, and profits π_i are known; justification of our ability to know the first two quantities from publicly available sources is given in Section 2.1. The probabilities p_i are unknown, and we learn them over time. Our goal is to give an algorithm that converges to near-optimal profits as the number of time periods tends towards infinity. We begin by considering the static case – when the probabilities p_i are known.
This problem is related to the well-known stochastic knapsack problem and is NP-complete. Following the approximation algorithms of Sahni [23] for the standard knapsack problem and the nonadaptive algorithm of Dean, Goemans, and Vondrak [7] for the stochastic knapsack problem, we show that we can obtain a near-optimal approximation algorithm by considering prefix orderings. We order the keywords in decreasing order of profit-to-cost ratio; that is, π_1/c_1 ≥ π_2/c_2 ≥ · · · ≥ π_N/c_N. Then our algorithm chooses a set of the form {1, . . . , ℓ} for some integer ℓ; we call such a set a prefix of the keywords. We show that if the cost of each item is sufficiently small compared to our budget, if the expected number of arrivals of any given keyword is not too large, and if the number of search queries concentrates around its mean, then in expectation our algorithm returns a near-optimal solution, where the closeness to optimality depends on how well these assumptions are satisfied. We give evidence that in practice these assumptions are quite likely to be true. Note that the static case of our problem is somewhat different from the stochastic knapsack problem. In that problem, profits are known, but sizes are drawn from arbitrary known distributions and items can be placed in the knapsack in a specified order. Here costs (corresponding to item sizes) are deterministic and known, but query arrival (corresponding to item placement) is random and not under our control. Our result in the static case guides the development of an adaptive algorithm for the case when the clickthru probabilities p_i are unknown and we must learn them over time. In each time period, we either choose a random prefix of keywords, or a prefix of keywords using our algorithm from the static case applied to our estimates of the p_i from the last time period.
We show that, averaged over the number of time periods T, in expectation our algorithm converges to near-optimal profits, where the performance guarantee is almost the same as that of the static case, modulo some terms that vanish as T goes to infinity. This result should be compared to traditional multi-armed bandit algorithms [3, 4, 8, 15, 16, 17]. These algorithms must choose one of N slot machines (or "bandits") to play in each time step. Each play of a bandit yields a reward whose distribution is not known in advance. Several algorithms have been proposed for the multi-armed bandit problem whose average expected reward also converges to the optimal. If we associate each bandit with a prefix of the keywords, we can show that in expectation these algorithms converge to near-optimal profits with the same performance guarantee as our static case. The convergence rates of these algorithms, however, depend on the number of possible keywords N, which might be extremely large in our case. In contrast, the convergence rate of our prefix-based adaptive algorithm is independent of N; it instead depends primarily on the value of the largest ℓ such that the expected cost of the prefix {1, . . . , ℓ} does not exceed our budget. We believe that in practice the value of ℓ will be significantly smaller than the total number of keywords N. Furthermore, our dependence on T is much better than that of the multi-armed bandit algorithms, so that our algorithm converges faster. Finally, when compared to two standard multi-armed bandit algorithms, extensive numerical simulations show that in as little as 30 time periods our algorithm outperforms these two algorithms, yielding significantly better profits. In one set of simulations, our profits were about 7% better, and in another about 20% better.
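The prefix idea described above can be sketched in code. This is our own simplified illustration (a greedy expected-spend cutoff), not the paper's exact threshold; each keyword is a tuple (name, profit, cost, clickthru probability, query probability), and mu is the expected number of queries per period.

```python
def best_prefix(keywords, mu, budget=1.0):
    """Sketch of prefix selection: sort keywords by profit-to-cost
    ratio and take the longest prefix whose expected spend,
    mu * sum_i c_i * p_i * lambda_i, stays within the budget.
    Each keyword is a tuple (name, profit, cost, p, lam)."""
    ranked = sorted(keywords, key=lambda kw: kw[1] / kw[2], reverse=True)
    prefix, expected_spend = [], 0.0
    for name, profit, cost, p, lam in ranked:
        expected_spend += mu * cost * p * lam   # expected cost added by this keyword
        if expected_spend > budget:
            break
        prefix.append(name)
    return prefix
```

Doubling the budget in this sketch simply extends the prefix further down the ratio-sorted list, which is why the decision space collapses from all subsets to at most N candidate prefixes.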
To the best of our knowledge, this work is the first publicly available study that addresses the problem of identifying profitable sets of keywords, realistically taking into account the uncertainty of the clickthru probability and the need to estimate its value based on historical performance. While there has been a great deal of interest from researchers in the area of search-based advertising services, much of it has focused on the design of auctions and payment mechanisms [1, 5, 6, 9, 11, 18, 21]. Our paper is one of only a few that focuses on the users of search-based advertising. Kitts et al. [12] consider the problem of finding optimal bidding strategies under a variety of objectives, but only in our static setting in which the clickthru probabilities are known in advance. We believe that removing this assumption is a crucial step towards a usable algorithm. Furthermore, Kitts et al. do not offer an algorithm for finding an optimal set of keywords when the objective is to maximize expected profit subject to a budget constraint, which again is needed given the requirements of using a service such as Google Adwords. We have attempted to construct a model of the problem that is simultaneously tractable and realistic, justifying our choices with real data whenever possible. Our resulting algorithm is simple to state and code, and provides good results in simulations, beating out several potential competitors. Our paper is structured as follows. We present our model in Section 2. We analyze the static case in Section 3. We discuss the connection of our problem with multi-armed bandits in Section 4.1, then give our algorithm in Section 4.2. We give the results of an experimental study comparing our algorithm with multi-armed bandit algorithms in Section 5. We discuss some possible extensions of our model in Section 6. Because of space constraints, many proofs and supporting figures have been placed in the appendix. 
We conclude our introduction by mentioning some other related literature. During the past few years, there has been resurgent interest in the study of online decision making with extremely large decision spaces [2, 13, 14, 24]. However, the algorithms developed in these papers are problem-specific and are not applicable to the problem considered here. We should note that Mehta et al. [19] have considered the problem faced by a search engine of how to optimally match incoming queries with the bids placed by advertisers. This problem is very different from the one studied in this paper; our focus is on the company trying to advertise rather than on the search engine providing the advertising service.

2. PROBLEM FORMULATION AND MODEL DESCRIPTION

In this section, we develop a model of search-based advertising services and formulate the problem of identifying profitable sets of keywords. Sections 2.1 and 6 provide additional discussion of the problem formulation, justification of the model's assumptions, and discussion of the publicly available data sources that can be used to estimate some of the model's parameters.

At the beginning of each time period, we start with a balance of B dollars and must select a set of keywords, which will determine the set of ads that will be shown to consumers. As a search query arrives, if the query matches one of the selected keywords and we have enough money remaining in the account, the ad associated with the keyword appears on the search result page. Each display of an ad is called an impression. If the consumer then clicks on the ad, we receive a profit, pay the cost associated with the keyword to the search engine, update our remaining balance, and wait for the arrival of another search query. The process terminates once all search queries have arrived for the time period. By appropriate scaling, we may assume without loss of generality that the beginning balance in each time period is $1.

Assume that we have a probability space (Ω, F, P) and N keywords indexed by 1, 2, . . . , N. For any t ≥ 1, let S^t denote the total number of search queries that arrive in period t. We assume that S^1, S^2, . . . are independent and identically distributed random variables with mean 1 < µ < ∞ and that the distribution of S^t is known in advance. We assume that, in each time period, the search queries arrive sequentially, and each query can correspond to any one of the N keywords. For any t ≥ 1 and r ≥ 1, let the random variable Q_r^t denote the keyword associated with the rth search query in period t. We assume that (Q_r^t : t ≥ 1, r ≥ 1) are independent and identically distributed random variables with P{Q_r^t = i} = λ_i for i = 1, . . . , N, where Σ_{i=1}^{N} λ_i = 1. We assume that the λ_i's are known in advance. In Section 2.1, we will show an approach for estimating the distribution of S^t and the parameters λ_i using publicly available information provided by search engines.

For each keyword 1 ≤ i ≤ N, let 0 ≤ p_i ≤ 1, c_i > 0, and π_i ≥ 0 denote the clickthru probability, the ad cost or cost-per-click (CPC), and the expected profit (per click) of keyword i, respectively. We will assume that the CPC and the expected profits (c_i and π_i) are known in advance and remain constant. Although the CPC of each keyword depends on the bidding behaviors of other users, we believe this requirement is a reasonable model in the short term. In Section 2.1, we will provide additional discussion of this assumption and show how we can approximate the current CPC associated with each keyword using publicly available information provided by search engines.

Let {1, . . . , N} denote the set of possible keywords on which we can bid. Let 1(·) denote the indicator function, and let the random variable B_r^{A_t} denote the remaining balance when the rth search query appears, given that we have decided to select the keywords in the set A_t ⊆ {1, . . . , N} in period t. Let X_{ri}^t be a Bernoulli random variable with parameter p_i indicating whether or not the consumer clicks on the ad associated with keyword i upon the arrival of the rth search query in period t. If we select the set of keywords A_t ⊆ {1, . . . , N} in period t, the expected profit E[Z_{A_t}] is given by

E[Z_{A_t}] = E[ Σ_{r=1}^{S^t} Σ_{i=1}^{N} π_i 1(i ∈ A_t, Q_r^t = i, B_r^{A_t} ≥ c_i, X_{ri}^t = 1) ].    (1)

We assume that, for any i, the random variables (X_{ri}^t : t ≥ 1, r ≥ 1) are independent and identically distributed and are independent of the arrival process. The above expression for the expected profit reflects our model of the search-based advertising dynamics. Under this model, we receive a profit from keyword i upon the arrival of the rth search query in period t if and only if (a) we bid on keyword i at the beginning of period t, i.e., i ∈ A_t; (b) the rth query corresponds to keyword i, i.e., Q_r^t = i; (c) we have enough balance in the account to pay for the cost of the keyword if the consumer clicks on the ad, i.e., B_r^{A_t} ≥ c_i; and (d) the consumer actually clicks on the ad, i.e., X_{ri}^t = 1.

As mentioned in the previous section, the clickthru rate p_i for any keyword i is generally not known in advance, and we can obtain an estimate of its value only by selecting the keywords and observing the resulting impressions and clicks. Therefore, we must balance the tradeoffs between choosing keywords that seem to yield high average profits based on past performance and trying previously unused keywords in order to learn about their clickthru probabilities. To analyze the tradeoff between exploitation and exploration, let us introduce the following notation. Let (F_t : t ≥ 1) denote a filtration on the probability space (Ω, F, P), where F_t corresponds to all events that have happened up to time t. A policy φ = (A_1, A_2, . . .) is a sequence of random variables, where A_t ⊆ {1, 2, . . . , N} corresponds to the set of keywords selected in period t.
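The per-period dynamics (a)–(d) above can be sketched in a small Monte Carlo simulation. This is our own illustrative mock-up of the model, not the authors' code; the function and parameter names are hypothetical.

```python
import random

def simulate_period(selected, lam, p, profit, cost, budget, num_queries):
    """One period of the model: queries arrive sequentially; query r
    asks for keyword i with probability lam[i]; if i is selected and
    the remaining balance covers cost[i], the ad is displayed (an
    impression) and is clicked with probability p[i], in which case we
    earn profit[i] and pay cost[i]."""
    keywords = list(lam.keys())
    weights = [lam[i] for i in keywords]
    balance, total_profit = budget, 0.0
    for _ in range(num_queries):
        i = random.choices(keywords, weights)[0]   # (b) which keyword is queried
        if i in selected and balance >= cost[i]:   # (a) selected and (c) affordable
            if random.random() < p[i]:             # (d) consumer clicks
                total_profit += profit[i]
                balance -= cost[i]
    return total_profit
```

With a single always-queried, always-clicked keyword of cost $0.50 and profit $2.00 on a $1 budget, the budget supports exactly two clicks, so the period profit is $4.00 regardless of how many further queries arrive.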
We will focus exclusively on a class of non-anticipatory policies, defined as follows.

Definition 1. A policy φ = (A_1, A_2, . . .) is non-anticipatory if A_t is F_t-measurable for all t ≥ 1.

Let Π denote the collection of all non-anticipatory policies. For any T ≥ 1, we aim to find a non-anticipatory policy that maximizes the expected profit over T periods, i.e., sup_{φ∈Π} Σ_{t=1}^{T} E[Z_{A_t}]. In general, the above optimization is intractable, and we aim to find an approximation algorithm that yields near-optimal expected profit.

2.1 Model Calibration

Although we aim to estimate adaptively the clickthru probability associated with the ad of each keyword based on past performance, our formulation assumes that we can estimate a priori the arrival process, the query distributions, and the current CPC of each keyword. In this section, we identify publicly available data sources that might be used to estimate these parameters. Figure 4(a) in Appendix A shows the estimated number of search queries on Yahoo! in September 2005 related to the keyword "cheap Europe trip". This data is obtained through the publicly available Keyword Selector tool provided by the Yahoo! Sponsored Search program (http://inventory.overture.com/d/searchinventory/suggestion/). Using this information, it might be possible to estimate the probability that each query will correspond to a particular keyword. Our model also assumes that the cost-per-click (CPC) associated with each keyword is known in advance. Although the actual CPC depends on the outcome of the bidding behaviors of other users, as indicated in Section 2, we believe that this requirement is a reasonable assumption in the short term. Whereas the clickthru probability associated with the ad of each keyword can be estimated only by bidding on the keywords and observing the resulting impressions and clicks, it is possible to estimate the CPC associated with each keyword using data provided by search engines.
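One simple possibility, sketched below under our own naming, is to summarize a keyword's publicly visible outstanding bids by a few standard statistics (median, k-th highest, mean) and use one of them as a CPC estimate; this is a mock-up, not an interface any search engine provides.

```python
import statistics

def estimate_cpc(bids, k=3):
    """Summarize a list of outstanding bid prices for one keyword into
    three candidate CPC estimates: the median bid, the k-th highest
    bid (clamped to the list length), and the mean bid."""
    ordered = sorted(bids, reverse=True)
    kth = ordered[min(k, len(ordered)) - 1]   # k-th highest bid
    return {
        "median": statistics.median(bids),
        "kth_max": kth,
        "mean": statistics.mean(bids),
    }
```

For instance, bids of $0.10, $0.50, $0.30, and $0.20 give a median estimate of $0.25, a 3rd-highest estimate of $0.20, and a mean estimate of $0.275.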
For instance, Figure 4(b) shows an example of the current outstanding bids (as of 10/18/05) for the keyword “cheap Europe trip” under the Yahoo! Sponsored Search Marketing program. This information is publicly available through the Bid Tools offered by Yahoo! at http://uv.bidtool. overture.com/d/search/tools/bidtool/index.jhtml. Given the outstanding bids for a keyword, we might approximate the CPC of each keyword as the median bid price, the kth maximum bid price, or the average bid price. follows directly from the general expression given in Equation 1 in Section 2; we simply drop the superscript t denoting the time period. The following lemma allows us to simplify the objective function of the STATIC BIDDING problem; its proof appears in Appendix B. Lemma 1. For any A ⊆ {1, 2, . . . , N }, r ≥ 1, and keyword i ∈ A, n o n P Qr = i, BrA ≥ ci , Xri = 1S ≥ r = λi pi P BrA ≥ ci S ≥ r and E [ZA ] = 3.1 P i∈A πi λi pi P∞ r=1 E 1 BrA ≥ ci , S ≥ r o . Computational Complexity and Cost Assumptions Note that if pi = 1 for all i, then the consumer will always click on the ad. Thus, the problem becomes a variation of the stochastic knapsack problem, which is known to be NP-complete [7, 10]. Unless P = N P , it is thus unlikely that there exists a polynomial-time algorithm for solving the STATIC BIDDING problem. In this section, we identify special structure of our problem that will enable the development of an efficient approximation algorithm in Section 3.2. Throughout this paper, we will make the following assumptions on the ordering of keywords and their costs. Assumption 1. (a) The keywords are indexed such that π1 /c1 ≥ π2 /c2 ≥ · · · ≥ πN /cN . (b) PN i=1 ci pi λi > 1. (c) There exists k ≥ 1 and 0 ≤ α < 1 such that ci ≤ 1/k and λi µ ≤ kα for all i. Before we proceed to the statement of our theorem, let us first discuss each of these assumptions. The first two assumptions are without loss of generality. 
Assumption 1(a) assumes that the keywords are sorted by a prefix ordering – in a descending order of profit-to-cost ratio. Moreover, by adding fictitious keywords with Pzero profit, we can assume without loss of generality that N i=1 ci pi λi > 1, which is the statement of Assumption 1(b). Thus our key assumption is Assumption 1(c). This assumption places constraints on the magnitude of the cost 3. A STATIC SETTING: WHEN THE PI ARE and the expected number of arrivals associated with each keyword. The requirement that ci ≤ 1/k implies that each KNOWN click of the ad associated with keyword i will cost at most In this section, we will focus on the static setting, where 1/k fraction of the total budget. As demonstrated in Figthe clickthru probability associated with the ad of each keyure 5 in Appendix A, we believe that this assumption is word is known in advance. The structure of the policy found a reasonable requirement. Figure 5 shows the distribution in this section will guide the development of the adaptive of the median bid price for each of the 842 travel-related policy in the next section. keywords. These keywords were chosen based on the auSince we know the clickthru probability associated with thors’ experience in working with an online seller of travel the ad for each keyword, the problem reduces to the following static optimization, which we will call STATIC BIDpackages. Examples of the keywords include “hawaii vaDING problem. cation package”, “africa safari”, and “london discount airfare”. A complete list of 842 keywords used in this study Z∗≡ max E [ZA ] A⊆{1,...,N } ∼ # is available at http://www.orie.cornell.edu/ paatrus/ " S N keywordlist.txt. The data is collected over a 2-day period XX = max E πi 1 i ∈ A, Qr = i, BrA ≥ ci , Xri = 1 .(10/17/05-10/18/05) from the publicly available Bid Tool A⊆{1,...N } r=1 i=1 offered by the Yahoo! 
Sponsored Search program at http:// where E [ZA ] denotes the expected profit from selecting keyuv.bidtool.overture.com/d/search/tools/bidtool/index. words in A. The above expression of the objective function jhtml. As seen from the figure, the average median bid 3.2 In addition, for our approximation algorithm to perform well, the value α should be small. Recall from Assumption 1 that for any keyword i, λi µ ≤ kα . Thus, α measures the average number of search queries for each keyword i. As shown in Figure 6 in Appendix A, most keywords have an average of less than 20 queries per day, leading us to believe that this requirement should hold for a large proportion of keywords. Figure 1 shows the value of the performance bound given in Theorem 1 (with ρ = 1.0) for various values of α and k, with k ranging from 103 to 106 . We observe, for instance, that if k is 1000, α is 0.2, and ρ is 1, Theorem 1 shows that the algorithm is within 0.68 of the optimal profit. Performance Bound for Different Values of k and α (ρ = 1.0) 1 An Approximation Algorithm The main result of this section is stated in the following theorem, whose proof is given in Section 3.3. Theorem 1. Let k and α be defined as in Assumption 1. Suppose the probabilities pi are known and 1/k1−α ≤ 1 − 1/k − 1/k(1−α)/3 . Let IU be defined by ( IU = max ` : µ X̀ ci pi λi ≤ 1 − 1 1 − (1−α)/3 k k If ρ = E[min {S, µ}]/µ, P = {1, 2, . . . , IU }, and Z denotes the optimal profit, then 1 k(1+2α)/3 − 0.8 k = 104 0.7 0.6 k = 103 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 2 k(1−α)/3 0.5 0.6 0.7 0.8 0.9 1 α . ∗ k = 105 ) i=1 ρ 1− k = 106 0.9 P er fo r m an ce B o u n d price for a keyword is $0.30-$0.35. A daily budget of $300$350/day will translate to a value of k around 1000. Of course, as the daily budget increases, so will the value of k. The second part of Assumption 1(c) requires that the expected number of search queries for keyword i, λi µ, is at most kα for some 0 ≤ α < 1. 
Although the expected number of search queries for a keyword can vary dramatically, we believe that this assumption should hold for a large proportion of keywords, as shown in Figure 6 in Appendix A. Figure 6 shows the distribution of the average daily searches on Yahoo! for the same 824 travel-related keywords during September 2005. This information is obtained through the publicly available Keyword Selection tool offered by the Yahoo! Sponsored Search program at http://inventory. overture.com/d/searchinventory/suggestion/. As seen from the figure, for over 80% of the keywords, the average number of daily searches is at most 20 searches per day. If the value of k is about 1000, this translates to the value of α of around 0.43. Figure 1: Plot of the performance bound given in Theorem 1 for various values of k and α (with ρ = 1.0). Z ∗ ≤ E [ZP ] ≤ Z ∗ . Before we proceed to the proof of the theorem, let us discuss the implications of the above result. The above theorem shows that the quality of the prefix-based approximation algorithm depends on two factors. The first factor, E [min {S, µ}] /µ, measures how tightly the distribution of S concentrates around its mean. In the most important example where S follows a Poisson distribution, this ratio can be computed explicitly and it is given in the following lemma, whose proof appears in Appendix C. Lemma 2. If S is a Poisson random variable with mean √ µ > 1, then 1 − 3/ µ ≤ E [min {S, µ}] /µ ≤ 1. 3.3 Proof of Theorem 1 The proof of Theorem 1 relies on the following lemmas. The first lemma establishes an upper bound on the optimal expected profit in terms of the optimal value of a linear program. Lemma 3. Let Z LP be defined by ( Z LP ≡ max N X i=1 ∗ Then, Z ≤ Z LP N X πi λi pi zi ) ci λi pi zi ≤ 1, 0 ≤ zi ≤ µ, ∀i i=1 . Proof. Consider any A ⊆ {1, 2, . . . N }. Let the random variable CA denote the total cost associated with bidding on keywords in A. 
By definition, we know that $C_A \leq 1$ with probability 1, which implies that

$$1 \;\geq\; E[C_A] \;=\; E\left[\sum_{r=1}^{S}\sum_{i=1}^{N} c_i\, \mathbf{1}\left(i \in A,\; Q_r = i,\; B_r^A \geq c_i,\; X_{ri} = 1\right)\right] \;=\; \sum_{r=1}^{\infty}\sum_{i\in A} c_i \lambda_i p_i\, E\left[\mathbf{1}\left(B_r^A \geq c_i,\; S \geq r\right)\right],$$

where the last equality follows from Lemma 1. Note that for any $i$, $\sum_{r=1}^{\infty} E[\mathbf{1}(B_r^A \geq c_i, S \geq r)] \leq \sum_{r=1}^{\infty} E[\mathbf{1}(S \geq r)] = E[S] = \mu$. For any $1 \leq i \leq N$, let $0 \leq z_i \leq \mu$ be defined by $z_i = \mathbf{1}(i \in A)\cdot\sum_{r=1}^{\infty} E[\mathbf{1}(B_r^A \geq c_i, S \geq r)]$. Then we have that $\sum_{i=1}^{N} c_i p_i \lambda_i z_i \leq 1$ and also that

$$E[Z_A] \;=\; E\left[\sum_{r=1}^{S}\sum_{i=1}^{N} \pi_i\, \mathbf{1}\left(i \in A,\; Q_r = i,\; B_r^A \geq c_i,\; X_{ri} = 1\right)\right] \;=\; \sum_{r=1}^{\infty}\sum_{i\in A} \pi_i \lambda_i p_i\, E\left[\mathbf{1}\left(B_r^A \geq c_i,\; S \geq r\right)\right] \;=\; \sum_{i=1}^{N} \pi_i \lambda_i p_i z_i.$$

Since $A$ is arbitrary, the desired result follows.

In most applications, the expected number of search queries $\mu$ will be very large, ranging from hundreds of thousands to millions of queries per day. If the arrivals follow a Poisson distribution, the ratio $E[\min\{S,\mu\}]/\mu$ will be very close to one. The second factor that influences the quality of our approximation algorithm is the value of $k$ and $\alpha$. Since $0 \leq \alpha < 1$, as $k$ – the relative value between the budget and the CPC for each keyword – increases, the expected profit will be close to the optimal. As indicated in the previous section, we believe that this is a reasonable assumption. If the budget for a campaign ranges from \$200 to \$400 per day and the maximum CPC for each keyword ranges from 30 to 50 cents, this translates to a value of $k$ from 400 to 1200.

The next lemma relates the optimal value $Z^{LP}$ of the linear program defined in Lemma 3 to the expected profit obtained when we select keywords according to the prefix ordering; its proof is in Appendix D.

Lemma 4. Let $H = \{1, \ldots, u\}$ where $u$ is any positive integer such that $0 < \mu \sum_{i=1}^{u} c_i \lambda_i p_i \leq 1 - 1/k - 1/k^{(1-\alpha)/3}$, so that $u \leq I_U$. Let $Z^{LP}$ denote the optimal value of the linear program defined in Lemma 3. Then,
$$\sum_{i\in H} \pi_i\lambda_i p_i \;\geq\; \left(\sum_{i\in H} c_i\lambda_i p_i\right) Z^{LP} \;\geq\; \left(\sum_{i\in H} c_i\lambda_i p_i\right) Z^{*}.$$

The final lemma provides a lower bound on the probability $P\{B_r^H \geq c_i \mid S \geq r\}$ that we have enough money remaining if the $r$th search query corresponds to keyword $i$, given that we bid on a set of keywords $H$ and $r \leq \mu$. Since the proof is fairly technical, the details are given in Appendix E.

Lemma 5. Let $H = \{1, \ldots, u\}$ where $u$ is any positive integer such that $0 < \mu \sum_{j=1}^{u} c_j \lambda_j p_j \leq 1 - 1/k - 1/k^{(1-\alpha)/3}$, so that $u \leq I_U$. Then, for any keyword $i \in H$ and $1 \leq r \leq \mu$,
$$P\left\{B_r^H \geq c_i \,\middle|\, S \geq r\right\} \;\geq\; 1 - \frac{\mu \sum_{i\in H} c_i\lambda_i p_i / k}{\left(1 - \frac{1}{k} - \mu\sum_{i\in H} c_i\lambda_i p_i\right)^2} \;\geq\; 1 - \frac{1}{k^{(1+2\alpha)/3}}.$$

Here is the proof of Theorem 1.

Proof. Note that since by hypothesis $1/k^{1-\alpha} \leq 1 - 1/k - 1/k^{(1-\alpha)/3}$, it follows that $I_U$ is well-defined. It follows from Lemma 5 that for any keyword $i \in P$ and $1 \leq r \leq \mu$, $P\{B_r^P \geq c_i \mid S \geq r\} \geq 1 - 1/k^{(1+2\alpha)/3}$. Therefore,
$$E[Z_P] \;=\; \sum_{r=1}^{\infty}\sum_{i\in P} \pi_i\lambda_i p_i\, P\left\{B_r^P \geq c_i,\, S \geq r\right\} \;\geq\; \sum_{r=1}^{\lfloor\mu\rfloor}\sum_{i\in P} \pi_i\lambda_i p_i\, P\left\{B_r^P \geq c_i,\, S \geq r\right\} \;\geq\; \left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right)\left(\sum_{r=1}^{\lfloor\mu\rfloor} P\{S \geq r\}\right)\sum_{i\in P} \pi_i\lambda_i p_i$$
$$\geq\; \frac{E[\min\{S,\mu\}]}{\mu}\left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right)\mu\sum_{i\in P} \pi_i\lambda_i p_i \;\geq\; \frac{E[\min\{S,\mu\}]}{\mu}\left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right) Z^{*}\,\mu\sum_{i\in P} c_i\lambda_i p_i \;\geq\; \frac{E[\min\{S,\mu\}]}{\mu}\left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right)\left(1 - \frac{1}{k} - \frac{1}{k^{(1-\alpha)/3}} - \frac{1}{k^{1-\alpha}}\right) Z^{*},$$
where the second inequality follows from Lemma 5, the third inequality follows from Lemma 2, the fourth inequality follows from Lemma 4, and the final inequality follows from Assumption 1(c), which implies that, for any keyword $i$, $\mu c_i\lambda_i p_i \leq 1/k^{1-\alpha}$; it then follows from the definition of $I_U$ that $\mu\sum_{i=1}^{I_U} c_i\lambda_i p_i \geq 1 - 1/k - 1/k^{(1-\alpha)/3} - 1/k^{1-\alpha}$. Since $k \geq 1$ and $0 < \alpha < 1$, $1/k^{(1-\alpha)/3} \geq 1/k^{1-\alpha}$, and we have that
$$\left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right)\left(1 - \frac{1}{k} - \frac{2}{k^{(1-\alpha)/3}}\right) \;\geq\; 1 - \frac{1}{k^{(1+2\alpha)/3}} - \frac{2}{k^{(1-\alpha)/3}} + \frac{2}{k^{(2+\alpha)/3}} - \frac{1}{k} \;\geq\; 1 - \frac{1}{k^{(1+2\alpha)/3}} - \frac{2}{k^{(1-\alpha)/3}},$$
which is the desired result.

4. AN ADAPTIVE SETTING: WHEN THE $p_i$ ARE UNKNOWN

In this section, we consider the case when the clickthru probabilities are not known in advance. To estimate these probabilities, we need to select keywords and observe the resulting impressions and clicks, a process that can potentially result in significant costs. We thus need to carefully balance the tradeoff between exploration and exploitation. In Section 4.1, we show how this problem can be formulated as a multi-armed bandit problem, where each bandit corresponds to a subset of keywords. We then aim to develop a strategy that adaptively selects bandits that yield near-optimal profits.

By formulating the problem as an instance of the multi-armed bandit problem, we can apply existing adaptive algorithms developed for this class of problems. Unfortunately, existing multi-armed bandit algorithms have performance guarantees that deteriorate as the number of keywords increases, motivating us to develop an alternative adaptive approximation algorithm that leverages the special structure of our problem. In Section 4.2, we describe our improved algorithm and show that it exhibits a superior performance bound compared with traditional multi-armed bandit algorithms. We will show that the quality of the solution generated by our algorithm is independent of the number of keywords and scales gracefully with the problem's other parameters. Then, in Section 5, we show through extensive simulations that our algorithm outperforms traditional multi-armed bandit algorithms, increasing profits by about 7%.

4.1 A Traditional Multi-Armed Bandit Approach

Theorem 1 shows that by considering only subsets of keywords that follow the prefix ordering – a descending order of profit-to-cost ratio – we can achieve near-optimal profits. This result enables us to reduce the size of the decision space from $2^N$ (all possible subsets of $N$ keywords) to just $N$. Assume that the keywords are indexed as in Assumption 1. For any $1 \leq i \leq N$, let $D_i = \{1, 2, \ldots, i\}$. We can view each $D_i$ as a decision whose payoff corresponds to the expected profit that results from selecting keywords in $D_i$, i.e. $E[Z_{D_i}]$.
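The reduction from subsets to prefixes is straightforward to implement. The sketch below is our own illustrative code, not the paper's; the `Keyword` fields mirror the model's parameters $c_i$, $\pi_i$, $\lambda_i$, $p_i$. It builds the nested decisions $D_1, \ldots, D_N$ by sorting keywords in descending order of profit-to-cost ratio:

```python
from dataclasses import dataclass

@dataclass
class Keyword:
    cost: float     # c_i: cost per click
    profit: float   # pi_i: profit per click
    arrival: float  # lambda_i: probability that a query is for this keyword
    ctr: float      # p_i: clickthru probability

def prefix_decisions(keywords):
    """Sort keywords in descending order of profit-to-cost ratio and
    return the N nested prefix decisions D_1 ⊆ D_2 ⊆ ... ⊆ D_N."""
    ranked = sorted(keywords, key=lambda w: w.profit / w.cost, reverse=True)
    return [ranked[: i + 1] for i in range(len(ranked))]
```

With $N$ keywords this yields only $N$ candidate decisions instead of $2^N$ subsets, so a bandit algorithm only has to choose among prefix indexes.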
Let $Z^{\mathrm{prefix}}$ denote the maximum profit among decisions $D_1, \ldots, D_N$, i.e. $Z^{\mathrm{prefix}} = \max_{1 \leq i \leq N} E[Z_{D_i}]$. Viewing each $D_i$ as a bandit whose reward corresponds to the profit from selecting keywords in $D_i$, we can then formulate our problem as a multi-armed bandit problem.

The multi-armed bandit problem has received much attention and many algorithms exist for this class of problems. Two such algorithms are UCB1 (see [3]) and EXP3 (see [4]). Both algorithms adaptively select a bandit in each time period based on past observations. Let $(Z_t(\mathrm{UCB1}) : t \geq 1)$ and $(Z_t(\mathrm{EXP3}) : t \geq 1)$ denote the sequences of profits obtained by the algorithms UCB1 and EXP3, respectively. The average expected profits generated by these algorithms satisfy (see [3, 4]):
$$\frac{\sum_{t=1}^{T} Z_t(\mathrm{UCB1})}{T\, Z^{\mathrm{prefix}}} \;\geq\; 1 - \frac{L\left(\alpha_N \log T + \beta_N\right)}{T\, Z^{\mathrm{prefix}}}, \qquad \frac{\sum_{t=1}^{T} Z_t(\mathrm{EXP3})}{T\, Z^{\mathrm{prefix}}} \;\geq\; 1 - \frac{3\sqrt[3]{L N \log N}}{\sqrt[3]{2T}\; Z^{\mathrm{prefix}}},$$
where $L \equiv \max\left\{\sum_{i=1}^{N} \pi_i x_i : \sum_{i=1}^{N} c_i x_i \leq 1,\; x_i \in \mathbb{N}\ \forall i\right\}$ is an upper bound on the maximum possible profit in any one period, $\alpha_N = \sum_{1 \leq i \leq N : \Delta_i > 0} 8/\Delta_i$, $\beta_N = \left(1 + \pi^2/3\right)\sum_{i=1}^{N} \Delta_i$, and for any $1 \leq i \leq N$, $\Delta_i = Z^{\mathrm{prefix}} - E[Z_{D_i}]$.

The above results show that, as $T$ increases to infinity, the average profit earned by the UCB1 and EXP3 algorithms converges to the maximum expected profit $Z^{\mathrm{prefix}}$ among $D_1, \ldots, D_N$. We know from Theorem 1 that $Z^{\mathrm{prefix}}$ provides a good approximation to $Z^*$, suggesting that the UCB1 and EXP3 algorithms may provide viable algorithms for solving our problem.

Unfortunately, these two generic algorithms for the multi-armed bandit problem ignore any special structure of the problem, and thus their rates of convergence depend on the number of keywords $N$. As the number of keywords $N$ increases, the rate of convergence decreases as well, clearly an undesirable feature in our setting, where we can have millions of keywords. In fact, the UCB1 algorithm requires us to try each of the $N$ possible decisions once during the first $N$ time periods. Clearly, for large values of $N$ as in our setting, this might not be feasible.

4.2 An Improved Adaptive Approximation Algorithm

Motivated by the result from the previous section, we will develop an alternative algorithm that exploits the special structure of our problem. We will show that our algorithm exhibits a performance bound that scales gracefully with the problem's parameters and is independent of the number of keywords. Then, through extensive simulations in Section 5, we will show that our algorithm outperforms both the UCB1 and EXP3 algorithms.

Our algorithm, which we will refer to as the ADAPTIVE BIDDING algorithm, is defined as follows.

ADAPTIVE BIDDING

• INITIALIZATION:
– For any $i$, let $y_i$ and $x_i$ denote the cumulative total number of impressions and the total number of clicks, respectively, that the ad associated with keyword $i$ has received. Set $y_i = x_i = 0$ for all $1 \leq i \leq N$. Set $\hat{p}_i^0 = 1$ for all $i$.
– Choose a sequence $(\gamma_t \in [0,1] : t \geq 1)$, where $\gamma_t$ will denote the probability that we choose a random decision at time $t$. We will later suggest good choices for the $\gamma_t$ (see the end of this section and Appendix F).

• DEFINITION: For $t = 1, 2, \ldots$
– Let $\ell_t$ be the index such that
$$\mu\sum_{u=1}^{\ell_t} c_u \lambda_u \hat{p}_u^{t-1} \;\leq\; 1 - \frac{1}{k} - \frac{2}{k^{(1-\alpha)/3}} \;<\; \mu\sum_{u=1}^{\ell_t + 1} c_u \lambda_u \hat{p}_u^{t-1}.$$
– Let $\theta_t$ denote an integer chosen uniformly at random from the set $\{1, 2, \ldots, N\}$.
– Let $F_t$ denote an independent binary random variable such that $P\{F_t = 1\} = 1 - \gamma_t$ and $P\{F_t = 0\} = \gamma_t$.
– Let the decision $g_t$ be defined by $g_t = \ell_t$ if $F_t = 1$, and $g_t = \theta_t$ otherwise.
– Let $G_t = \{1, 2, \ldots, g_t\}$. Bid on keywords in $G_t$.
– Observe the resulting impressions and clicks.
– Updates: For any keyword $i$, let $V_{it}$ and $W_{it}$ denote the number of impressions and clicks that keyword $i$ receives in this period, respectively. Then, for all $i$,
$$y_i \leftarrow y_i + V_{it}, \qquad x_i \leftarrow x_i + W_{it}, \qquad \hat{p}_i^t = \begin{cases} 1 & \text{if } y_i = 0, \\ x_i / y_i & \text{if } y_i > 0. \end{cases}$$

• OUTPUT: A sequence of decisions $(G_t : t \geq 1)$ and the corresponding sequence of indexes $(g_t : t \geq 1)$.

In the above algorithm, $\hat{p}_i^t$ represents our estimate, at the end of $t$ periods, of the clickthru probability of the ad associated with keyword $i$. Note that as long as the ad associated with keyword $i$ has not received any impressions (i.e. we have no data on the clickthru probability), we set $\hat{p}_i^t$ to 1. However, once the ad receives at least one impression, we set $\hat{p}_i^t$ to the average clickthru rate $x_i/y_i$.

For ease of exposition, let us introduce the following notation that will be used throughout the paper. Let $I_U$ be defined by
$$I_U = \max\left\{\ell : \mu\sum_{u=1}^{\ell} c_u \lambda_u p_u \leq 1 - \frac{1}{k} - \frac{1}{k^{(1-\alpha)/3}}\right\}.$$
Observe that $I_U$ is the same as the index of the prefix we would use in the algorithm of Theorem 1 if we knew the probabilities $p_i$. We will show below that we can state the convergence of our algorithm in terms of $I_U$ rather than $N$, the total number of keywords. Intuitively speaking, it does not make sense for us to choose too frequently prefixes whose index is larger than $I_U$, since this will cause us to exceed our budget.

The main result of this section is stated in the following theorem.

Theorem 2. Let $(G_t : t \geq 1)$ denote the sequence of decisions generated by the ADAPTIVE BIDDING algorithm. Let $\rho = E[\min\{S, \mu\}]/\mu$, and assume that $1/k^{1-\alpha} \leq 1 - 1/k - 1/k^{(1-\alpha)/3}$. Then, under Assumption 1, for any $T \geq 1$,
$$\frac{\sum_{t=1}^{T} E[Z_{G_t}]}{T\, Z^*} \;\geq\; \rho\left(1 - \frac{1}{k^{(1+2\alpha)/3}} - \frac{4}{k^{(1-\alpha)/3}}\right) - \frac{\sum_{t=1}^{T}\gamma_t}{T} - \frac{M}{T},$$
where
$$M = \frac{32\mu I_U}{k\rho\left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right)} + \frac{12 I_U^2 \sum_{i=1}^{I_U} \lambda_i\mu/\left(1 - e^{-\lambda_i\mu}\right)}{k^{(7-4\alpha)/3}} \;\leq\; \frac{32\mu I_U}{k\rho\left(1 - 1/\sqrt[3]{k}\right)} + \frac{12 I_U^2 \sum_{i=1}^{I_U} \lambda_i\mu/\left(1 - e^{-\lambda_i\mu}\right)}{k}.$$
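The per-period logic of ADAPTIVE BIDDING can be sketched as follows. This is our own illustrative code, not the paper's: the observe step (how impressions and clicks arrive) is left to the environment, and the function names are ours.

```python
import random

def choose_prefix_index(c, lam, p_hat, mu, k, alpha, gamma_t, rng=random):
    """One period of the ADAPTIVE BIDDING decision rule.  The lists c,
    lam, p_hat hold cost, arrival probability, and estimated clickthru
    probability, already indexed in descending profit-to-cost order."""
    threshold = 1.0 - 1.0 / k - 2.0 / k ** ((1.0 - alpha) / 3.0)
    # l_t: the longest prefix whose estimated expected spend stays below the threshold.
    l_t, spend = 0, 0.0
    for u in range(len(c)):
        spend += mu * c[u] * lam[u] * p_hat[u]
        if spend > threshold:
            break
        l_t = u + 1
    # With probability gamma_t, explore a uniformly random prefix instead.
    if rng.random() < gamma_t:
        return rng.randrange(1, len(c) + 1)
    return l_t

def update_estimate(y_i, x_i, v_it, w_it):
    """Fold this period's impressions v_it and clicks w_it for keyword i
    into the running totals; the estimate is 1 until the first impression."""
    y_i += v_it
    x_i += w_it
    p_hat = 1.0 if y_i == 0 else x_i / y_i
    return y_i, x_i, p_hat
```

Starting every estimate at $\hat{p}_i^0 = 1$ makes the rule conservative: inflated cost estimates keep early prefixes short, and the prefix grows as impressions accumulate.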
Before we proceed to the proof of the theorem, let us compare the performance bound of our adaptive bidding algorithm with those obtained from the UCB1 and EXP3 algorithms for the multi-armed bandit problem. It follows from the discussion in Section 4.1 and from Theorem 1 that, under the algorithms UCB1 and EXP3, we have the following performance bounds: let $\kappa \equiv \rho\left(1 - 1/k^{(1+2\alpha)/3} - 2/k^{(1-\alpha)/3}\right)$; then
$$\frac{\sum_{t=1}^{T} Z_t(\mathrm{UCB1})}{T\, Z^*} \;\geq\; \kappa - \frac{L\left(\alpha_N \log T + \beta_N\right)}{T\, Z^*}, \qquad \frac{\sum_{t=1}^{T} Z_t(\mathrm{EXP3})}{T\, Z^*} \;\geq\; \kappa - \frac{3\sqrt[3]{L N \log N}}{\sqrt[3]{2T}\; Z^*},$$
where $L$ denotes the maximum possible profit in any one period. When we compare the performance of UCB1 and EXP3 with the result of Theorem 2, we observe that there is a slight decrease in the guaranteed fraction of the optimal profit under our algorithm, from $2/k^{(1-\alpha)/3}$ to $4/k^{(1-\alpha)/3}$. However, the convergence rate of our algorithm is superior to that of both UCB1 and EXP3, as indicated below.

(a) Unlike its counterparts, the convergence rate of our algorithm in Theorem 2 is independent of the number of keywords $N$. Instead, it depends only on $I_U$, which reflects the maximum number of keywords (chosen based on a prefix ordering) whose expected cost does not exceed a $1 - 1/k - 1/k^{(1-\alpha)/3}$ fraction of the budget. In practice, $I_U$ should be significantly smaller than $N$.

(b) Although the convergence rate of our algorithm depends on the expected number of search queries $\mu$ in any given period, this quantity should be comparable to the constant $L$ – the maximum profit that can be obtained in any one period – appearing in the convergence rates of UCB1 and EXP3 above. To see this, note that by the definition of $L$, we have that
$$L \;\geq\; E\left[Z_{\{1,\ldots,I_U\}}\right] \;\geq\; \mu\rho\left(1 - \frac{1}{k^{(1+2\alpha)/3}}\right)\sum_{i=1}^{I_U} \pi_i\lambda_i p_i,$$
where the last inequality follows from the proof of Theorem 1. Moreover, the convergence rate of our algorithm depends on $\mu$ through the product $\mu I_U$. By definition, the value of $I_U$ decreases as $\mu$ increases, suggesting that the convergence rate of our algorithm should be insensitive to increases in the parameter $\mu$.

(c) With millions of available keywords, the arrival rate $\lambda_i\mu$ of some keywords can be extremely small. The convergence rate of our algorithm, however, is insensitive to small arrival rates because the expression $\lambda_i\mu/\left(1 - e^{-\lambda_i\mu}\right)$ converges to 1 as $\lambda_i\mu$ tends to 0.

(d) We should also point out that if we set $\gamma_t = 1/t^2$ for all $t \geq 1$, then $\sum_{t=1}^{\infty}\gamma_t < \infty$. This implies that the average reward generated by our algorithm over $T$ time periods converges to near-optimal profits at the rate of $1/T$, which is faster than the convergence rates of the other two algorithms.

(e) In their seminal paper, Lai and Robbins [16] established a lower bound on the minimum regret of any adaptive strategy, showing that for any non-anticipatory strategy $\phi$,
$$\sum_{t=1}^{T}\left\{Z^* - E[Z_t(\phi)]\right\} = \Omega(\log T),$$
where $E[Z_t(\phi)]$ denotes the expected profit in period $t$ under the strategy $\phi$. This result shows that the total expected profit over $T$ periods of any adaptive strategy must differ from the optimal expected profit by at least $\Omega(\log T)$; thus, the average expected reward per time period cannot converge to the optimal expected profit faster than $O((\log T)/T)$. We should note that our result in Theorem 2 is consistent with this lower bound. According to Theorem 2, we have that
$$\sum_{t=1}^{T}\left\{Z^* - E[Z_{G_t}]\right\} \;\leq\; \nu Z^* T + Z^*\left(\sum_{t=1}^{T}\gamma_t + M\right),$$
where $\nu = 1 - \rho\left(1 - 1/k^{(1+2\alpha)/3} - 4/k^{(1-\alpha)/3}\right)$. The upper bound on the right-hand side of the above inequality increases linearly with $T$: intuitively, the average expected profit under our algorithm converges to the optimal expected profit under the prefix ordering, which serves only as an approximation to the optimal expected profit (from among all subsets of keywords).

Finally, in addition to nicer theoretical properties, in Section 5, through simulations on many large problem instances, we will further show that our adaptive bidding algorithm outperforms both UCB1 and EXP3, increasing profits by an average of 7%.
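Point (d) is easy to check numerically: with $\gamma_t = 1/t^2$ the partial sums of the exploration probabilities stay below $\pi^2/6 \approx 1.645$, so the exploration term $\sum_{t\leq T}\gamma_t/T$ in Theorem 2 decays like $1/T$. A quick check (our own illustrative code):

```python
import math

def average_exploration(T):
    """(1/T) * sum_{t=1}^{T} gamma_t for the choice gamma_t = 1/t^2."""
    return sum(1.0 / t ** 2 for t in range(1, T + 1)) / T

# The partial sums are bounded by pi^2/6, so the average decays like 1/T.
for T in (10, 100, 1000):
    assert average_exploration(T) <= (math.pi ** 2 / 6) / T
```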
4.3 Sketch of the Proof of Theorem 2

We provide a sketch of the proof of Theorem 2. For more details, the reader is referred to Appendix G.

In Lemma 8 in Appendix G, we establish an upper bound on the probability that $\ell_t > I_U$, showing that for all $t \geq 1$, $P\{\ell_t > I_U\} \leq 1/k^{(1-\alpha)/3}$. For any $T \geq 1$, let the random variable $J_T \subseteq \{1, 2, \ldots, T\}$ denote the set of time periods when the index of the last keyword chosen based on the prefix ordering is less than or equal to $I_U$, i.e. $J_T = \{t \leq T : F_t = 1, \ell_t \leq I_U\}$. Note that when $F_t = 1$, the algorithm chooses the decision based on the prefix ordering, i.e. $g_t = \ell_t$. The result of Lemma 8 shows that $J_T$ will contain a large fraction of the time periods, enabling us to restrict our attention to time periods in $J_T$.

Let $\xi = 1 - 1/k^{(1+2\alpha)/3}$. For any keyword $i$, let $\zeta_{i,T} = E\left[\sum_{t\in J_T : g_t \geq i}\left(p_i - \hat{p}_i^{t-1}\right)\right]$ denote the expected cumulative error during time periods in $J_T$ between our estimate of the clickthru probability for keyword $i$ and its true value. Lemma 9 (in Appendix G) then relates the performance of our algorithm to the accuracy of our estimates of the clickthru probabilities associated with keywords $1, \ldots, I_U$ during time periods in $J_T$, showing that for any $T \geq 1$,
$$\frac{\sum_{t=1}^{T} E[Z_{G_t}]}{T\, Z^*} \;\geq\; \rho\left(\xi - \frac{3}{k^{(1-\alpha)/3}}\right) - \frac{\sum_{t=1}^{T}\gamma_t}{T} - \rho\xi\,\frac{\mu}{T}\sum_{i=1}^{I_U} c_i\lambda_i\,\zeta_{i,T}.$$

In Lemma 10 in Appendix G, we then show that, on average, the cumulative errors over time periods in $J_T$ do not increase with $T$. In fact, the sum is bounded by a small constant plus a term that vanishes to zero as $T$ increases, i.e. for any $T \geq 1$ and $0 < \delta < 1$,
$$\frac{\mu}{T}\sum_{i=1}^{I_U} c_i\lambda_i\,\zeta_{i,T} \;\leq\; \frac{1}{k^{(1-\alpha)/3}} + \frac{1}{T}\left(\frac{8\mu I_U}{\delta^2 k\rho^2\xi^2} + \frac{6 I_U^2\sum_{i=1}^{I_U}\lambda_i\mu/\left(1 - e^{-\lambda_i\mu}\right)}{k^{(7-4\alpha)/3}(1-\delta)\rho\xi}\right).$$
The result of Theorem 2 follows from the last two inequalities and from setting $\delta = 1/2$.

The intuition underlying the above inequality is the following. Consider any keyword $i \leq I_U$. Suppose $J_T$ consists of $t_1 < \ldots < t_w$, and consider any $t_{h+1} \in J_T$. During the previous $h$ periods $t_1, \ldots, t_h$, the total expected number of queries associated with keyword $i$ is $\lambda_i\mu h$. By the definition of $J_T$, the prefix chosen during these $h$ periods has final index at most $I_U$. We can thus apply Lemma 5 to show that with probability at least $\xi$ we have enough budget to display ads for each of these $\lambda_i\mu h$ queries of keyword $i$. Hence, the expected cumulative number of impressions that the ad for keyword $i$ will receive just before period $t_{h+1}$ is at least $\Omega(\xi\lambda_i\mu h)$. We can then apply a Chernoff bound to show that, at time $t_{h+1}$, the error between our empirical estimate of the probability and its true value is at most $O\left(e^{-\xi\lambda_i\mu h}\right)$, declining to zero exponentially in $h$. Thus, the total error across all time periods in $J_T$ (summing over $h$) will be finite and independent of $T$.

5. EXPERIMENTS

In the previous section, we developed an alternative algorithm that exploits the prefix ordering of the keywords. We showed that the average expected profit generated by the algorithm converges to near-optimal profits and that the convergence rate is independent of the number of keywords. In this section, we compare the performance of the algorithm against the UCB1 and EXP3 algorithms for multi-armed bandit problems.

We ran two experiments: SMALL and LARGE. Each experiment has 100 randomly generated problem instances. In the SMALL experiment, each problem instance has 8,000 keywords, 200 time periods, a \$400 budget per period, and a number of search queries in each time period with a Poisson distribution with a mean of 40,000. In the LARGE experiment, each problem instance has 50,000 keywords, 200 time periods, a \$1,000 budget per period, and a number of search queries in each time period with a Poisson distribution with a mean of 150,000. (Since the mean number of search queries $\mu$ is very large, to speed up the simulation we approximate the Poisson distribution using a Gaussian distribution with mean $\mu$ and standard deviation $\sqrt{\mu}$.)

In both sets of experiments, for each problem instance, the cost-per-click of each keyword is chosen uniformly at random from the interval [0.1, 0.3) and the profit-per-click of each keyword is chosen uniformly at random from the interval [0, 1). The clickthru rate associated with each keyword is a random number from [0.0, 0.2). To generate the probability that a search query will correspond to each keyword, we assign random numbers (chosen independently and uniformly from [0, 1)) to all keywords and normalize them.

For each problem instance, we compare our algorithm with EXP3 and UCB1, two standard algorithms for the multi-armed bandit problem (see Section 4.1 for more details). For EXP3, we tested three different settings of the exploration probability – 1%, 5%, and 10% – and since all three settings exhibited similar performance, we set the exploration probability to 5%. The UCB1 algorithm requires us to try each of the $N$ prefix sets once during the first $N$ periods; we choose these sets by random sampling from among those that have not been tried. For our prefix-based algorithm, we set the randomization probability $\gamma_t$ to $\gamma_t = 1/t^2$ for all $t \geq 1$.

Figure 2 (a) and (b) shows the simulation results for the SMALL experiment. Figure 2(a) shows the average profit over time for all three algorithms, averaged over all 100 problem instances in the SMALL experiment. The dashed lines above and below the solid lines correspond to the 95% confidence intervals. As seen from the figure, our prefix-based algorithm outperforms both the UCB1 and EXP3 algorithms, yielding an increase of about 7% in average profits. Figure 2(b) shows the distribution over the 100 problem instances (in the SMALL experiment) of the relative difference between the linear programming upper bound (Lemma 3) and the profit under our algorithm. As seen from the figure, the profit generated by our algorithm differs from the linear programming bound by about 30%. Since the optimal solution of the linear program provides an upper bound on the optimal expected profit, this result implies that the average profit generated by our algorithm differs from the optimal by at most 30%.

Note that for the SMALL experiment, the value of $k$ is about 1200 and the value of $\alpha$ is around 0.33 (see Assumption 1 for more details). The bound of Theorem 2 implies that for these values our algorithm should converge to at least .16 of the linear programming bound, but our simulation results show that our algorithm performs much better than this. As indicated by Theorems 1 and 2, as the value of $k$ increases and the value of $\alpha$ decreases, the average profit should approach the optimal. This is confirmed by the simulation results for the LARGE experiment, shown in Figure 2 (c) and (d). For the LARGE experiment, the values of $k$ and $\alpha$ are around 3,333 and 0.22, respectively. For these values, the bound of Theorem 2 implies that our algorithm should converge to at least .61 of the linear programming bound, and again in our simulations we are doing better than this. In Figure 2(c), we see that the average profit generated by our algorithm is about 20% higher than the profits under the UCB1 and EXP3 algorithms. In addition, as shown in Figure 2(d), the average profit differs from the optimal by at most 20%.

Appendix F provides additional discussion of the impact of different choices of the randomization parameters $\gamma_t$ and of the initialization of the estimated clickthru probabilities.

6. MODEL DISCUSSIONS AND EXTENSIONS

In this section, we provide additional discussion of the strengths and weaknesses of our formulation, along with possible extensions of our model. First, in most search-based advertising services, changing the bid price of each keyword will alter the position of the ad on the search result page.
This work focuses primarily on identifying profitable subsets of keywords; our formulation assumes that the bid price of each keyword remains constant and that the ad will always appear in the same spot on the search result page.

Figure 2: The figures in the top row ((a) and (b)) and the bottom row ((c) and (d)) show the results for the SMALL and LARGE experiments, respectively. The figures in the left-hand column ((a) and (c)) show a comparison of the average profit over time under UCB1, EXP3, and our prefix-based algorithm; the dashed lines represent 95% confidence intervals. The figures in the right-hand column ((b) and (d)) show the relative difference between the linear programming bound and the average profit generated by our algorithm over the 100 problem instances.
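For concreteness, the random instance generation used in the experiments of Section 5 can be sketched as follows. This is our own illustrative code under the stated distributions; the Gaussian query count mirrors the Poisson approximation described there.

```python
import random

def generate_instance(n_keywords, rng):
    """One random problem instance following Section 5: CPCs uniform on
    [0.1, 0.3), profits per click uniform on [0, 1), clickthru rates
    uniform on [0.0, 0.2), and normalized query probabilities."""
    cost = [rng.uniform(0.1, 0.3) for _ in range(n_keywords)]
    profit = [rng.random() for _ in range(n_keywords)]
    ctr = [0.2 * rng.random() for _ in range(n_keywords)]
    raw = [rng.random() for _ in range(n_keywords)]
    total = sum(raw)
    arrival = [x / total for x in raw]   # query probabilities sum to one
    return cost, profit, ctr, arrival

def sample_query_count(mu, rng):
    """Queries per period: Gaussian approximation to Poisson(mu) with
    mean mu and standard deviation sqrt(mu), valid for large mu."""
    return max(0, round(rng.gauss(mu, mu ** 0.5)))
```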
Incorporating the impact of the ad's position on the search result page would allow for a richer class of models. Finding an appropriate model that enables the development of efficient computational methods remains an open question. Another open question is that of finding an appropriate model for competitive bidding for keywords. Our model assumes that the cost of a keyword is fixed. Of course, these costs depend on what others bid for the same keywords. Incorporating competitive aspects into our model is an interesting and challenging research direction.

7. REFERENCES
[1] G. Aggarwal and J. D. Hartline, "Knapsack Auctions," Workshop on Sponsored Search Auctions, EC 2005.
[2] R. Agrawal, "The continuum-armed bandit problem," SIAM J. Control and Optimization, vol. 33, pp. 1926–1951, 1995.
[3] P. Auer, N. Cesa-Bianchi, P. Fischer, "Finite-time Analysis of the Multiarmed Bandit Problem," Machine Learning, pp. 235–256, 2002.
[4] P. Auer, N. Cesa-Bianchi, Y. Freund, R. E. Schapire, "Gambling in a rigged casino: The adversarial multi-armed bandit problem," SODA 1995.
[5] M.-F. Balcan, A. Blum, J. D. Hartline, Y. Mansour, "Sponsored Search Auction Design via Machine Learning," Workshop on Sponsored Search Auctions, EC 2005.
[6] J. Chen, D. Liu, A. B. Whinston, "Designing Share Structure in Auctions of Divisible Goods," Workshop on Sponsored Search Auctions, EC 2005.
[7] B. Dean, M. X. Goemans, and J. Vondrák, "Approximating the Stochastic Knapsack Problem: The Benefit of Adaptivity," FOCS 2004, pp. 208–217.
[8] E. Even-Dar, S. Mannor, and Y. Mansour, "PAC bounds for multi-armed bandit and Markov decision processes," Fifteenth Annual Conference on Computational Learning Theory, pp. 255–270, 2002.
[9] J. Goodman, "Pay-Per-Percentage of Impressions: An Advertising Method that is Highly Robust to Fraud," Workshop on Sponsored Search Auctions, EC 2005.
[10] R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan, "Optimization and approximation in deterministic sequencing and scheduling: A survey," Annals of Discrete Mathematics, 5:287–326, 1979.
[11] B. J. Jansen and M. Resnick, "Examining Searcher Perceptions of and Interactions with Sponsored Results," Workshop on Sponsored Search Auctions, EC 2005.
[12] B. Kitts, P. Laxminarayan, B. LeBlanc, R. Meech, "A Formal Analysis of Search Auctions Including Predictions on Click Fraud and Bidding Tactics," Workshop on Sponsored Search Auctions, EC 2005.
[13] R. Kleinberg, "Online Decision Problems with Large Strategy Sets," Ph.D. dissertation, MIT, 2005.
[14] R. Kleinberg, "Nearly tight bounds for the continuum-armed bandit problem," NIPS 2005.
[15] S. R. Kulkarni and G. Lugosi, "Finite-time lower bounds for the two-armed bandit problem," IEEE Trans. Aut. Control, pp. 711–714, 2000.
[16] T. L. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Advances in Applied Mathematics, vol. 6, pp. 4–22, 1985.
[17] S. Mannor and J. N. Tsitsiklis, "Lower bounds on the sample complexity of exploration in the multi-armed bandit problem," Sixteenth Annual Conference on Computational Learning Theory, pp. 418–432, Springer, 2003.
[18] C. Meek, D. M. Chickering, D. B. Wilson, "Stochastic and Contingent-Payment Auctions," Workshop on Sponsored Search Auctions, EC 2005.
[19] A. Mehta, A. Saberi, U. Vazirani, and V. Vazirani, "Adwords and Generalized Online Matching," FOCS 2005.
[20] M. Mitzenmacher and E. Upfal, Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Cambridge University Press, 2005.
[21] D. C. Parkes and T. Sandholm, "Optimize-and-Dispatch Architecture for Expressive Ad Auctions," Workshop on Sponsored Search Auctions, EC 2005.
[22] D. Pollard, Convergence of Stochastic Processes, Springer-Verlag, 1984.
[23] S. Sahni, "Approximate Algorithms for the 0/1 Knapsack Problem," Journal of the ACM, 22:115–124, 1975.
[24] M. Zinkevich, "Online convex programming and generalized infinitesimal gradient ascent," Proceedings of the 20th International Conference on Machine Learning, pp. 928–936, 2003.

APPENDIX

A. FIGURES

Figure 3: Examples of ads shown under search-based advertising programs. The left and right figures show the result of searching for "cheap Europe trip" (on 10/18/05) on Google and Yahoo!, respectively. The links in the top (shaded) area and in the right-hand column of the search result page correspond to ads by companies participating in the search-based advertising programs.

Figure 4: Figure (a) is a screen shot showing the estimated number of search queries on Yahoo! in September 2005 related to the keyword "cheap Europe trip". The data is available through the Keyword Selector Tool offered by Yahoo! at http://inventory.overture.com/d/searchinventory/suggestion/. Figure (b) shows the current bids (as of 10/18/05) for the same keyword under the Yahoo! Sponsored Marketing program. This information is publicly available through the Bid Tools offered by Yahoo! at http://uv.bidtool.overture.com/d/search/tools/bidtool/index.jhtml.

Figure 5: A distribution of the median bid price under the Yahoo! Sponsored Search program for 842 travel-related keywords collected during a 2-day period from 10/17/05 to 10/18/05.

B. PROOF OF LEMMA 1

Proof. Recall that $B_r^A$ denotes the remaining account balance when the $r$th search query arrives and we bid on keywords in $A$. This implies that the event $\{B_r^A \geq c_i\}$ depends only on the previous $r-1$ search queries; it does not depend on whether the consumer clicks on the ad associated with keyword $i$ when the $r$th search query arrives, nor does it depend on
what the $r$th search query corresponds to. Thus,
$$P\left\{Q_r = i,\, B_r^A \geq c_i,\, X_{ri} = 1 \,\middle|\, S \geq r\right\} = E\left[\mathbf{1}\left(Q_r = i,\, B_r^A \geq c_i,\, X_{ri} = 1\right)\,\middle|\, S \geq r\right] = E\left[\mathbf{1}\left(B_r^A \geq c_i\right)\,\middle|\, S \geq r\right]\, E\left[\mathbf{1}\left(Q_r = i,\, X_{ri} = 1\right)\,\middle|\, S \geq r\right]$$
$$= \lambda_i p_i\, E\left[\mathbf{1}\left(B_r^A \geq c_i\right)\,\middle|\, S \geq r\right] = \lambda_i p_i\, P\left\{B_r^A \geq c_i \,\middle|\, S \geq r\right\},$$
where the third equality follows from our assumption that the clickthrus of the ads are independent of the keyword corresponding to the $r$th search query, and both of these random variables are independent of the total number of arrivals $S$. The second equality in the lemma follows directly from the above result.

Figure 6: A distribution of average daily searches on Yahoo! for 842 travel-related keywords during September 2005. A complete list of the 842 keywords used in this study can be found at http://www.orie.cornell.edu/~paatrus/keywordlist.txt.

C. PROOF OF LEMMA 2

The proof of Lemma 2 makes use of the following bound on the tail of a Poisson random variable. Since the proof follows from a standard application of the Chernoff bound (cf. Theorem 4.5 in [20]), we omit the details.

Lemma 6. If $S$ is a Poisson random variable with mean $\mu$, then for any $0 < \delta < 1$, $P\{S \leq (1-\delta)\mu\} \leq e^{-\mu\delta^2/2}$.

Here is the proof of Lemma 2.

Proof. We have that
$$\sum_{r=1}^{\lfloor\mu\rfloor} P\{S \geq r\} = \sum_{r=1}^{\lfloor\mu\rfloor}\left(1 - P\{S < r\}\right) = \lfloor\mu\rfloor - \sum_{r=1}^{\lfloor\mu\rfloor} P\{S \leq r-1\}.$$
It follows from Lemma 6 that for any $1 \leq r \leq \lfloor\mu\rfloor$,
$$P\{S \leq r-1\} = P\left\{S \leq \frac{r-1}{\mu}\,\mu\right\} \leq \exp\left\{-\frac{\mu}{2}\left(1 - \frac{r-1}{\mu}\right)^2\right\} = e^{-(1+\mu-r)^2/2\mu} \leq e^{-(1+\lfloor\mu\rfloor-r)^2/2\mu},$$
which implies that
$$\sum_{r=1}^{\lfloor\mu\rfloor} P\{S \leq r-1\} \leq \sum_{r=1}^{\lfloor\mu\rfloor} e^{-(1+\lfloor\mu\rfloor-r)^2/2\mu} = \sum_{r=1}^{\lfloor\mu\rfloor} e^{-r^2/2\mu} \leq \int_0^{\infty} e^{-x^2/2\mu}\,dx = \frac{\sqrt{2\pi\mu}}{2}.$$
Therefore, we have that
$$\sum_{r=1}^{\lfloor\mu\rfloor} P\{S \geq r\} \geq \lfloor\mu\rfloor - \frac{\sqrt{2\pi\mu}}{2} \geq \mu\left(1 - \frac{1}{\mu} - \frac{\sqrt{2\pi}}{2\sqrt{\mu}}\right) \geq \mu\left(1 - \frac{3}{\sqrt{\mu}}\right),$$
where the last inequality follows from our assumption that $\mu > 1$, which implies that $1/\mu < 1/\sqrt{\mu}$, and the fact that $\sqrt{2\pi}/2 \leq 2$.

D. PROOF OF LEMMA 4

Proof. Since the second inequality follows directly from Lemma 3, it suffices to prove only the first inequality. Let $m$ denote the index such that $\mu\sum_{i=1}^{m} c_i\lambda_i p_i \leq 1 < \mu\sum_{i=1}^{m+1} c_i\lambda_i p_i$. Since $\pi_1/c_1 \geq \pi_2/c_2 \geq \cdots \geq \pi_N/c_N$, it follows from the definition of $Z^{LP}$ that
$$Z^{LP} = \mu\sum_{i=1}^{m}\pi_i\lambda_i p_i + \pi_{m+1}\lambda_{m+1}p_{m+1}\,\frac{1 - \mu\sum_{i=1}^{m} c_i\lambda_i p_i}{c_{m+1}\lambda_{m+1}p_{m+1}}.$$
It is easy to verify that $u < m$, since $\mu c_i\lambda_i p_i \leq 1/k^{1-\alpha} \leq 1/k^{(1-\alpha)/3}$ for all $i$ by Assumption 1(c). Moreover, since $\pi_1/c_1 \geq \cdots \geq \pi_N/c_N$, we have that
$$\frac{\sum_{i=1}^{u}\pi_i\lambda_i p_i}{\sum_{i=1}^{u}c_i\lambda_i p_i} \geq \frac{\sum_{i=u+1}^{m}\pi_i\lambda_i p_i}{\sum_{i=u+1}^{m}c_i\lambda_i p_i} \qquad\text{and}\qquad \frac{\sum_{i=1}^{u}\pi_i\lambda_i p_i}{\sum_{i=1}^{u}c_i\lambda_i p_i} \geq \frac{\pi_{m+1}\lambda_{m+1}p_{m+1}}{c_{m+1}\lambda_{m+1}p_{m+1}},$$
which implies that
$$\sum_{i=u+1}^{m}\pi_i\lambda_i p_i \leq \left(\sum_{i=1}^{u}\pi_i\lambda_i p_i\right)\frac{\sum_{i=u+1}^{m}c_i\lambda_i p_i}{\sum_{i=1}^{u}c_i\lambda_i p_i} \qquad\text{and}\qquad \pi_{m+1}\lambda_{m+1}p_{m+1} \leq \left(\sum_{i=1}^{u}\pi_i\lambda_i p_i\right)\frac{c_{m+1}\lambda_{m+1}p_{m+1}}{\sum_{i=1}^{u}c_i\lambda_i p_i}.$$
Putting everything together, we have
$$Z^{LP} = \mu\sum_{i=1}^{u}\pi_i\lambda_i p_i + \mu\sum_{i=u+1}^{m}\pi_i\lambda_i p_i + \pi_{m+1}\lambda_{m+1}p_{m+1}\,\frac{1 - \mu\sum_{i=1}^{m}c_i\lambda_i p_i}{c_{m+1}\lambda_{m+1}p_{m+1}} \leq \mu\left(\sum_{i=1}^{u}\pi_i\lambda_i p_i\right)\left(1 + \frac{\sum_{i=u+1}^{m}c_i\lambda_i p_i}{\sum_{i=1}^{u}c_i\lambda_i p_i} + \frac{1 - \mu\sum_{i=1}^{m}c_i\lambda_i p_i}{\mu\sum_{i=1}^{u}c_i\lambda_i p_i}\right) = \frac{\sum_{i=1}^{u}\pi_i\lambda_i p_i}{\sum_{i=1}^{u}c_i\lambda_i p_i},$$
which is the desired result.

E. PROOF OF LEMMA 5

The proof of Lemma 5 makes use of the following result.
Lemma 7. For any $A \subseteq \{1, 2, \ldots, N\}$,
$$\mathrm{Var}\left(\sum_{s=1}^{r-1}\sum_{i\in A} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1) \,\middle|\, S \geq r\right) \leq \frac{r-1}{k}\sum_{i\in A} c_i\lambda_i p_i.$$

Proof. Since the $Q_s$'s are independent and, for any $i$, the $X_{si}$'s are also independent, it suffices to show that for any $s < r$,
$$\mathrm{Var}\left(\sum_{i\in A} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1) \,\middle|\, S \geq r\right) \leq \frac{1}{k}\sum_{i\in A} c_i\lambda_i p_i.$$
Note that since $Q_s$ and $X_{si}$ are independent, we have that $E[\mathbf{1}(Q_s = i, X_{si} = 1) \mid S \geq r] = \lambda_i p_i$, which implies that
$$\mathrm{Var}\left(\sum_{i\in A} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1) \,\middle|\, S \geq r\right) = E\left[\left(\sum_{i\in A} c_i\left(\mathbf{1}(Q_s = i,\, X_{si} = 1) - p_i\lambda_i\right)\right)^2 \,\middle|\, S \geq r\right]$$
$$= \sum_{i\in A} c_i^2\, E\left[\left(\mathbf{1}(Q_s = i,\, X_{si} = 1) - p_i\lambda_i\right)^2 \,\middle|\, S \geq r\right] + \sum_{i,j\in A:\, i\neq j} c_i c_j\, E\left[\left(\mathbf{1}(Q_s = i,\, X_{si} = 1) - \lambda_i p_i\right)\left(\mathbf{1}(Q_s = j,\, X_{sj} = 1) - \lambda_j p_j\right) \,\middle|\, S \geq r\right]$$
$$= \sum_{i\in A} c_i^2\,\lambda_i p_i\left(1 - \lambda_i p_i\right) - \sum_{i,j\in A:\, i\neq j} c_i c_j\,\lambda_i p_i\lambda_j p_j \leq \frac{1}{k}\sum_{i\in A} c_i\lambda_i p_i,$$
where the third equality follows from the fact that $\mathbf{1}(Q_s = i)\mathbf{1}(Q_s = j) = 0$ for all $i \neq j$. The final inequality follows from Assumption 1(c), which gives $c_i \leq 1/k$ for all $i$.

Here is the proof of Lemma 5.

Proof. For any keyword $i \in H$, let $T_r^H$ denote the total amount of money that we have already spent when the $r$th search query arrives. Then,
$$P\left\{B_r^H \geq c_i \,\middle|\, S \geq r\right\} = P\left\{T_r^H \leq 1 - c_i \,\middle|\, S \geq r\right\} = 1 - P\left\{T_r^H > 1 - c_i \,\middle|\, S \geq r\right\}.$$
We will focus on developing an upper bound for the expression $P\{T_r^H > 1 - c_i \mid S \geq r\}$. By the definition of $T_r^H$, we know that, with probability 1,
$$T_r^H = \sum_{s=1}^{r-1}\sum_{i\in H} c_i\,\mathbf{1}\left(Q_s = i,\, B_s^H \geq c_i,\, X_{si} = 1\right) \leq \sum_{s=1}^{r-1}\sum_{i\in H} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1).$$
Moreover, by the hypothesis of the lemma,
$$E\left[\sum_{s=1}^{r-1}\sum_{i\in H} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1) \,\middle|\, S \geq r\right] = (r-1)\sum_{i\in H} c_i\lambda_i p_i \leq \mu\sum_{i\in H} c_i\lambda_i p_i \leq 1 - \frac{1}{k} - \frac{1}{k^{(1-\alpha)/3}},$$
and it follows from Chebyshev's inequality and the previous lemma that
$$P\left\{T_r^H > 1 - c_i \,\middle|\, S \geq r\right\} \leq P\left\{\sum_{s=1}^{r-1}\sum_{i\in H} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1) > 1 - c_i \,\middle|\, S \geq r\right\} \leq \frac{\mathrm{Var}\left(\sum_{s=1}^{r-1}\sum_{i\in H} c_i\,\mathbf{1}(Q_s = i,\, X_{si} = 1) \,\middle|\, S \geq r\right)}{\left(1 - c_i - (r-1)\sum_{i\in H} c_i\lambda_i p_i\right)^2}$$
$$\leq \frac{\frac{r-1}{k}\sum_{i\in H} c_i\lambda_i p_i}{\left(1 - c_i - (r-1)\sum_{i\in H} c_i\lambda_i p_i\right)^2} \leq \frac{\mu\sum_{i\in H} c_i\lambda_i p_i / k}{\left(1 - \frac{1}{k} - \mu\sum_{i\in H} c_i\lambda_i p_i\right)^2} \leq \frac{k^{(2-2\alpha)/3}}{k}\,\mu\sum_{i\in H} c_i\lambda_i p_i \leq \frac{1}{k^{(1+2\alpha)/3}},$$
which is the desired result.

F. IMPACT OF RANDOMIZATION AND INITIALIZATION

In this section, we explore the impact of different choices of randomization and initialization. We ran our algorithm on the 100 problem instances in the LARGE experiment using four different randomization and initialization settings:
(a) No randomization ($\gamma_t = 0\ \forall t$), with the initial estimate of the clickthru probability associated with the ad of each keyword set to zero ($\hat{p}_i^0 = 0\ \forall i$).
(b) $1/t$ randomization ($\gamma_t = 1/t\ \forall t$), with the initial estimate of the clickthru probability set to 1 ($\hat{p}_i^0 = 1\ \forall i$).
(c) $1/t^2$ randomization ($\gamma_t = 1/t^2\ \forall t$), with the initial estimate of the clickthru probability set to 1 ($\hat{p}_i^0 = 1\ \forall i$).
(d) No randomization ($\gamma_t = 0\ \forall t$), with the initial estimate of the clickthru probability set to 1 ($\hat{p}_i^0 = 1\ \forall i$).

Figure 7 shows the average profit over time under each of the four parameter settings. Recall that our ADAPTIVE BIDDING algorithm sets the initial estimate of the clickthru probability to 1, corresponding to cases (b), (c), and (d). In this case, we clearly see the benefits of randomization; allowing for random decisions with small probability (either $\gamma_t = 1/t$ or $1/t^2$) yields significantly higher average profits.
It is interesting to note that although setting $\gamma_t = 1/t$ yields slightly higher profits than $\gamma_t = 1/t^2$, the difference is quite small. By setting the initial clickthru probability of each keyword to 1, we limit the number of keywords chosen in each time period: initially we select a small subset of keywords and gradually enlarge the set as we observe more data on impressions and clicks. The opposite extreme of this approach is to set the initial clickthru probability of each keyword to 0, implying that we select all keywords in the first time period. Although this setting yields the highest average profit, as seen from the top line in Figure 7, we do not have a formal proof of convergence under this setting.

Figure 7: Impact of different randomization and initialization choices on the average profit up to time $t$ (LARGE experiment).

G. PROOF OF THEOREM 2

The proof of Theorem 2 depends on a series of lemmas. The first lemma establishes an upper bound on the probability that $\ell_t > I_U$. The proof appears in Appendix G.1.

Lemma 8. For any $t \ge 1$, $P\{\ell_t > I_U\} \le 1/k^{(1-\alpha)/3}$.

The next lemma establishes a lower bound on the average total profit. The proof of this result appears in Appendix G.2.

Lemma 9.
For any $T \ge 1$,

$$\frac{\sum_{t=1}^T E[Z_{G_t}]}{T Z^*} \;\ge\; \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}-\frac{3}{k^{(1-\alpha)/3}}\Big) - \frac{\sum_{t=1}^T\gamma_t}{T} - \frac{\rho}{T}\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\mu\sum_{i=1}^{I_U} c_i\lambda_i\, E\Big[\sum_{t:\, g_t\cdot 1(F_t=1;\, \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big].$$

We should note that if $J_T=\{t\le T: F_t=1,\ \ell_t\le I_U\}$, then since the keywords are indexed starting from 1, we have that

$$\sum_{t\in J_T:\, g_t\ge i}\big|p_i-\hat p_i^{t-1}\big| = \sum_{t:\, g_t\cdot 1(F_t=1;\, \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|,$$

which is consistent with the notation in Section 4.3.

The next result establishes an upper bound on the accumulated difference between the empirical estimate $\hat p_i^{t-1}$ of the clickthru probability and its true value $p_i$, corresponding to the second term in the above lemma. The proof appears in Appendix G.3.

Lemma 10. For any $T \ge 1$ and $0 < \delta < 1$,

$$\mu\sum_{i=1}^{I_U} c_i\lambda_i\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\, \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big] \;\le\; \frac{T}{k^{(1-\alpha)/3}} + \frac{8\mu I_U}{\delta^2 k\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2} + \frac{6 I_U^2\sum_{i=1}^{I_U}\lambda_i\mu/\big(1-e^{-\lambda_i\mu}\big)}{k^{(7-4\alpha)/3}(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)}.$$

Finally, here is the proof of Theorem 2.

Proof. It follows from Lemmas 9 and 10 that, for any $0<\delta<1$,

\begin{align*}
\frac{\sum_{t=1}^T E[Z_{G_t}]}{T Z^*} &\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}-\frac{3}{k^{(1-\alpha)/3}}\Big) - \frac{\sum_{t=1}^T\gamma_t}{T} - \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\frac{1}{k^{(1-\alpha)/3}}\\
&\quad - \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\frac{1}{T}\left(\frac{8\mu I_U}{\delta^2 k\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2} + \frac{6 I_U^2\sum_{i=1}^{I_U}\lambda_i\mu/(1-e^{-\lambda_i\mu})}{k^{(7-4\alpha)/3}(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)}\right).
\end{align*}

Setting $\delta=1/2$, we get that

$$\frac{\sum_{t=1}^T E[Z_{G_t}]}{T Z^*} \ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}-\frac{4}{k^{(1-\alpha)/3}}\Big) - \frac{\sum_{t=1}^T\gamma_t}{T} - \frac{M}{T},$$

where

$$M = \frac{32\mu I_U}{k\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)} + \frac{12 I_U^2\sum_{i=1}^{I_U}\lambda_i\mu/(1-e^{-\lambda_i\mu})}{k^{(7-4\alpha)/3}} \;\le\; \frac{32\mu I_U}{k\rho\big(1-1/\sqrt[3]{k}\big)} + \frac{12 I_U^2\sum_{i=1}^{I_U}\lambda_i\mu/(1-e^{-\lambda_i\mu})}{k},$$

where the inequality follows from the fact that $0\le\alpha<1$, which implies that $1-1/k^{(1+2\alpha)/3}\ge 1-1/\sqrt[3]{k}$ and $k^{(7-4\alpha)/3}\ge k$. This is the desired result.

G.1 Proof of Lemma 8

Proof. For any keyword $u$, let $Y_u(t)$ denote the cumulative number of impressions that the ad associated with keyword $u$ has received up to the end of period $t$.
It follows from the definition of $\ell_t$ that

\begin{align*}
P\{\ell_t > I_U\}
&= P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\hat p_u^{t-1} \le 1-\frac{1}{k}-\frac{2}{k^{(1-\alpha)/3}}\Big\}\\
&= P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(\hat p_u^{t-1}-p_u\big) \le 1-\frac{1}{k}-\frac{2}{k^{(1-\alpha)/3}} - \mu\sum_{u=1}^{I_U+1} c_u\lambda_u p_u\Big\}\\
&= P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(p_u-\hat p_u^{t-1}\big) \ge \mu\sum_{u=1}^{I_U+1} c_u\lambda_u p_u - \Big(1-\frac{1}{k}-\frac{2}{k^{(1-\alpha)/3}}\Big)\Big\}\\
&\le P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(p_u-\hat p_u^{t-1}\big) > \frac{1}{k^{(1-\alpha)/3}}\Big\}\\
&\le P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(p_u-\hat p_u^{t-1}\big)\,1\big(Y_u(t-1)\ge 1\big) > \frac{1}{k^{(1-\alpha)/3}}\Big\},
\end{align*}

where the first inequality follows from the definition of $I_U$, which implies that $1-1/k-1/k^{(1-\alpha)/3} < \mu\sum_{u=1}^{I_U+1} c_u\lambda_u p_u$, and therefore

$$\mu\sum_{u=1}^{I_U+1} c_u\lambda_u p_u - \Big(1-\frac{1}{k}-\frac{2}{k^{(1-\alpha)/3}}\Big) > \frac{1}{k^{(1-\alpha)/3}}.$$

The final inequality follows from the fact that for any keyword $u$ such that $Y_u(t-1)=0$, we have $\hat p_u^{t-1}=1$, which implies that $p_u-\hat p_u^{t-1}\le 0$. By Markov's inequality, we have

\begin{align*}
&P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(p_u-\hat p_u^{t-1}\big)\,1(Y_u(t-1)\ge 1) > \frac{1}{k^{(1-\alpha)/3}}\Big\}
\le \frac{E\Big[\Big(\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(p_u-\hat p_u^{t-1}\big)\,1(Y_u(t-1)\ge 1)\Big)^2\Big]}{\big(1/k^{(1-\alpha)/3}\big)^2}\\
&\qquad= k^{(2-2\alpha)/3}\sum_{u=1}^{I_U+1}(\mu c_u\lambda_u)^2\, E\Big[\big(p_u-\hat p_u^{t-1}\big)^2\,1\big(Y_u(t-1)\ge 1\big)\Big]\\
&\qquad\quad+ 2k^{(2-2\alpha)/3}\sum_{u<v}\mu^2 c_u c_v\lambda_u\lambda_v\, E\Big[\big(p_u-\hat p_u^{t-1}\big)\big(p_v-\hat p_v^{t-1}\big)\,1\big(Y_u(t-1)\ge 1,\, Y_v(t-1)\ge 1\big)\Big].
\end{align*}

Now, consider any $1\le u<v\le I_U+1$:

\begin{align*}
&E\Big[\big(p_u-\hat p_u^{t-1}\big)\big(p_v-\hat p_v^{t-1}\big)\,1\big(Y_u(t-1)\ge 1,\, Y_v(t-1)\ge 1\big)\Big]\\
&= \sum_{y_u\ge 1}\sum_{y_v\ge 1} P\{Y_u(t-1)=y_u,\, Y_v(t-1)=y_v\}\; E\Big[\Big(\frac{1}{y_u}\sum_{s=1}^{y_u}\big(X_u^s-p_u\big)\Big)\Big(\frac{1}{y_v}\sum_{s=1}^{y_v}\big(X_v^s-p_v\big)\Big)\,\Big|\, Y_u(t-1)=y_u,\, Y_v(t-1)=y_v\Big] = 0,
\end{align*}

where $X_u^s$ is an indicator random variable indicating whether or not the consumer clicks on the ad associated with keyword $u$ during the $s$th time that the ad receives an impression. The final equality follows from the fact that $X_u^s$ and $X_v^s$ are independent of $Y_u(t)$ and $Y_v(t)$; moreover, $X_u^s$ and $X_v^t$ are independent for any $s$ and $t$, and $E[X_u^s-p_u]=E[X_v^s-p_v]=0$ for all $s$.
In addition,

\begin{align*}
E\Big[\big(p_u-\hat p_u^{t-1}\big)^2\,1\big(Y_u(t-1)\ge 1\big)\Big]
&= \sum_{y_u\ge 1} P\{Y_u(t-1)=y_u\}\; E\Big[\Big(\frac{1}{y_u}\sum_{s=1}^{y_u}\big(X_u^s-p_u\big)\Big)^2\,\Big|\, Y_u(t-1)=y_u\Big]\\
&= \sum_{y_u\ge 1} P\{Y_u(t-1)=y_u\}\; E\Big[\frac{1}{y_u^2}\sum_{s=1}^{y_u}\big(X_u^s-p_u\big)^2\,\Big|\, Y_u(t-1)=y_u\Big]\\
&= \sum_{y_u\ge 1} P\{Y_u(t-1)=y_u\}\,\frac{p_u(1-p_u)}{y_u} \;\le\; p_u,
\end{align*}

where the second equality follows from the fact that the $X_u^s$'s are independent and identically distributed Bernoulli random variables with parameter $p_u$ that are independent of $Y_u(t)$, with $E[X_u^s-p_u]=0$ for all $s$, so the cross terms vanish; the third equality follows from the fact that $X_u^s$ is a Bernoulli random variable with parameter $p_u$. Putting everything together, we have that

\begin{align*}
P\{\ell_t > I_U\}
&\le P\Big\{\mu\sum_{u=1}^{I_U+1} c_u\lambda_u\big(p_u-\hat p_u^{t-1}\big)\,1(Y_u(t-1)\ge 1) > \frac{1}{k^{(1-\alpha)/3}}\Big\}\\
&\le k^{(2-2\alpha)/3}\sum_{u=1}^{I_U+1}(\mu c_u\lambda_u)^2 p_u\\
&\le k^{(2-2\alpha)/3}\,\frac{1}{k^{1-\alpha}}\sum_{u=1}^{I_U+1}\mu c_u\lambda_u p_u
= \frac{1}{k^{(1-\alpha)/3}}\,\mu\sum_{u=1}^{I_U+1} c_u\lambda_u p_u
\le \frac{1}{k^{(1-\alpha)/3}},
\end{align*}

where the third inequality follows from Assumption 1(c), which implies that, for any keyword $i$, $\mu c_i\lambda_i\le 1/k^{1-\alpha}$, and the final inequality follows from the definition of $I_U$, which implies that

$$\mu\sum_{u=1}^{I_U+1} c_u\lambda_u p_u \le 1-\frac{1}{k}-\frac{1}{k^{(1-\alpha)/3}}+\frac{1}{k^{1-\alpha}} \le 1.$$

G.2 Proof of Lemma 9

Proof. Recall that, for any $t\ge 1$, we have $G_t=\{1,\dots,g_t\}$. Let $S^t$ denote the total number of search queries in period $t$, and let $B_r^{g_t}$ denote the remaining account balance in period $t$ just before the arrival of the $r$th search query in that period, assuming that we have bid on the set of keywords $\{1,2,\dots,g_t\}$ in period $t$. Also, let the random variable $Q_r^t$ denote the keyword associated with the $r$th search query in period $t$, and let $X_{ri}^t$ denote a Bernoulli random variable indicating whether or not the consumer clicks on the ad associated with keyword $i$ upon the arrival of the $r$th search query in period $t$.
Then, we have that

\begin{align*}
\sum_{t=1}^T E[Z_{G_t}]
&= \sum_{t=1}^T E\Big[\sum_{r=1}^{S^t}\sum_{i=1}^{g_t}\pi_i\, 1\big(Q_r^t=i,\ B_r^{g_t}\ge c_i,\ X_{ri}^t=1\big)\Big]\\
&= \sum_{t=1}^T E\Big[\sum_{i=1}^{g_t}\sum_{r=1}^{\infty}\pi_i\, 1\big(Q_r^t=i,\ B_r^{g_t}\ge c_i,\ X_{ri}^t=1,\ S^t\ge r\big)\Big]\\
&= \sum_{t=1}^T E\Big[\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\sum_{r=1}^{\infty}1\big(B_r^{g_t}\ge c_i,\ S^t\ge r\big)\Big],
\end{align*}

where the final equality follows from Lemma 1. Recall that $F_t\in\{0,1\}$ denotes a binary random variable indicating whether or not we choose a decision based on the prefix ordering in period $t$: $F_t=1$ implies that $g_t$ equals $\ell_t$ (the decision based on the prefix ordering), and when $F_t=0$, $g_t$ is a random decision. Note that, by definition, $F_t=1$ with probability $1-\gamma_t$. Then, we have that

\begin{align*}
\sum_{t=1}^T E[Z_{G_t}]
&= \sum_{t=1}^T E\Big[\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\sum_{r=1}^{\infty}1\big(B_r^{g_t}\ge c_i,\ S^t\ge r\big)\Big]\\
&\ge \sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\sum_{r=1}^{\infty}1\big(B_r^{g_t}\ge c_i,\ S^t\ge r\big)\Big]\\
&= \sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot E\Big[\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\sum_{r=1}^{\infty}1\big(B_r^{g_t}\ge c_i,\ S^t\ge r\big)\,\Big|\,\ell_t, F_t\Big]\Big]\\
&\ge \sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot E\Big[\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\sum_{r=1}^{\lfloor\mu\rfloor}1\big(B_r^{g_t}\ge c_i,\ S^t\ge r\big)\,\Big|\,\ell_t, F_t\Big]\Big]\\
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot\mu\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\Big],
\end{align*}

where the final inequality follows from the fact that when $F_t=1$ and $\ell_t\le I_U$, we have $g_t=\ell_t$, which implies that

\begin{align*}
E\Big[\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\sum_{r=1}^{\lfloor\mu\rfloor}1\big(B_r^{g_t}\ge c_i,\ S^t\ge r\big)\,\Big|\,\ell_t, F_t\Big]
&= \sum_{i=1}^{\ell_t}\pi_i\lambda_i p_i\sum_{r=1}^{\lfloor\mu\rfloor} P\big\{B_r^{\ell_t}\ge c_i\,\big|\,\ell_t, F_t, S^t\ge r\big\}\,P\big\{S^t\ge r\,\big|\,\ell_t, F_t\big\}\\
&\ge \Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\sum_{i=1}^{\ell_t}\pi_i\lambda_i p_i\sum_{r=1}^{\lfloor\mu\rfloor} P\big\{S^t\ge r\big\}\\
&= \rho\,\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\,\mu\sum_{i=1}^{\ell_t}\pi_i\lambda_i p_i,
\end{align*}

where the inequality follows from Lemma 5 and the fact that, conditioned on $\ell_t$ and $S^t\ge r$, $B_r^{\ell_t}$ is
independent of $F_t$. The penultimate equality follows from the fact that $S^t$ is independent of $\ell_t$ and $F_t$, and the final equality follows from the observation that

$$\sum_{r=1}^{\lfloor\mu\rfloor} P\{S^t\ge r\} = E\big[\min\{S^t,\mu\}\big] = \rho\mu.$$

The above chain of inequalities thus gives

$$\sum_{t=1}^T E[Z_{G_t}] \ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot\mu\sum_{i=1}^{g_t}\pi_i\lambda_i p_i\Big] \ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot\mu\sum_{i=1}^{g_t} c_i\lambda_i p_i\Big],$$

where the final inequality follows from the fact that if $F_t=1$ and $\ell_t\le I_U$, then with probability 1, $g_t=\ell_t$, and by the definition of $I_U$,

$$\mu\sum_{i=1}^{g_t} c_i\lambda_i p_i = \mu\sum_{i=1}^{\ell_t} c_i\lambda_i p_i \le 1-\frac{1}{k}-\frac{1}{k^{(1-\alpha)/3}},$$

and it follows from Lemma 4 that $\sum_{i=1}^{\ell_t}\pi_i\lambda_i p_i \ge Z^*\sum_{i=1}^{\ell_t} c_i\lambda_i p_i$. Therefore,

$$1(F_t=1;\ \ell_t\le I_U)\cdot\mu\sum_{i=1}^{g_t}\pi_i\lambda_i p_i \ge Z^*\,1(F_t=1;\ \ell_t\le I_U)\cdot\mu\sum_{i=1}^{g_t} c_i\lambda_i p_i.$$

Let $C^* = 1-1/k-2/k^{(1-\alpha)/3}-1/k^{1-\alpha}$. Note that by the definition of $\ell_t$, we have

$$\mu\sum_{i=1}^{\ell_t} c_i\lambda_i\hat p_i^{t-1} \le 1-\frac{1}{k}-\frac{2}{k^{(1-\alpha)/3}} < \mu\sum_{i=1}^{\ell_t+1} c_i\lambda_i\hat p_i^{t-1},$$

which implies that $\mu\sum_{i=1}^{\ell_t} c_i\lambda_i\hat p_i^{t-1} \ge C^*$, since it follows from Assumption 1(c) that $c_i\lambda_i\mu\le 1/k^{1-\alpha}$ for all $i$. Then, we have the following inequalities:

\begin{align*}
\sum_{t=1}^T E[Z_{G_t}]
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\cdot\mu\sum_{i=1}^{g_t} c_i\lambda_i p_i\Big]\\
&= \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\Big\{\mu\sum_{i=1}^{g_t} c_i\lambda_i\hat p_i^{t-1} + \mu\sum_{i=1}^{g_t} c_i\lambda_i\big(p_i-\hat p_i^{t-1}\big)\Big\}\Big]\\
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\Big\{C^* + \mu\sum_{i=1}^{g_t} c_i\lambda_i\big(p_i-\hat p_i^{t-1}\big)\Big\}\Big],
\end{align*}

where the last inequality follows from the fact that when $F_t=1$, $g_t=\ell_t$.
Therefore,

\begin{align*}
\sum_{t=1}^T E[Z_{G_t}]
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^* C^*\sum_{t=1}^T E\big[1(F_t=1;\ \ell_t\le I_U)\big]\\
&\quad - \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\sum_{t=1}^T E\Big[1(F_t=1;\ \ell_t\le I_U)\sum_{i=1}^{g_t} c_i\lambda_i\mu\,\big|p_i-\hat p_i^{t-1}\big|\Big]\\
&= \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^* C^*\sum_{t=1}^T P\{F_t=1\}\,P\{\ell_t\le I_U\}\\
&\quad - \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\, E\Big[\sum_{t=1}^T\sum_{i=1}^{g_t\cdot 1(F_t=1;\ \ell_t\le I_U)} c_i\lambda_i\mu\,\big|p_i-\hat p_i^{t-1}\big|\Big]\\
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^* C^*\Big(1-\frac{1}{k^{(1-\alpha)/3}}\Big)\sum_{t=1}^T (1-\gamma_t)\\
&\quad - \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) Z^*\, E\Big[\sum_{i=1}^{\max_t\{g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\}} c_i\lambda_i\mu\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big],
\end{align*}

where the equality follows from the fact that $\ell_t$ and $F_t$ are independent, and the final inequality follows from Lemma 8. Note that we adopt the convention $\sum_{i=1}^{0} c_i\lambda_i\mu\big(p_i-\hat p_i^{t-1}\big)=0$.

Note that when $F_t=1$, $g_t=\ell_t$, so on the event above $g_t\le I_U$ and hence

$$E\Big[\sum_{i=1}^{\max_t\{g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\}} c_i\lambda_i\mu\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big] \le \mu\sum_{i=1}^{I_U} c_i\lambda_i\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big].$$

Putting everything together, we have that

\begin{align*}
\frac{\sum_{t=1}^T E[Z_{G_t}]}{T Z^*}
&\ge \rho C^*\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big(1-\frac{1}{k^{(1-\alpha)/3}}\Big)\frac{1}{T}\sum_{t=1}^T(1-\gamma_t)\\
&\quad - \frac{\rho}{T}\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\mu\sum_{i=1}^{I_U} c_i\lambda_i\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big]\\
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}-\frac{3}{k^{(1-\alpha)/3}}\Big)\frac{1}{T}\sum_{t=1}^T(1-\gamma_t)\\
&\quad - \frac{\rho}{T}\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\mu\sum_{i=1}^{I_U} c_i\lambda_i\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big]\\
&\ge \rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}-\frac{3}{k^{(1-\alpha)/3}}\Big) - \frac{\sum_{t=1}^T\gamma_t}{T} - \frac{\rho}{T}\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\mu\sum_{i=1}^{I_U} c_i\lambda_i\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big],
\end{align*}

which is the desired result.
Note that the second inequality follows from the fact that

$$C^*\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big(1-\frac{1}{k^{(1-\alpha)/3}}\Big) \ge 1-\frac{1}{k^{(1+2\alpha)/3}}-\frac{3}{k^{(1-\alpha)/3}}.$$

To see this, write $A=1/k^{(1+2\alpha)/3}$, $B=1/k^{(1-\alpha)/3}$, and $C=1/k^{1-\alpha}$, so that $C^*=1-1/k-2B-C$ and the left-hand side equals $(1-A)(1-B)\big(1-\tfrac1k-2B-C\big)$. Expanding,

$$(1-A)(1-B)\Big(1-\frac1k-2B-C\Big) = 1-A-3B-\frac1k-C+\frac{A}{k}+3AB+AC+\frac{B}{k}+2B^2+BC-\frac{AB}{k}-2AB^2-ABC.$$

Since $0\le\alpha<1$, we have $AB^2=1/k$ and $AB\ge 1/k$, so $3AB-\frac1k-2AB^2\ge 0$; also $C\le B^2$ (because $1-\alpha\ge(2-2\alpha)/3$), so $2B^2-C\ge B^2\ge 0$; and $\frac{B}{k}\ge\frac{AB}{k}$ and $BC\ge ABC$ because $A\le 1$. Hence all terms beyond $1-A-3B$ sum to a nonnegative quantity, which proves the inequality.

G.3 Proof of Lemma 10

The proof of Lemma 10 makes use of the following results. The first result provides an upper bound on the expected deviation of the sample average of Bernoulli random variables. Since the result follows from a standard application of the Chernoff bound (cf. [20]), we omit the proof.

Lemma 11. Let $X_1, X_2,\dots,X_n$ be independent and identically distributed Bernoulli random variables with parameter $p$, i.e., for all $i$, $P\{X_i=1\}=1-P\{X_i=0\}=p$. Then, for any $0<\epsilon<1$,

$$E\Big|\,p-\frac{1}{n}\sum_{i=1}^n X_i\,\Big| \le \epsilon + 2e^{-n\epsilon^2/3}.$$

The next result is a restatement of Bernstein's inequality for real-valued random variables. For a proof, the reader is referred to Appendix B in [22].

Theorem 3. Let $Y_1,\dots,Y_n$ be independent nonnegative random variables with $0\le Y_i\le M$ for all $i$. For any $i$, let $\sigma_i^2=\mathrm{Var}(Y_i)$. Let $Y=\sum_{i=1}^n Y_i$ and $\mu=E[Y]$. If $V\ge\sum_{i=1}^n\sigma_i^2$, then for any $0<\delta<1$,

$$P\{Y\le(1-\delta)\mu\} \le \exp\left\{-\frac{\frac12\delta^2\mu^2}{V+\frac13 M\delta\mu}\right\}.$$

Using the fact that, if $a$, $b$, and $c$ are nonnegative, the function $x\mapsto (ax^2)/(b+cx)$ is increasing in $x$ for $x\ge 0$, the corollary below follows immediately from the theorem.

Corollary 1. Let $Y_1,\dots,Y_n$ be independent nonnegative random variables with $0\le Y_i\le M$ for all $i$. For any $i$, let $\sigma_i^2=\mathrm{Var}(Y_i)$.
Let $Y=\sum_{i=1}^n Y_i$ and $\mu=E[Y]$. If $V\ge\sum_{i=1}^n\sigma_i^2$, then for any $0<\delta<1$ and $0\le\eta\le\mu$,

$$P\{Y\le(1-\delta)\eta\} \le \exp\left\{-\frac{\frac12\delta^2\eta^2}{V+\frac13 M\delta\eta}\right\}.$$

The next lemma establishes an upper bound on the expected difference between $p_i$ and $\hat p_i^{t-1}$.

Lemma 12. For any keyword $i$ and $0<\epsilon,\delta<1$,

$$E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big] \le \epsilon T + \frac{1}{1-\exp\big\{-\frac14\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2\big\}} + \frac{2}{1-\exp\big\{-\frac13\epsilon^2(1-\delta)\lambda_i\mu\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\big\}}.$$

Proof. Let $0<\epsilon,\delta<1$ be given. Note that when $F_t=1$, we have $g_t=\ell_t$; therefore, it suffices to consider only keywords $i$ such that $i\le I_U$. So, let any $i\le I_U$ be given. To prove the desired result, it suffices to show that the upper bound holds for any arbitrary set of periods $1\le t_1<t_2<\cdots<t_w\le T$ such that

$$g_{t_h}\cdot 1\big(F_{t_h}=1,\ \ell_{t_h}\le I_U\big)\ge i,\qquad\forall\, h=1,\dots,w.$$

Note that during the periods $t_1,\dots,t_w$, we selected keywords based on the prefix ordering (i.e., $F_{t_h}=1$ and $g_{t_h}=\ell_{t_h}$ for all $h$), and the largest index $g_{t_h}$ includes $i$ and is at most $I_U$. We will show that, for any $t_1,\dots,t_w$ satisfying the above inequality, the conditional expectation

$$E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\ \Big|\ (t_1,\ell_{t_1},F_{t_1}),\dots,(t_w,\ell_{t_w},F_{t_w})\Big]$$

is bounded above by the expression given in Lemma 12. For ease of exposition, we will drop the reference to the conditioning information $(t_1,\ell_{t_1},F_{t_1}),\dots,(t_w,\ell_{t_w},F_{t_w})$ when it is clear from the context. Recall that $Y_i(t)$ denotes the cumulative number of impressions that the ad associated with keyword $i$ has received by the end of period $t$. Conditioned on $(t_1,\ell_{t_1},F_{t_1}),\dots,(t_w,\ell_{t_w},F_{t_w})$, we have that

$$E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big] = \sum_{h=1}^w E\big|\hat p_i^{t_h-1}-p_i\big|.$$
Decomposing each term according to whether the ad has received enough impressions, we obtain

\begin{align*}
\sum_{h=1}^w E\big|\hat p_i^{t_h-1}-p_i\big|
&= \sum_{h=1}^w E\Big[\big|\hat p_i^{t_h-1}-p_i\big|\cdot 1\Big(Y_i(t_h-1)\le(1-\delta)(h-1)\lambda_i\rho\mu\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big)\Big]\\
&\quad+\sum_{h=1}^w E\Big[\big|\hat p_i^{t_h-1}-p_i\big|\cdot 1\Big(Y_i(t_h-1)>(1-\delta)(h-1)\lambda_i\rho\mu\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big)\Big]\\
&\le \sum_{h=1}^w P\Big\{Y_i(t_h-1)\le(1-\delta)(h-1)\lambda_i\rho\mu\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\} + \sum_{h=1}^w\Big(\epsilon + 2\exp\Big\{-\frac13\epsilon^2(1-\delta)(h-1)\lambda_i\rho\mu\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\}\Big)\\
&\le \sum_{h=1}^w P\Big\{Y_i(t_h-1)\le(1-\delta)(h-1)\lambda_i\rho\mu\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\} + \epsilon T + \frac{2}{1-\exp\big\{-\frac13\epsilon^2(1-\delta)\lambda_i\mu\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\big\}},\qquad(2)
\end{align*}

where the first inequality follows from $|\hat p_i^{t_h-1}-p_i|\le 1$ together with Lemma 11, and the final inequality follows from the formula for the geometric series. Thus, to establish the desired result, it suffices to show that

$$\sum_{h=1}^w P\Big\{Y_i(t_h-1)\le(1-\delta)(h-1)\lambda_i\rho\mu\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\} \le \frac{1}{1-\exp\big\{-\frac14\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2\big\}}.$$

Recall that $B_r^{G_{t_h}}$ denotes the remaining account balance in period $t_h$ just before the arrival of the $r$th search query in that period, assuming that we bid on the keywords in $G_{t_h}=\{1,\dots,g_{t_h}\}$, and $Q_r^{t_h}$ denotes the keyword associated with the $r$th search query in period $t_h$. For any $1\le h\le w$, let $U_h$ be defined by

$$U_h \equiv \sum_{r=1}^{\min\{S^{t_h},\lfloor\mu\rfloor\}} 1\Big(Q_r^{t_h}=i,\ B_r^{G_{t_h}}\ge c_i\Big) = \sum_{r=1}^{\infty} 1\Big(Q_r^{t_h}=i,\ B_r^{G_{t_h}}\ge c_i,\ \min\{S^{t_h},\lfloor\mu\rfloor\}\ge r\Big) = \sum_{r=1}^{\lfloor\mu\rfloor} 1\Big(Q_r^{t_h}=i,\ B_r^{G_{t_h}}\ge c_i,\ S^{t_h}\ge r\Big).$$

The random variable $U_h$ provides a lower bound on the number of impressions that keyword $i$ receives in period $t_h$. Thus, for any $1\le h\le w$,

$$\sum_{v=1}^{h-1} U_v \le Y_i(t_h-1).$$

Note that, conditioned on $(t_1,\ell_{t_1},F_{t_1}),\dots,(t_w,\ell_{t_w},F_{t_w})$, the random variables $U_1,U_2,\dots,U_w$ are independent, because the $S^t$'s are i.i.d. and the distribution of $U_h$ depends only on $G_{t_h}$, which is fixed given our conditioning information. We can thus apply the result of Corollary 1 to the random variable $\sum_{v=1}^{h-1}U_v$. To do so, we need to determine its expectation and variance. Conditioned on $t_1,\dots,t_w$, we will show that for any $1\le h\le w$,

$$E\Big[\sum_{v=1}^{h-1}U_v\Big] \ge (h-1)\lambda_i\mu\rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\qquad\text{and}\qquad \sum_{v=1}^{h-1}\mathrm{Var}[U_v]\le(h-1)\lambda_i\mu^2.$$
To prove this result, note that for any $1\le h\le w$,

$$E[U_h] = \sum_{r=1}^{\lfloor\mu\rfloor}\lambda_i\, P\Big\{B_r^{G_{t_h}}\ge c_i,\ S^{t_h}\ge r\Big\} \ge \lambda_i\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\sum_{r=1}^{\lfloor\mu\rfloor}P\big\{S^{t_h}\ge r\big\} = \lambda_i\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big) E\big[\min\{S^{t_h},\mu\}\big] = \lambda_i\mu\rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big),$$

where the inequality follows from Lemma 5 and the fact that $G_{t_h}=\{1,2,\dots,g_{t_h}\}$ with $g_{t_h}\le I_U$ for all $1\le h\le w$. Note that since $g_{t_h}\le I_U$ for all $h$, it follows from the definition of $I_U$ that

$$\mu\sum_{u=1}^{g_{t_h}} c_u\lambda_u p_u \le 1-\frac{1}{k}-\frac{1}{k^{(1-\alpha)/3}},$$

which is the required hypothesis for the application of Lemma 5. The above result implies that for any $1\le h\le w$,

$$E\Big[\sum_{v=1}^{h-1}U_v\Big]\ge(h-1)\lambda_i\mu\rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big).$$

Moreover, conditioned on $(t_1,\ell_{t_1},F_{t_1}),\dots,(t_w,\ell_{t_w},F_{t_w})$, for any $1\le h\le w$,

\begin{align*}
\mathrm{Var}[U_h] \le E[U_h^2]
&= E\bigg[\bigg(\sum_{r=1}^{\min\{S^{t_h},\lfloor\mu\rfloor\}} 1\Big(Q_r^{t_h}=i,\ B_r^{G_{t_h}}\ge c_i\Big)\bigg)^2\bigg]
\le E\bigg[\bigg(\sum_{r=1}^{\lfloor\mu\rfloor}1\big(Q_r^{t_h}=i\big)\bigg)^2\bigg]\\
&= \sum_{r=1}^{\lfloor\mu\rfloor}E\big[1\big(Q_r^{t_h}=i\big)\big] + \sum_{1\le q,r\le\lfloor\mu\rfloor:\, q\ne r} E\big[1\big(Q_r^{t_h}=i,\ Q_q^{t_h}=i\big)\big]
= \lambda_i\lfloor\mu\rfloor + \lambda_i^2\lfloor\mu\rfloor(\lfloor\mu\rfloor-1) \le \lambda_i\mu^2,
\end{align*}

where the last equality follows from the fact that, for $r\ne q$, the arrivals of the $r$th and $q$th queries are independent of each other, and the final inequality follows from our assumption that $\mu>1$. The above result implies that $\sum_{v=1}^{h-1}\mathrm{Var}[U_v]\le(h-1)\lambda_i\mu^2$.

Note that $0\le U_h\le\mu$ for all $h$ and $i$. Therefore, by applying Corollary 1 with $\eta=(h-1)\lambda_i\mu\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)$, $V=(h-1)\lambda_i\mu^2$, and $M=\mu$, we can conclude that for any $1\le h\le w$,

\begin{align*}
P\Big\{Y_i(t_h-1)\le(1-\delta)(h-1)\lambda_i\mu\rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\}
&\le P\Big\{\sum_{v=1}^{h-1}U_v\le(1-\delta)(h-1)\lambda_i\mu\rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\}\\
&\le \exp\left\{-\frac{\frac12\delta^2(h-1)^2\lambda_i^2\mu^2\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2}{(h-1)\lambda_i\mu^2+\frac13\mu\delta(h-1)\lambda_i\mu\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)}\right\}\\
&= \exp\left\{-\frac{\frac12\delta^2(h-1)\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2}{1+\frac13\delta\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)}\right\}\\
&\le \exp\Big\{-\frac14\delta^2(h-1)\lambda_i\rho^2\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)^2\Big\},
\end{align*}

where the last inequality follows from the fact that $1+\frac13\delta\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\le\frac43\le 2$. Hence,

$$\sum_{h=1}^w P\Big\{Y_i(t_h-1)\le(1-\delta)(h-1)\lambda_i\mu\rho\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)\Big\} \le \sum_{h=1}^w \exp\Big\{-\frac14\delta^2(h-1)\lambda_i\rho^2\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)^2\Big\}.$$
Moreover,

$$\sum_{h=1}^w \exp\Big\{-\frac14\delta^2(h-1)\lambda_i\rho^2\Big(1-\frac{1}{k^{(1+2\alpha)/3}}\Big)^2\Big\} \le \frac{1}{1-\exp\big\{-\frac14\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2\big\}},$$

where the inequality follows from the standard bound for the geometric series. Since $(t_1,\ell_{t_1},F_{t_1}),\dots,(t_w,\ell_{t_w},F_{t_w})$ are arbitrary, the desired result follows.

Finally, here is the proof of Lemma 10.

Proof. It follows from Lemma 12 that

\begin{align*}
\sum_{i=1}^{I_U} c_i\lambda_i\mu\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big]
&\le \epsilon T\sum_{i=1}^{I_U} c_i\lambda_i\mu + \sum_{i=1}^{I_U}\frac{c_i\lambda_i\mu}{1-\exp\big\{-\frac14\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2\big\}} + \sum_{i=1}^{I_U}\frac{2c_i\lambda_i\mu}{1-\exp\big\{-\frac13\epsilon^2(1-\delta)\lambda_i\mu\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\big\}}\\
&\le \epsilon T\sum_{i=1}^{I_U} c_i\lambda_i\mu + \sum_{i=1}^{I_U}\frac{8 c_i\lambda_i\mu}{\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2} + \sum_{i=1}^{I_U}\frac{6 c_i\lambda_i\mu}{\epsilon^2(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\big(1-e^{-\lambda_i\mu}\big)},
\end{align*}

where the last inequality follows from the fact that for any $0\le x\le 1$ and any $d\ge 0$,

$$\frac{1}{1-e^{-x}}\le\frac{2}{x}\qquad\text{and}\qquad\frac{1}{1-e^{-dx}}\le\frac{1}{x\,(1-e^{-d})},$$

which implies that

$$\frac{1}{1-\exp\big\{-\frac14\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2\big\}} \le \frac{8}{\delta^2\lambda_i\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2}
\quad\text{and}\quad
\frac{1}{1-\exp\big\{-\frac13\epsilon^2(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\,\lambda_i\mu\big\}} \le \frac{3}{\epsilon^2(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)\big(1-e^{-\lambda_i\mu}\big)}.$$

Using our assumptions that $c_i\le 1/k$ and $\lambda_i\mu\le k^{\alpha}$ for all $i$, we can conclude that for any $0<\epsilon,\delta<1$,

$$\sum_{i=1}^{I_U} c_i\lambda_i\mu\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big] \le \frac{\epsilon T I_U}{k^{1-\alpha}} + \frac{8\mu I_U}{\delta^2 k\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2} + \frac{6}{\epsilon^2 k(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)}\sum_{i=1}^{I_U}\frac{\lambda_i\mu}{1-e^{-\lambda_i\mu}}.$$

Set $\epsilon=k^{(2-2\alpha)/3}/I_U$. By the definition of $I_U$, it is easy to verify that $\epsilon<1$. Then, we have that for any $0<\delta<1$,

$$\sum_{i=1}^{I_U} c_i\lambda_i\mu\, E\Big[\sum_{t\le T:\, g_t\cdot 1(F_t=1;\ \ell_t\le I_U)\ge i}\big|p_i-\hat p_i^{t-1}\big|\Big] \le \frac{T}{k^{(1-\alpha)/3}} + \frac{8\mu I_U}{\delta^2 k\rho^2\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)^2} + \frac{6 I_U^2\sum_{i=1}^{I_U}\lambda_i\mu/\big(1-e^{-\lambda_i\mu}\big)}{k^{(7-4\alpha)/3}(1-\delta)\rho\big(1-\frac{1}{k^{(1+2\alpha)/3}}\big)},$$

which gives the desired result.
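Lemma 11's expectation bound, which drives the estimation-error term above, can be sanity-checked numerically. The following sketch (the values $p=0.3$, $n=200$, and $\epsilon=0.1$ are arbitrary choices, not values from the paper) estimates $E\big|p-\frac1n\sum_i X_i\big|$ by Monte Carlo simulation and compares it with the bound $\epsilon+2e^{-n\epsilon^2/3}$:

```python
import math
import random

def mean_abs_deviation(p, n, trials=5000, seed=0):
    """Monte Carlo estimate of E|p - (1/n) sum X_i| for iid Bernoulli(p) X_i."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        total += abs(p - successes / n)
    return total / trials

p, n, eps = 0.3, 200, 0.1
lhs = mean_abs_deviation(p, n)                  # roughly sqrt(p(1-p)/n) in magnitude
rhs = eps + 2 * math.exp(-n * eps ** 2 / 3)     # Lemma 11's upper bound
```

For these parameters the empirical deviation is on the order of $\sqrt{p(1-p)/n}\approx 0.03$, comfortably below the bound; the bound becomes tight only when $\epsilon$ is taken small relative to $1/\sqrt{n}$.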