Minimizing Regret with Multiple Reserves
Joshua R. Wang (Stanford)
Tim Roughgarden (Stanford)
EC'16: Maastricht, The Netherlands
July 27th, 2016

Reserve Prices
• Seller has a single item and wants to maximize revenue.
• [Myerson '81]: if valuations are i.i.d. draws from a regular distribution, use a second-price auction with the reserve price set to the monopoly price.
• Bayesian optimization: given a prior distribution over bidder valuations, try to maximize expected revenue.

Reserve Price Problems
• Bayesian optimization: given a prior distribution over bidder valuations, maximize expected revenue.
• Batch learning: given m i.i.d. samples from an unknown distribution F over bidder valuations, maximize expected revenue w.r.t. F.
• Online no-regret learning: valuation profiles arrive one by one. At time t, choose an auction as a function of the previously seen profiles v^1, v^2, …, v^{t−1}. Maximize time-averaged revenue relative to the time-averaged revenue of the best auction in hindsight.
• Offline optimization: given m valuation profiles, maximize the average revenue across these profiles.

Non-Anonymous Reserve Prices
• Natural extension: for asymmetric bidders, use non-anonymous reserve prices.
• Real-life examples of discriminating between bidders include ad quality and the different opening bids in the FCC Incentive Auction [Cramton et al. '15].

Non-Anonymous Reserve Prices
• Natural extension: for asymmetric bidders, use non-anonymous reserve prices.
• We can again consider Bayesian optimization, batch learning, online no-regret learning, and offline optimization.
• [Hartline and Roughgarden '09] studied Bayesian optimization where bidders' valuations are independently but not identically distributed; setting each bidder's reserve to the monopoly price for her distribution yields near-optimal expected revenue in many scenarios.
• [Paes Leme et al. '16] showed that offline optimization is NP-hard.

Eager Versus Lazy Reserve Prices

  Bidder   Valuation   Reserve
  1        6           7
  2        4           6
  3        2           1

• Eager reserves: bidders 1 and 2 are removed for missing their reserves; bidder 3 wins and pays 1.
• Lazy reserves: bidder 1 has the highest valuation but does not meet her reserve, so there is no winner.
• We focus on eager reserve prices, which are superior from both a welfare and a revenue standpoint [Paes Leme et al. '16].

Offline Maximizing Multiple Reserves (MMR)
• Offline optimization problem.
• Input: m valuation profiles v^1, …, v^m.
• Output: a vector r of reserve prices that maximizes revenue across the profiles.
• Without loss of generality, each r_i can be set to some observed valuation v_i^j.
• Brute force: consider all m^n possible reserve vectors.

Offline Maximizing Multiple Reserves (MMR)
[Figure: two-bidder example of valuation profiles showing that raising one bidder's reserve gains revenue on some profiles (+2) and loses it on others (−2).]

Offline MMR Algorithm
• Let S_i denote the set of profiles in which bidder i has the highest valuation.
• Let v_2^j denote the second-highest valuation in profile v^j.
• For each bidder i = 1, 2, …, n: choose r_i to maximize Σ_{j∈S_i} q_j(r_i) · r_i, where q_j(r_i) = 1 if bidder i meets the reserve r_i in profile j and 0 otherwise.
• Return either r or the all-zero vector, whichever generates more revenue on the profiles.

Offline MMR Algorithm
• Theorem 1. The offline MMR algorithm is a 1/2-approximation.
• The algorithm also extends to matroid environments.

Offline MMR Algorithm Proof
• The total revenue OPT obtains is at most
  Σ_{j=1}^m v_2^j + Σ_{i=1}^n Σ_{j∈S_i} q_j(r_i*) · r_i*,
  where r* is an optimal reserve vector.
• Our computed reserves r achieve at least the second term; the all-zero vector achieves at least the first term. Since we return the better of the two, we earn at least half of OPT.
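For concreteness, here is a minimal Python sketch of the offline 1/2-approximation just described. It is an illustration rather than the authors' reference implementation: the function name offline_mmr is hypothetical, valuations are assumed to be given as a list of profiles (lists of numbers), and ties for the highest valuation are broken by bidder index.

```python
def offline_mmr(profiles):
    """1/2-approximation for offline MMR (sketch, single-item setting)."""
    n = len(profiles[0])

    # S[i] = profiles on which bidder i has the highest valuation
    # (ties broken by bidder index -- an assumption of this sketch).
    S = [[] for _ in range(n)]
    for v in profiles:
        winner = max(range(n), key=lambda i: v[i])
        S[winner].append(v)

    # For each bidder independently, pick the reserve (WLOG one of that
    # bidder's observed valuations) maximizing
    #   r_i * |{ j in S_i : bidder i meets r_i in profile j }|.
    r = [0.0] * n
    for i in range(n):
        best_rev, best_r = 0.0, 0.0
        for cand in {v[i] for v in profiles}:
            rev = cand * sum(1 for v in S[i] if v[i] >= cand)
            if rev > best_rev:
                best_rev, best_r = rev, cand
        r[i] = best_r

    def revenue(res):
        """Total revenue of the eager-reserve second-price auction."""
        total = 0.0
        for v in profiles:
            cleared = [i for i in range(n) if v[i] >= res[i]]
            if not cleared:
                continue
            w = max(cleared, key=lambda i: v[i])
            second = max((v[i] for i in cleared if i != w), default=0.0)
            total += max(res[w], second)
        return total

    # Return the better of the computed reserves and the all-zero vector.
    zero = [0.0] * n
    return r if revenue(r) >= revenue(zero) else zero
```

In this sketch each per-bidder maximization ranges over at most m candidate reserves and scans at most m profiles, so the whole computation takes O(n · m^2) time, versus the m^n reserve vectors of brute force.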
Offline MMR Lower Bound
• Theorem 2. The MMR problem is NP-hard to approximate within a 1 − ε factor, for some constant ε > 0.
• Proof idea: reduction from a variant of Set Cover. Each set is represented by a bidder; a low reserve corresponds to picking the set.
• Using a result of [Chlebik and Chlebikova '08], the offline MMR problem cannot be approximated to within a factor better than 884/885 (this holds even for just computing the optimal revenue).

Consequences for Batch MMR
• Batch learning problem: unknown distribution F; draw m i.i.d. samples; the algorithm uses the samples to maximize expected revenue on F.
• Computationally efficient batch learning essentially amounts to efficiently computing the empirical risk minimizer (ERM).
• ERM here is exactly the offline MMR problem, which we showed is APX-hard.
• We can instead run our 1/2-approximation algorithm on polynomially many samples to obtain roughly half of the maximum possible expected revenue.

Online Maximizing Multiple Reserves (MMR)
• Online optimization problem.
• In every round t, an adversary chooses a valuation profile v^t and the algorithm chooses a reserve vector r^t.
• After T rounds, the algorithm is compared against the best fixed reserve vector r in hindsight.
• The number of actions (reserve vectors) is exponential in n.
• We cannot simply apply black-box techniques such as [Kakade et al. '09] because MMR revenue is nonlinear.

α-Regret [Kakade et al. '09]
• The α-regret notion merges regret with α-approximation.
• The (time-averaged) α-regret of a sequence r^1, …, r^T with respect to valuation profiles v^1, …, v^T is
  α · max_r (1/T) Σ_{t=1}^T R(r, v^t) − (1/T) Σ_{t=1}^T R(r^t, v^t),
  where R(r, v) denotes the revenue of reserves r on profile v.
• The goal is still to drive this toward 0 as quickly as possible, for α as close to 1 as possible.

Online MMR Algorithm
• Based on Follow-the-Perturbed-Leader (FTPL) [Kalai and Vempala '03].
• Choose K = T and ε = √(log K / T).
• For each bidder i = 1, 2, …, n:
  • Let S_i denote the previous rounds in which bidder i had the highest valuation.
  • For each reserve price r = 1/K, 2/K, …, 1: draw X_{i,r} from the standard exponential distribution and let Y_{i,r} = ±X_{i,r}, with the sign chosen uniformly at random.
  • Choose r_i^t ∈ {1/K, 2/K, …, 1} to maximize Y_{i,r} + Σ_{j∈S_i} q_j(r) · r.
• Return either r^t or the all-zero vector, each with probability 1/2.

Online MMR Algorithm
• Theorem 3. The 1/2-regret of the online MMR algorithm is O(n √(log T / T)).

Online MMR Algorithm Proof
• Proof by picture:
  OPT ≤ Σ_{j=1}^T v_2^j + Σ_{i=1}^n Σ_{j∈S_i} q_j(r_i*) · r_i*,
  where the first term is covered by the all-zero reserves and the second by running FTPL separately for each bidder.
• The revenue decomposition lets us move from an exponentially large search space to linear-size search spaces (one per bidder).

Online MMR Lower Bound
• No α-regret: an algorithm has no α-regret if, for every fixed ε > 0, the number of rounds T needed to drive the α-regret down to ε is bounded by a polynomial in the number of bidders n.
• Theorem 4. For all constants α > 884/885, there is no polynomial-time online MMR algorithm with no α-regret, unless NP = RP.
• The argument is similar to [folklore] and [Daskalakis and Syrgkanis '16].

Open Questions
• Extend the algorithms beyond matroid environments?
• Consider other classes of auctions (e.g., further discretized virtual welfare maximizers).
• What about the improper learning problem? See [Devanur, Huang, and Psomas '16] for general revenue maximization with samples from an unknown distribution.
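Returning to the FTPL-based online algorithm of Theorem 3, the following is a minimal Python sketch of the per-bidder learner. It is a schematic under stated assumptions, not the authors' implementation: the class name OnlineMMR is hypothetical, valuations are assumed to lie in [0, 1], the perturbation scale √(T / log K) and the redrawing of the noise in every round are illustrative choices on my part, and K = T follows the slide above.

```python
import math
import random

class OnlineMMR:
    """Sketch of the FTPL-based online MMR learner (n bidders, horizon T)."""

    def __init__(self, n, T):
        self.n = n
        self.K = T                                # reserve grid 1/K, 2/K, ..., 1
        self.grid = [(k + 1) / self.K for k in range(self.K)]
        # Perturbation scale (illustrative assumption): roughly sqrt(T / log K),
        # so the noise is comparable to the cumulative per-bidder revenues.
        self.scale = math.sqrt(T / max(math.log(self.K), 1.0))
        # hist[i][k] = revenue the reserve grid[k] would have earned for bidder i
        # over past rounds in which bidder i had the highest valuation.
        self.hist = [[0.0] * self.K for _ in range(n)]

    def choose_reserves(self):
        # With probability 1/2, play the all-zero reserves (plain second-price).
        if random.random() < 0.5:
            return [0.0] * self.n
        r = []
        for i in range(self.n):
            best_score, best_price = -math.inf, 0.0
            for k, price in enumerate(self.grid):
                # Signed exponential perturbation (redrawn each round in this sketch).
                noise = self.scale * random.expovariate(1.0) * random.choice((-1.0, 1.0))
                score = noise + self.hist[i][k]
                if score > best_score:
                    best_score, best_price = score, price
            r.append(best_price)
        return r

    def observe(self, v):
        # Update only the round's highest-valuation bidder (the set S_i).
        winner = max(range(self.n), key=lambda i: v[i])
        for k, price in enumerate(self.grid):
            if v[winner] >= price:
                self.hist[winner][k] += price
```

Usage: each round, call choose_reserves(), run the eager-reserve second-price auction with the returned vector, and then feed the realized valuation profile back via observe(v).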