Fair Allocation with Succinct Representation

Azarakhsh Malekian (NWU)
Joint Work with
Saeed Alaei, Ravi Kumar, Erik Vee
UMD
Yahoo! Research
Online Advertising
[Figure: query=travel; sponsored links, 4 slots]
Main problem search engines face:
• Which ad to show for which query
• Subject to:
  • Maximizing revenue
  • Maximizing user satisfaction
Categories of Advertisers
• Non-Guaranteed Delivery (small advertiser)
  • Main purpose:
    • Selling your item
    • An action from the user
  • Allocation is not guaranteed
• Guaranteed Delivery (large advertiser)
  • Main purpose:
    • Brand recognition
  • Contracts
    • Ask for a minimum number of impressions: fixed price per item
    • Prepaid charge
Example request: "I want 10K impressions per day for August to users from California! Can we sign a contract?"
We focus on Guaranteed Delivery in this talk.
Introduction
• We have a set of advertisers and a set of impression types (buckets).
• Each advertiser:
  • Is interested only in impressions of certain types.
  • Requires a minimum number of impressions from its desired buckets.
• For each impression type:
  • Only a limited number of impressions is available.
• Furthermore:
  • Advertisers want the allocation to be as representative of the supply as possible.
• Due to the online nature of the problem and the huge size of the data, we seek a solution that:
  • Can be represented by a compact plan
  • Can be reconstructed efficiently in real time
Justifying Representativeness
• Each bucket specifies only some of the user attributes explicitly
  • The unspecified ones are subject to interpretation
  • Most often, advertisers are equally interested in all the users who belong to the bucket
  • Example: it is undesirable to assign only old men to a sports car dealer interested in men
• There can be a large number of attributes at different levels of granularity
  • The advertiser cannot fully specify the desired bucket to the finest conceivable detail
  • Example: toy store
Agenda
• Formal problem definition
• Our main results
  • Compact plan
  • Reconstructing the original solution in constant time
Problem Definition
J: set of contracts (advertisers)
I: set of impression types (buckets)
dj: total demand of contract j
wj: weight of contract j
si: total supply of impression type i
Goal: find an allocation that minimizes the distance from the ideal fair allocation.
• We use the L1 distance function
We are interested in a method that:
• Can compute the allocation efficiently
• Can store the allocation succinctly
[Figure: example instance with labels 20, 20, 20; dj = 30, wj = 1; Fair: 15, Fair: 10, Fair: 10; θij: 15]
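As a rough illustration of the objective, the sketch below computes a fair share for every (bucket, contract) pair and the weighted L1 distance of a candidate allocation from those shares. The slides do not spell out how the fair share is defined; splitting each contract's demand in proportion to bucket supply is an assumption here, and the names `fair_shares`, `l1_distance`, and `eligible` are hypothetical.

```python
def fair_shares(demand, supply, eligible):
    """Split each contract's demand over its eligible buckets in
    proportion to bucket supply (an ASSUMED notion of 'fair')."""
    shares = {}
    for j, d in demand.items():
        total = sum(supply[i] for i in eligible[j])
        for i in eligible[j]:
            shares[(i, j)] = d * supply[i] / total
    return shares


def l1_distance(alloc, shares, weight):
    """Weighted L1 distance between an allocation and the fair shares."""
    return sum(weight[j] * abs(alloc.get((i, j), 0.0) - f)
               for (i, j), f in shares.items())


# Toy instance (hypothetical numbers): three buckets, one contract.
supply = {"a": 20, "b": 20, "c": 20}
demand = {"j1": 30}
weight = {"j1": 1}
eligible = {"j1": ["a", "b", "c"]}

shares = fair_shares(demand, supply, eligible)
alloc = {("a", "j1"): 12, ("b", "j1"): 8, ("c", "j1"): 10}
dist = l1_distance(alloc, shares, weight)
```

With these numbers every fair share is 10, and the candidate allocation deviates by 2 in two buckets.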
Main Results
• An efficient combinatorial algorithm, based on min cost flow, for finding an allocation that minimizes the L1 distance
• A compact representation of the solution requiring only linear space in the number of impression types and advertisers (as opposed to quadratic)
• Reconstructing the allocation in constant time
• Robustness
• Experimental results
• Also:
  • We compute the approximation ratio of greedy
  • Experimental results
  • A combinatorial way of computing a succinct plan for the L2 distance function (based on the solution of Vee et al. [VVS10])
Formal Model (LP Formulation)
J: set of contracts
I: set of impression buckets
dj: total demand of contract j
wj: weight of contract j
si: total supply of impression bucket i
xij: the allocation
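One formulation consistent with the definitions above and with the L1 objective can be sketched as follows. Here x_{ij} is the allocation from bucket i to contract j and \theta_{ij} denotes the fair share of contract j in bucket i. The exact form of the constraints (equality on demand, inequality on supply) is an assumption, not taken from the slides.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{j \in J} w_j \sum_{i \in I} \lvert x_{ij} - \theta_{ij} \rvert \\
\text{s.t.} \quad & \sum_{i \in I} x_{ij} = d_j && \forall j \in J \\
& \sum_{j \in J} x_{ij} \le s_i && \forall i \in I \\
& x_{ij} \ge 0 && \forall i \in I,\ j \in J
\end{aligned}
```

The absolute values can be linearized in the usual way with one auxiliary variable per pair (i, j), which keeps the program an LP.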
Idea
• Consider the perfectly fair allocation (possibly infeasible)
• To make it feasible:
  • Reassign the overfilled portions of the contracts to other buckets with available capacity.
• If we remove xij units from bucket i for contract j, the objective increases by 2 wj xij
[Figure: worked example in which an overfull bucket should reassign 2 units; the affected edge labels change to 6-2 and 6+2]
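The factor of 2 in the cost of a reassignment can be checked with a few lines of arithmetic: moving units away from a fair allocation leaves one bucket under its fair share and another over it by the same amount. The weight and the helper name `l1_cost` below are hypothetical.

```python
def l1_cost(alloc, fair, w):
    """Weighted L1 distance of one contract's allocation from its fair shares."""
    return w * sum(abs(a - f) for a, f in zip(alloc, fair))


fair = [6, 6]   # fair shares of one contract in two buckets
w = 3           # contract weight (hypothetical)
moved = 2       # units reassigned from the first bucket to the second

before = l1_cost([6, 6], fair, w)
after = l1_cost([6 - moved, 6 + moved], fair, w)
increase = after - before   # equals 2 * w * moved
```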
Min Cost Flow Solution
• Theorem: The min cost flow in the network on the left is an optimal solution to the LP for the L1 distance function.
[Figure: flow network whose edges are labeled (capacity θij, cost 0), (capacity dj, cost 0), (capacity ∞, cost 2wj), (capacity si, cost 0)]
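A minimal sketch of such a network, assuming the layout source → contract → bucket → sink, with two parallel contract-to-bucket edges: one capped at the fair share with cost 0 and one uncapacitated with cost 2wj. The layout, the `MinCostFlow` class (plain successive shortest paths with Bellman-Ford, suitable only for small graphs), and all names are assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict

INF = float("inf")


class MinCostFlow:
    """Successive shortest paths; edges stored in forward/reverse pairs."""

    def __init__(self):
        self.adj = defaultdict(list)  # node -> indices into self.edges
        self.edges = []               # [to, residual capacity, cost]

    def add_edge(self, u, v, cap, cost):
        """Add u->v and return its index; index ^ 1 is the reverse edge."""
        self.adj[u].append(len(self.edges))
        self.edges.append([v, cap, cost])
        self.adj[v].append(len(self.edges))
        self.edges.append([u, 0, -cost])
        return len(self.edges) - 2

    def flow_on(self, e):
        return self.edges[e ^ 1][1]   # reverse residual = flow pushed

    def solve(self, s, t):
        flow = cost = 0
        while True:
            dist = {v: INF for v in self.adj}
            dist[s] = 0
            parent = {}
            changed = True
            while changed:            # Bellman-Ford on the residual graph
                changed = False
                for u in self.adj:
                    if dist[u] == INF:
                        continue
                    for e in self.adj[u]:
                        v, cap, c = self.edges[e]
                        if cap > 0 and dist[u] + c < dist[v]:
                            dist[v], parent[v] = dist[u] + c, e
                            changed = True
            if dist[t] == INF:
                return flow, cost
            push, v = INF, t          # bottleneck along the path
            while v != s:
                push = min(push, self.edges[parent[v]][1])
                v = self.edges[parent[v] ^ 1][0]
            v = t
            while v != s:             # augment
                e = parent[v]
                self.edges[e][1] -= push
                self.edges[e ^ 1][1] += push
                v = self.edges[e ^ 1][0]
            flow += push
            cost += push * dist[t]


def fair_l1_allocation(demand, weight, supply, theta):
    """theta[(i, j)] is the fair share of contract j in bucket i."""
    net = MinCostFlow()
    for j, d in demand.items():
        net.add_edge("src", ("contract", j), d, 0)
    for i, s in supply.items():
        net.add_edge(("bucket", i), "snk", s, 0)
    ids = {}
    for (i, j), th in theta.items():
        fair = net.add_edge(("contract", j), ("bucket", i), th, 0)
        over = net.add_edge(("contract", j), ("bucket", i), INF, 2 * weight[j])
        ids[(i, j)] = (fair, over)
    flow, cost = net.solve("src", "snk")
    alloc = {k: net.flow_on(f) + net.flow_on(o) for k, (f, o) in ids.items()}
    return flow, cost, alloc


# Hypothetical instance: bucket "a" is asked for 6 + 6 = 12 but supplies 10.
demand = {"j1": 6, "j2": 12}
weight = {"j1": 5, "j2": 1}
supply = {"a": 10, "b": 10}
theta = {("a", "j1"): 6, ("a", "j2"): 6, ("b", "j2"): 6}
flow, cost, alloc = fair_l1_allocation(demand, weight, supply, theta)
```

In this instance contract j2 moves 2 units from bucket "a" to bucket "b", for a cost of 2 · 1 · 2 = 4.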
Compact Plan?
• Min cost flow can be computed efficiently
  • But we still need to store the whole allocation
• The space required to store the allocation plan should be linear in the number of vertices.
• We should be able to reconstruct the flow along each edge in constant time.
Reconstruction (Primary Steps)
• Write the dual of the min cost flow LP
[Figure: the primal (min cost flow) LP, whose variables are the allocation, side by side with its dual and the dual variables]
Reconstruction
• Compute the dual variables of the min cost flow LP.
  • We only need O(|I|+|J|) space to store the dual variables (Zi and Yj).
• The allocation along any edge (primal) can be computed from the dual variables and complementary slackness, except for a few slack edges.
• For the slack edges, we show how to compute an extra variable for each vertex, called its height, which allows us to reconstruct the flow along any slack edge.
Reconstruction: Network Flow Solution
• Lemma:
  • Aij = max(0, Zi - Yj)
  • The value of x'ij in the primal is
    • 0 if Zi - Yj < 0
    • θij if Zi - Yj > 0
• Zi - Yj = 0: slack edges
  • Make a new instance of the max flow problem on this set of edges.
  • Every max flow in the new network has the same cost.
  • Find a height function for this network such that:
    Flow(i,j) = min(capacity, (h(i) - h(j)) · capacity)
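The lookup described by the lemma and the height rule can be sketched as a constant-time function over per-vertex values only. The function name, the callable `theta` for the fair-share capacity, and the sample numbers are hypothetical; clamping the slack-edge flow at 0 is an added assumption, and a real implementation would compare the duals with a tolerance rather than exact equality.

```python
def reconstruct_flow(i, j, Z, Y, h, theta):
    """Flow on edge (bucket i, contract j) from stored duals and heights."""
    diff = Z[i] - Y[j]
    cap = theta(i, j)          # fair-share capacity of the edge
    if diff < 0:
        return 0.0             # empty edge
    if diff > 0:
        return cap             # saturated edge
    # Slack edge: height rule; the clamp at 0 is an assumption.
    return max(0.0, min(cap, (h[i] - h[j]) * cap))


# Hypothetical stored plan: one dual per bucket, one per contract,
# and one height per vertex.
Z = {"a": 3.0, "b": 1.0, "c": 2.0}
Y = {"j": 2.0}
h = {"a": 1.0, "b": 0.0, "c": 0.75, "j": 0.5}


def theta(i, j):
    return 10.0                # same capacity on every edge, for brevity


flows = {i: reconstruct_flow(i, "j", Z, Y, h, theta) for i in Z}
```

Only O(|I|+|J|) numbers are stored, and each edge is answered with a handful of arithmetic operations.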
Storing the Solution
• Height-based maximum flow:
  • We find a height function h(v) that assigns a height to each vertex such that:
    Flow(i,j) = min(capacity, (h(i) - h(j)) · capacity)
  • We can approximate the above for any given ε in time polynomial in 1/ε
  • The obtained solution is robust
Summary
• Compact plan:
  • Write the primal and dual of the min cost flow LP
  • Make a network flow instance on the edges with Zi - Yj = 0
  • Compute the heights of the vertices of this flow instance
• Reconstruction:
  • For each edge:
    • If Zi - Yj ≠ 0, it is either full or empty, based on the sign
    • If Zi - Yj = 0, use the height function:
      Flow(i,j) = min(capacity, (h(i) - h(j)) · capacity)
Experimental Results
• Data set:
  • Actual impression buckets and advertiser contracts from Yahoo! display advertising
• The results for the largest graph:
  • Min cost flow is much faster than solving the LP: 178 seconds versus 4000 seconds
  • More than 99% of the edges are either empty or saturated in practice; as a result, we only need to handle the remaining small proportion using heights
Experimental Results
• Results on the rest of the data sets:
Related Work
• Vee et al.
  • Strictly convex version of the problem
    • Given an approximation of the online supply
  • Find a reconstructible plan for other norms
    • Using the KKT method
  • Focus on sampling aspects of the problem
• Ghosh et al.
  • Combined variant of guaranteed and non-guaranteed delivery
  • A randomized mechanism
Future Directions
• Adapting our solution to the highest degree norm and comparing the results
• Considering fair allocation from the mechanism design point of view
  • When advertisers are strategic