Efficient Vote Elicitation under Candidate Uncertainty

JOEL OREN, UNIVERSITY OF TORONTO
JOINT WORK WITH YUVAL FILMUS AND CRAIG BOUTILIER
Motivation: Hiring Committee
• A hiring committee expresses its preferences
over job applicants.
• Do we really need their rankings over the
entire set of applicants?
• Twist: Some candidates may become unavailable – they can accept offers from other places.
• Goal: gather enough information about the preferences before candidate unavailability is revealed, so that the right winner can be determined from that information once unavailability is revealed.
[Figure: candidates 𝑎1, 𝑎2, 𝑎3 and three voter rankings, 𝑎1 ≻ 𝑎2 ≻ 𝑎3, 𝑎3 ≻ 𝑎2 ≻ 𝑎1 and 𝑎2 ≻ 𝑎1 ≻ 𝑎3; the winner label shows 𝑎1 / 𝑎2.]
The Basic Setting:
Set 𝐶 = {𝑎1 , … , 𝑎𝑚 } of 𝑚 candidates (alternatives).
Set N = {1, … , 𝑛} of 𝑛 voters with preferences over the candidates:
𝜎1 , … , 𝜎𝑛 ∈ 𝐿𝑚 .
Voting rule: $r : L_m^n \to C$ aggregates the preferences to select a “winning” candidate.
[Figure: candidate set 𝑪 = {𝑎1, 𝑎2, 𝑎3}; in the ranking 𝑎1 ≻ 𝑎2 ≻ 𝑎3 the positions are labelled 2, 1, 0.]
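Score-based rules: each candidate $c \in C$ receives a score $sc(c)$. The winner is the candidate with maximal score.
◦ Plurality: $sc_P(\boldsymbol{\sigma}, c) = \#\{i \in N : \sigma_i(1) = c\}$, where $\sigma_i(1)$ is voter $i$'s top-ranked candidate.
◦ Borda: $sc_B(\boldsymbol{\sigma}, c) = \sum_{i \in N} \bigl(m - \sigma_i^{-1}(c)\bigr)$, where $\sigma_i^{-1}(c)$ is the position of $c$ in voter $i$'s ranking.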
Candidate uncertainty: each candidate is available with probability 𝑝.
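The two scoring rules can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, rankings are lists ordered from most to least preferred, and the `restrict` helper assumes that unavailable candidates are simply deleted from every ranking before scoring.

```python
from collections import Counter

def plurality_scores(profile, candidates):
    """sc_P(c): number of voters whose top-ranked candidate is c."""
    tops = Counter(ranking[0] for ranking in profile)
    return {c: tops.get(c, 0) for c in candidates}

def borda_scores(profile, candidates):
    """sc_B(c): sum over voters of m minus the 1-based position of c."""
    m = len(candidates)
    scores = {c: 0 for c in candidates}
    for ranking in profile:
        for position, c in enumerate(ranking, start=1):
            scores[c] += m - position
    return scores

def restrict(profile, available):
    """Assumption: unavailable candidates are dropped from each ranking."""
    return [[c for c in ranking if c in available] for ranking in profile]

candidates = ["a1", "a2", "a3"]
profile = [["a1", "a2", "a3"], ["a3", "a2", "a1"], ["a2", "a1", "a3"]]

full_plurality = plurality_scores(profile, candidates)   # every candidate gets 1
full_borda = borda_scores(profile, candidates)           # a1: 3, a2: 4, a3: 2

# If a2 turns out to be unavailable, the restricted profile favours a1.
available = ["a1", "a3"]
restricted_plurality = plurality_scores(restrict(profile, available), available)
```

With the three example rankings, removing 𝑎2 changes the outcome, which is why the elicited information has to cover more than each voter's first choice.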
A Distributional Model of Preferences
•The Mallows Model: a prior distribution over the voters’ preferences.
•Reference ranking: $\sigma^* \in L_m$, for example $\sigma^* : a_1 \succ \cdots \succ a_i \succ a_{i+1} \succ \cdots \succ a_m$.
•Dispersion parameter: 𝜑 ∈ [0,1].
•Kendall’s $\tau$-distance: $d_{KT}(\sigma, \sigma^*)$ is the number of pairwise inversions between $\sigma^*$ and $\sigma$.
•$\Pr[\sigma] \propto \varphi^{d_{KT}(\sigma, \sigma^*)}$: probability decays with the number of pairwise swaps.
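•Impartial Culture (IC): $\varphi = 1$, the uniform distribution over rankings.
•Example: for $\sigma' : a_1 \succ \cdots \succ a_i \succ a_{i+t} \succ a_{i+1} \succ \cdots \succ a_m$ we have $d_{KT}(\sigma^*, \sigma') = t - 1$.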
•Assume that each ranking is sampled i.i.d. from a Mallows
distribution.
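One way to draw rankings from a Mallows distribution is the repeated insertion procedure, sketched below. The slides do not say how samples were generated, so this sampler is an assumption made for illustration, and the function names are hypothetical. Inserting the $i$-th reference candidate at position $j \le i$ creates $i - j$ inversions, so weighting that choice by $\varphi^{i-j}$ gives $\Pr[\sigma] \propto \varphi^{d_{KT}(\sigma, \sigma^*)}$.

```python
import random
from itertools import combinations

def kendall_tau(sigma, reference):
    """Number of candidate pairs that the two rankings order differently."""
    pos = {c: i for i, c in enumerate(sigma)}
    return sum(1 for x, y in combinations(reference, 2) if pos[x] > pos[y])

def sample_mallows(reference, phi, rng=random):
    """Draw one ranking by repeated insertion (assumed sampling method).

    The i-th reference candidate (1-based) is inserted at position j <= i
    with probability proportional to phi ** (i - j).
    """
    ranking = []
    for i, c in enumerate(reference, start=1):
        weights = [phi ** (i - j) for j in range(1, i + 1)]
        j = rng.choices(range(1, i + 1), weights=weights)[0]
        ranking.insert(j - 1, c)
    return ranking

reference = ["a1", "a2", "a3", "a4", "a5"]
rng = random.Random(0)
profile = [sample_mallows(reference, 0.7, rng) for _ in range(100)]

# Moving a4 directly behind a1 (i = 1, t = 3) costs t - 1 = 2 inversions.
shifted = ["a1", "a4", "a2", "a3", "a5"]
shifted_distance = kendall_tau(shifted, reference)
```

Setting `phi = 1.0` makes every insertion position equally likely, which is the impartial culture case; `phi = 0.0` always returns the reference ranking.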
Efficient Preference Elicitation
How much information do we need about the voter
preferences?
Popular heuristic - Top-k voting: each voter $i$ ranks only their top $k$ candidates, the $k$-length prefix of $\sigma_i$.
$a_1 \succ a_2 \succ \cdots \succ a_k \succ a_{k+1} \succ \cdots \succ a_m$ (only $a_1 \succ \cdots \succ a_k$ is elicited).
For Plurality: If 𝑝 = 1, 𝑘 = 1 is enough. But what if 𝑝 < 1?
For Borda: each position (except the last) awards the
candidate additional points.
Main question: what is the minimal value of $k$ needed to determine the correct winner (over the distribution of available candidates) with high probability (w.h.p.), i.e. with probability $1 - o(1)$?
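To make the question concrete for Plurality, the sketch below tallies votes from top-$k$ prefixes once the available set is known. A voter's vote is revealed only if some candidate in the elicited prefix is available; otherwise it is counted as unknown. The function names and the `unknown` counter are illustrative assumptions, not part of the talk.

```python
def topk_plurality_vote(prefix, available):
    """Return the voter's vote if the elicited prefix reveals it, else None."""
    for c in prefix:
        if c in available:
            return c
    return None

def topk_plurality_tally(profile, k, available):
    """Tally plurality votes visible from k-length prefixes."""
    tally = {c: 0 for c in available}
    unknown = 0
    for ranking in profile:
        vote = topk_plurality_vote(ranking[:k], available)
        if vote is None:
            unknown += 1          # the whole prefix is unavailable
        else:
            tally[vote] += 1
    return tally, unknown

profile = [["a1", "a2", "a3"], ["a3", "a2", "a1"], ["a2", "a1", "a3"]]
available = ["a2", "a3"]          # a1 has accepted another offer

tally_k1, unknown_k1 = topk_plurality_tally(profile, 1, available)
tally_k2, unknown_k2 = topk_plurality_tally(profile, 2, available)
```

With $k = 1$ the first voter's vote is lost and 𝑎2 and 𝑎3 appear tied; with $k = 2$ every vote is recovered and 𝑎2 wins.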
Previous Work
Preference elicitation for winner determination: top-k is one popular method studied
recently (Kalech et al. 2011, Konczak and Lang 2005, Lu and Boutilier 2011).
◦ Good empirical performance.
◦ Not a lot of theoretical guarantees.
◦ Communication complexity of various voting rules (Conitzer and Sandholm, 2005).
Candidate uncertainty:
◦ Lu & Boutilier 2011, Baldiga & Green 2011: how well does the Borda winner of the original set 𝐶 approximate the score of the true winner?
◦ Boutilier et al. 2012: Efficient candidate querying policies for winner determination.
Top-𝑘 preference elicitation – Plurality and Borda
• $m = 10$, $p = \frac{1}{2}$, $\varphi = 0.7$: relatively diffuse ($\varphi = 1$ is uniform).
• Successful prediction is high for low values of 𝑘 under high variability.
Main Results
𝑚 ≡ # of candidates, 𝑛 ≡ # of voters.

                           Adversarial          Probabilistic (IC)
Plurality, n = poly(m)     UB: k = O(log m)     UB: k = O(log m)
Plurality, n = exp(m)      LB: k = Ω(m)         Tight bound: k = Θ(log m)
Borda, n = Ω(m^3 log m)    LB: k = Ω(m)         LB: k = Ω(m / log m)

The bounds for the Borda case hold even when 𝑝 = 1.
Results (continued): Zero Elicitation
•Voting – two main uses:
1. Aggregation of information: infer underlying ground-truth based on induced votes.
2. Aggregation of preferences: select the best outcome (or ranking) – optimize societal
payoff.
•Task: Given a Mallows distribution $(\sigma^*, \varphi)$ over $m$ candidates, $n$ voters and a voting rule $r(\cdot)$, how large does $n$ need to be so that the first candidate in $\sigma^*$ is the winner with high probability?
•Results:
1. Plurality: For $n = \Omega\!\left(\frac{\log m \cdot (1 - \varphi^m)}{(1 - \varphi)^3}\right)$, the highest ranked candidate in $\sigma^*$ is the winner w.h.p.
2. Borda: For $n \ge f(\varphi) \ln m$ (with $f(\varphi) = poly(\varphi)$), the highest ranked candidate is the winner w.h.p.
Top-k Voting: Bounds for Plurality
Basic question: what is the minimal 𝑘 needed to guess the correct plurality winner w.h.p., when each candidate is available with probability 𝑝?
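Warmup: Upper bound of $O(\log m)$ for a small number of voters, $n = poly(m)$.
• Set $k = \frac{2 \log n}{\log \frac{1}{1-p}}$.
• For each voter $i \in N$, the candidates $\sigma_i(1), \sigma_i(2), \ldots, \sigma_i(k)$ are all unavailable w.p. $(1 - p)^k = 1/n^2$.
• Eliciting a $k$-length prefix guarantees (w.h.p.) that the true plurality vote of the voter is known: by a union bound over the $n$ voters, some vote is hidden with probability at most $n \cdot 1/n^2 = 1/n$.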
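The warmup bound is easy to check numerically. The sketch below computes the prefix length from the formula above (rounded up to an integer, which is an assumption since the slide leaves $k$ real-valued) and evaluates the resulting per-voter and union-bound failure probabilities; the function name is hypothetical.

```python
import math

def prefix_length(n, p):
    """k = ceil(2 log n / log(1 / (1 - p))), so that (1 - p)^k <= 1 / n^2."""
    return math.ceil(2 * math.log(n) / math.log(1 / (1 - p)))

n, p = 1000, 0.5
k = prefix_length(n, p)

# Probability that one voter's whole prefix is unavailable.
per_voter_failure = (1 - p) ** k
# Union bound over all n voters.
any_voter_failure = n * per_voter_failure
```

For 1000 voters and $p = 1/2$ this gives $k = 20$, regardless of how many candidates there are; with $n = poly(m)$ the prefix length is therefore $O(\log m)$ for constant $p$.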
Bounds for Plurality – Large Voter Populations (adversarial case)
What happens if $n = \exp(m)$?
Theorem: Must take $k = \Omega(m)$ ($p$ is a constant).
Proof sketch:
• Set of candidates $\{a, b\} \cup C$, where $C = \{c_1, \ldots, c_m\}$.
• $a$ and $b$: “close competitors”, $p = 1/2$.
• Assume: $a, b$ always available.
• Set $H = \{S \subseteq C : |S| = \frac{m}{2}\}$.
• Create three sets of rankings:
1. $V_1$: $|H| + 1$ copies of: $a \succ$ everything else …
2. $|H|$ copies of: $b \succ$ everything else …
Either $a$ or $b$ wins.
3. $V_2$: for every set $S \in H$, create two copies of: (arbitrary ranking of $S$) $\succ b \succ$ ($C \setminus S + a$).
• If $A$ is the set of available candidates, $|A| < \frac{m}{2}$ with probability roughly half ⇒ $b$ wins.
• If the prefixes preceding $b$ in $V_2$ are not visible, the prediction will be wrong with constant probability.
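The construction can be checked by brute force for small $m$. The sketch below builds the three groups of rankings for even $m$ and computes the plurality winner for a given available set. It is an illustration under assumptions: the order of "everything else" and the ranking inside each $S$ are left arbitrary on the slide and are fixed here to index order, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_profile(m):
    """Rankings over a, b and c1..cm (m even), following the three groups."""
    C = [f"c{j}" for j in range(1, m + 1)]
    H = list(combinations(C, m // 2))          # all subsets S of size m/2
    profile = []
    # Group 1: |H| + 1 voters rank a first (rest in an arbitrary fixed order).
    profile += [["a", "b"] + C for _ in range(len(H) + 1)]
    # Group 2: |H| voters rank b first.
    profile += [["b", "a"] + C for _ in range(len(H))]
    # Group 3: two voters per S rank S, then b, then C \ S, then a.
    for S in H:
        rest = [c for c in C if c not in S]
        ranking = list(S) + ["b"] + rest + ["a"]
        profile += [ranking, ranking]
    return profile

def plurality_winner(profile, available):
    """Each voter votes for their highest-ranked available candidate."""
    votes = Counter(next(c for c in r if c in available) for r in profile)
    return votes.most_common(1)[0][0]

profile = build_profile(4)
winner_all = plurality_winner(profile, {"a", "b", "c1", "c2", "c3", "c4"})
winner_few = plurality_winner(profile, {"a", "b", "c1"})
winner_none = plurality_winner(profile, {"a", "b"})
```

In the third group $b$ sits at position $m/2 + 1$, so a prefix of length at most $m/2$ never shows the votes that flow to $b$ when a whole set $S$ is unavailable, and the elicited data cannot distinguish the two outcomes.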
Plurality – Under the IC Assumption
Assume: each ranking over 𝐶 – equally likely.
Theorem: If there are exponentially many voters and the IC assumption holds, 𝑘 =
Ω(log 𝑚) is required for correct winner determination.
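Proof sketch: first, partition the preference profile $V$ and split each score $sc^P(c)$ accordingly.
◦ $V_1$: rankings with available top-k candidates (illustrated by $c_2 \succ c_1 \succ \ldots$); these give the scores $sc_1^P(c)$, $c \in C$.
◦ $V_2$: rankings with no available top-k candidates (illustrated by $c_2 \succ c_1 \succ \ldots \succ c$); these give the scores $sc_2^P(c)$, $c \in C$.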
◦ Show: If $k = o(\log m)$, the true winner is not the winner according to $V_1$.
◦ Technique:
◦ Let: $sc_1^P(c_1) - sc_1^P(c_2)$ be the observable difference between the top-k plurality winner and the second highest candidate.
◦ Also: $sc_2^P(c_2) - sc_2^P(c_1)$ is the loss in difference due to discounted votes.
◦ Show: $sc_2^P(c_2) - sc_2^P(c_1) \ge sc_1^P(c_1) - sc_1^P(c_2)$ with constant probability.
Zero Elicitation: A Maximum-Likelihood Approach for Preference Aggregation
•Provide bounds for more general prior distributions: 𝜑 < 1.
•Question: how large does the voter population need to be for a confident prediction?
•Theorem: [plurality] If $n = \Omega\!\left(\frac{\log m \cdot (1 - \varphi^m)}{(1 - \varphi)^3}\right)$, then $\sigma^*(1)$ is the true plurality winner w.h.p.
• Proof idea: Prove a concentration bound on the difference between the scores of
𝜎 ∗ (1) and any other candidate’s score. Show it is positive w.h.p for given values of
𝑛.
•Theorem: [Borda] If $n \ge f(\varphi) \ln m$, then the highest ranked candidate in the reference ranking, $\sigma^*(1)$, is the true Borda winner w.h.p., where $f(\varphi) = \bigl(8 (1+\varphi)^2 (1-\varphi)^3 + (1+\varphi)\bigr) / (1-\varphi)^7$.
•A stronger result: for the above bound on 𝑛, we can predict entire aggregate ranking.
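The plurality claim can be probed with a small simulation. The sketch below relies on a property of the Mallows model: the $j$-th candidate of $\sigma^*$ is ranked first with probability proportional to $\varphi^{j-1}$, so plurality only needs each voter's top choice. It assumes all candidates are available ($p = 1$); the function names, trial count and seed are illustrative choices, not taken from the talk.

```python
import random
from collections import Counter

def top_choice_weights(m, phi):
    """Unnormalised Pr[j-th reference candidate is ranked first] = phi^(j-1)."""
    return [phi ** j for j in range(m)]

def reference_top_wins(m, phi, n, rng):
    """True if candidate 0 (the top of sigma*) is the unique plurality winner."""
    tops = rng.choices(range(m), weights=top_choice_weights(m, phi), k=n)
    counts = Counter(tops)
    best = max(counts.values())
    unique = sum(1 for v in counts.values() if v == best) == 1
    return counts[0] == best and unique

def success_rate(m, phi, n, trials=200, seed=0):
    """Fraction of simulated elections won outright by the top of sigma*."""
    rng = random.Random(seed)
    wins = sum(reference_top_wins(m, phi, n, rng) for _ in range(trials))
    return wins / trials

concentrated = success_rate(m=5, phi=0.3, n=200)   # low dispersion, many voters
uniform = success_rate(m=5, phi=1.0, n=5)          # impartial culture, few voters
```

With low dispersion and a few hundred voters the top of $\sigma^*$ wins essentially every simulated election, while under impartial culture with a handful of voters it rarely does, in line with the $(1 - \varphi)^3$ term in the bound.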
Zero-Elicitation: Empirical Performance
Zero-Elicitation – Empirical Performance – Complete Rank Reconstruction
Conclusions
A rigorous analysis of top-𝑘 voting under both candidate uncertainty
and popular priors for preference distribution.
Plurality: If the voter population is relatively small, $k = O(\log m)$ is enough; otherwise $k = \Omega(m)$ in the worst case and $k = \Theta(\log m)$ under IC.
Borda: interesting even without the candidate uncertainty element. $k = \Omega(m)$ in the worst case, $k = \Omega\!\left(\frac{m}{\log m}\right)$ under IC.
Extreme case: zero-elicitation 𝑘 = 0 - a maximum-likelihood approach
to winner prediction.
Open problems:
• Similar analysis for other voting rules.
• Tighter bounds for Borda under IC.
• Other probabilistic models (Plackett-Luce, Mixture of Mallows).
Thank you
