
Game Theoretic Approach on Avoiding Sample Bias in Machine Learning
By Ming Gu
2013.3.20
Background
 Smart Building
◦ Sensors
◦ Actuators
Problem Statement
 We are learning a real-valued function on an input space 𝑋 based on labels from a group of agents 𝑁 = {1, 2, … , 𝑛}.
 Each agent 𝑖 ∈ 𝑁 has a private value function 𝑣𝑖(𝒙) on the input space, which is unknown to the others.
Problem Statement
 We have a hypothesis space 𝐹, where every 𝑓 ∈ 𝐹 is a function 𝑋 → 𝑅; we are searching for the optimal 𝑓.
 Accuracy is measured by a loss function 𝑙: 𝑅 × 𝑅 → 𝑅+, where 𝑙(𝑓(𝒙), 𝑦) is the loss associated with the prediction 𝑓(𝒙) when the true output is 𝑦.
Problem Statement
 Agent 𝑖's risk under 𝑓 is
◦ 𝑅𝑖(𝑓) = 𝐸[𝑙(𝑓(𝒙), 𝑣𝑖(𝒙))]
 The global risk is
◦ 𝑅𝑔𝑙𝑜𝑏𝑎𝑙(𝑓) = (1/𝑛) Σ_{𝑖=1}^{𝑛} 𝑅𝑖(𝑓)
 Every agent is selfish and aims to minimize 𝑅𝑖(𝑓), while the mechanism designer aims to minimize 𝑅𝑔𝑙𝑜𝑏𝑎𝑙(𝑓).
ERM Solution
 We obtain a training set 𝑆𝑖 = {(𝑥𝑖𝑗, 𝑦𝑖𝑗)}_{𝑗=1}^{𝑚} from each agent 𝑖; the global training set is 𝑆 = ∪_{𝑖∈𝑁} 𝑆𝑖.
 Every agent and the mechanism designer use the empirical risk as an estimate of the risk (a sketch follows below):
◦ 𝑅𝑖(𝑓, 𝑆𝑖) = (1/|𝑆𝑖|) Σ_{(𝒙,𝑦)∈𝑆𝑖} 𝑙(𝑓(𝒙), 𝑦)
◦ 𝑅𝑔𝑙𝑜𝑏𝑎𝑙(𝑓, 𝑆) = (1/|𝑆|) Σ_{(𝒙,𝑦)∈𝑆} 𝑙(𝑓(𝒙), 𝑦)
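A minimal sketch of these empirical-risk estimates in Python (the helper names and toy data are illustrative assumptions, not from the slides):

```python
# Per-agent and global empirical risk for a hypothesis f.
# Names and data here are made up for illustration.

def empirical_risk(f, sample, loss):
    """Average loss of hypothesis f over a list of (x, y) pairs."""
    return sum(loss(f(x), y) for x, y in sample) / len(sample)

absolute_loss = lambda a, b: abs(a - b)

# One training set per agent: S[i] = [(x_ij, y_ij), ...]
S = {1: [(0.0, 1.0), (1.0, 2.0)],
     2: [(0.5, 4.0), (2.0, 4.0)]}

f = lambda x: 2.0   # an example constant hypothesis
R_i = {i: empirical_risk(f, S_i, absolute_loss) for i, S_i in S.items()}
pooled = [pair for S_i in S.values() for pair in S_i]
R_global = empirical_risk(f, pooled, absolute_loss)
print(R_i, R_global)   # {1: 0.5, 2: 2.0} 1.25
```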
ERM Solution
 We are finding
◦ 𝑓 = argmin_{𝑓∈𝐹} 𝑅𝑔𝑙𝑜𝑏𝑎𝑙(𝑓, 𝑆)
◦ If more than one 𝑓 fits, choose the one with the smallest ||𝑓|| (see the sketch below).
 The question is: are the agents motivated to report untruthfully, i.e., 𝑦𝑖𝑗 ≠ 𝑣𝑖(𝑥𝑖𝑗)? If not, the mechanism is strategy-proof.
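A sketch of this ERM rule with the smallest-norm tie-break, restricted to constant hypotheses over a finite candidate grid so the search is a plain scan (the grid, the tolerance, and all names are assumptions made for illustration):

```python
# Global ERM over constant hypotheses f(x) = c, breaking ties by smallest |c|.
def erm_constant(samples, loss, candidates):
    """samples: list of (x, y) pairs; candidates: constant values to try."""
    def risk(c):
        return sum(loss(c, y) for _, y in samples) / len(samples)
    best = min(risk(c) for c in candidates)
    ties = [c for c in candidates if abs(risk(c) - best) < 1e-12]
    return min(ties, key=abs)          # smallest-norm tie-break

S = [(0.0, 0.0), (1.0, 2.0)]
f_hat = erm_constant(S, lambda a, b: abs(a - b),
                     [k / 10 for k in range(-100, 101)])
print(f_hat)   # minimizers form the interval [0, 2]; the tie-break returns 0.0
```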
ERM Solution with Absolute Loss Function
 𝑙(𝑎, 𝑏) = |𝑎 − 𝑏|
 If 𝐹 is convex, then the global-risk-minimizing mechanism is strategy-proof (a small numerical check follows below).
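A small numerical check of this claim (illustrative only, not a proof): with the absolute loss and constant hypotheses, ERM picks a median of the reported labels, and a single agent cannot lower its own true risk by misreporting. The toy data, the grid of misreports, and the smallest-norm tie-break reused from the ERM slide are assumptions made for this example.

```python
# With absolute loss and constant hypotheses, ERM returns a median of the labels.
def median_erm(labels):
    s = sorted(labels)
    lo, hi = s[(len(s) - 1) // 2], s[len(s) // 2]   # minimizers form [lo, hi]
    if lo <= 0.0 <= hi:
        return 0.0                                  # smallest-norm tie-break
    return lo if abs(lo) < abs(hi) else hi

true_values = [1.0, 2.0, 5.0]         # one point per agent
truthful_f = median_erm(true_values)  # 2.0; agent 3's true risk is |2 - 5| = 3

best_gain = 0.0
for lie in [k / 10 for k in range(-100, 101)]:   # agent 3 tries every misreport on a grid
    f = median_erm([1.0, 2.0, lie])
    gain = abs(truthful_f - 5.0) - abs(f - 5.0)  # reduction in agent 3's true risk
    best_gain = max(best_gain, gain)
print(best_gain)   # 0.0: no misreport helps, consistent with strategy-proofness
```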
ERM Solution with Other Loss Functions
 𝑙(𝑎, 𝑏) = (𝑎 − 𝑏)²
 Example (worked through below):
◦ 𝑆1 = {(𝒙𝟏, 0)}, 𝑆2 = {(𝒙𝟐, 2)}
◦ 𝐹 is all constant functions
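Working this example out (the conclusion is not spelled out on the slide but follows directly): with squared loss and constant hypotheses, the global ERM is the mean of the reported labels, and agent 1 gains by misreporting. A quick check, with the misreport −2 chosen only for illustration:

```python
# Two agents, one point each, constant hypotheses, squared loss: ERM is the mean.
def squared_erm(labels):
    return sum(labels) / len(labels)

truthful_f = squared_erm([0.0, 2.0])        # 1.0
risk_1_truthful = (truthful_f - 0.0) ** 2   # agent 1's true risk: 1.0

manipulated_f = squared_erm([-2.0, 2.0])    # agent 1 reports -2 instead of 0 -> ERM is 0.0
risk_1_lie = (manipulated_f - 0.0) ** 2     # agent 1's true risk drops to 0.0

print(risk_1_truthful, risk_1_lie)  # 1.0 0.0 -> misreporting pays, so this ERM is not strategy-proof
```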
ERM Solution with Other Loss Functions
 𝑙(𝑎, 𝑏) = |𝑎 − 𝑏|
 Example (worked through below):
◦ 𝑆 = {(𝑥1, 1), (𝑥2, 2), (𝑥3, 4), (𝑥4, 4)}
◦ 𝐹 is all constant functions
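One way to work through this example, assuming the absolute loss written above, one point per agent, and the smallest-norm tie-break from the ERM slide (the candidate grid and names are assumptions for the sketch):

```python
# Constant hypotheses on labels 1, 2, 4, 4 under the absolute loss.
labels = [1.0, 2.0, 4.0, 4.0]

def risk(c):
    return sum(abs(c - y) for y in labels) / len(labels)

grid = [k / 100 for k in range(0, 601)]        # candidate constants 0.00 .. 6.00
best = min(risk(c) for c in grid)
minimizers = [c for c in grid if abs(risk(c) - best) < 1e-9]
print(best, min(minimizers), max(minimizers))
# 1.25 2.0 4.0 -> every constant in [2, 4] is optimal; the smallest-norm tie-break picks 2.0
```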
ERM Solution with Other Loss Functions
 General Result?
Machine Learning with Payment
 Universal Equivalent
◦ Loss Function
◦ Grid Power Price
◦ Real Reward/Penalty
◦ Virtual Reward/Penalty
Machine Learning with Payment
 Basic Idea
◦ Introduce external payments to align the interests of the individual agents with those of the mechanism designer.
Machine Learning with Payment
 Basic Idea
◦ The outcome is (𝑓, 𝒑) rather than 𝑓
◦ 𝒑 = (𝑝1, … , 𝑝𝑛) is the payment vector
◦ Agent 𝑖’s utility function:
 𝑢𝑖(𝑓, 𝒑) = −𝑅𝑖(𝑓, 𝑆) − 𝑝𝑖
VCG Mechanism
 𝑝𝑖 =
◦ the optimal welfare for the other agents if agent 𝑖 were not participating, minus
◦ the welfare of the other agents under the chosen outcome
VCG Mechanism
 𝑝𝑖 = Σ_{𝑗≠𝑖} 𝑅𝑗(𝑓, 𝑆) − (𝑛 − 1) 𝑅(𝑓_{−𝑖}, 𝑆_{−𝑖}), where 𝑓_{−𝑖} = argmin_{𝑓∈𝐹} 𝑅(𝑓, 𝑆_{−𝑖})
 𝑢𝑖(𝑓, 𝒑) = −𝑅𝑖(𝑓, 𝑆) − 𝑝𝑖 = −𝑅𝑖(𝑓, 𝑆) − Σ_{𝑗≠𝑖} 𝑅𝑗(𝑓, 𝑆) + (𝑛 − 1) 𝑅(𝑓_{−𝑖}, 𝑆_{−𝑖})
 The last term does not depend on agent 𝑖's report, so agent 𝑖 maximizes 𝑢𝑖 by making the mechanism minimize Σ_{𝑗} 𝑅𝑗(𝑓, 𝑆), i.e., by reporting truthfully (a sketch of the payment computation follows below).
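A sketch of computing these VCG-style payments on a toy instance, assuming constant hypotheses, the squared loss (the case where payments matter, since squared-loss ERM alone is not strategy-proof), and one reported point per agent; all names and numbers are illustrative:

```python
# VCG-style payments for constant-hypothesis regression with squared loss.
def erm(labels):
    return sum(labels) / len(labels)            # squared-loss ERM over constants: the mean

def avg_risk(c, labels):
    return sum((c - y) ** 2 for y in labels) / len(labels)

reports = [0.0, 2.0, 7.0]                       # one label per agent
n = len(reports)
f_hat = erm(reports)                            # hypothesis trained on everyone's data

payments = []
for i in range(n):
    others = reports[:i] + reports[i + 1:]
    f_minus_i = erm(others)                     # hypothesis trained without agent i
    # p_i = sum_{j != i} R_j(f_hat, S) - (n - 1) * R(f_minus_i, S_{-i})
    p_i = sum((f_hat - y) ** 2 for y in others) - (n - 1) * avg_risk(f_minus_i, others)
    payments.append(round(p_i, 3))

print(f_hat, payments)   # 3.0 [4.5, 0.5, 8.0]: agents who pull f away from the rest pay more
```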
Future Work
 Do we have results when the data are too scarce to obtain a training set of identical size from each agent?
Future Work
 What is the result for asymmetric loss functions?
 Is it feasible to let the agents report their loss functions? And how would strategy-proofness be defined and ensured in that case?
Future Work
 What is the effect of regularization?
Future Work
 What are the results for other kinds of machine learning problems?
◦ Learning target
◦ Learning algorithm
Questions