Uncertainty and Dynamics - Nanyang Technological University

Advances in Game Theory for
Security and Privacy
Bo An (Nanyang Technological University, [email protected])
Fei Fang (Harvard University->CMU, [email protected])
Yevgeniy Vorobeychik (Vanderbilt University, [email protected])
June 26, 2017@EC’17, MIT
(slides adapted from related tutorials and talks)
More resources: http://teamcore.usc.edu/projects/security
Global Challenges for Security
Key challenges: Limited resources, surveillance
2
Stackelberg Games
Randomization: Increase Cost and Uncertainty to Attackers
 Security allocation: (i) Target weights; (ii) Opponent reaction
 Stackelberg: Security forces commit first
 Optimal allocation: Weighted random

Strong Stackelberg Equilibrium
Payoffs (defender, attacker); rows: target covered by the defender, columns: target attacked

                      Attacker
Defender        Target #1    Target #2
Target #1        4, -3        -1, 1
Target #2       -5, 5          2, -1
Game Theory for Security
 Models adaptive adversaries

Attackers change strategies as the security policy changes
 Game models can be very expressive:

Information/intelligence about relative risk, vulnerability, consequence,
desirability, etc.
 Attackers using surveillance and insider knowledge
 Complex attack and defense strategies
 Realistic models of human behavior
 Uncertainty about model parameters
 Key research issues





    Solving large scale games
    Uncertainty, robustness
    Human behavior
    Learning, planning
    ……
Game Theory for Security: Applications
Game Theory + Optimization + Uncertainty + Learning + …
Game types: Infrastructure Security Games; Green Security Games; Opportunistic Crime Games; Cyber Security Games
Application sites and partners: Coast Guard; Coast Guard: Ferry; LAX; TSA; Panthera/WWF; LA Sheriff; USC; Argentina Airport; Chile Border; India
Global Presence of Security Games Efforts
Outline
Motivating real-world applications
Game theory and security game foundations
Algorithms and some recent progress
Human behavior modeling & learning
Game theory for cyber security
Game theory for privacy
7
ARMOR: Deployed at LAX 2007
 “Assistant for Randomized Monitoring Over Routes”
 Problem 1: Schedule vehicle checkpoints
 Problem 2: Schedule canine patrols
 Randomized schedule: (i) target weights; (ii) surveillance
ARMOR-Checkpoints
ARMOR-K9
8
ARMOR Canine: Interface
9
Federal Air Marshals Service (FAMS)
Undercover, in-flight law enforcement
International flights from Chicago O’Hare
Flights (each day)
~27,000 domestic flights
~2,000 international flights
Not enough air marshals:
Allocate air marshals to flights?
10
Federal Air Marshals Service (FAMS)

Massive scheduling problem

Adversary may exploit predictable schedules

Complex constraints: tours, duty hours, off-hours
100 flights, 10 officers: 1.7 × 10^13 combinations
Overall problem: 30,000 flights, 3,000 officers
Our focus: international sector
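The size of the scheduling problem can be checked with a short sketch. It assumes identical officers and ignores tours and duty-hour constraints, so an allocation is just a choice of which flights are covered; the helper name `num_allocations` is made up for this illustration.

```python
from math import comb


def num_allocations(n_targets: int, n_resources: int) -> int:
    """Number of ways to choose which targets receive one of the identical resources.

    Simplifying assumption: no tours, duty hours, or other scheduling constraints.
    """
    return comb(n_targets, n_resources)


# 100 flights, 10 officers (the small example on the slide).
small = num_allocations(100, 10)
print(f"{small:.2e} ways to place 10 officers on 100 flights")
```

The same count grows far beyond what can be enumerated for the full 30,000-flight problem, which is why later slides turn to compact representations.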
11
IRIS: “Intelligent Randomization in International Scheduling” (Deployed 2009)
PROTECT (Boston, NY and Beyond)

US Coast Guard: Port Resilience Operational / Tactical Enforcement to Combat Terrorism

Randomized patrols; deployed in Boston, NY, etc

More realistic models of human behaviors
13
Protecting Moving Targets: Ferries

Protecting ferries with patrol boats

Staten Island Ferry: over 20 million people a year (60,000
passengers a day on weekdays)

Protecting refugee aid convoys with helicopters
14
Beyond Counterterrorism: LA Metro

LA Sheriff’s Dept (Crime suppression & ticketless travelers):
15
Beyond Counterterrorism: Other Domains

Customs and Border Protection

Cybersecurity

Environmental protection

Forest

Fish

Wildlife (Queen Elizabeth National Park, Uganda)
Normal Form Games (Strategic Form Games)
Problem/game representation:
 List of players, strategies, payoffs
 Simultaneous
 Zero-sum here but not necessary
Payoffs (Player A, Player B):

                         Player B
Player A       Paper      Stone      Scissors
Paper          0, 0       1, -1      -1, 1
Stone          -1, 1      0, 0       1, -1
Scissors       1, -1      -1, 1      0, 0
Solution Concepts
Nash Equilibrium
A (mixed) strategy for each player such that no player benefits from a unilateral deviation

               Target 1     Target 2
Target 1       1, -1        -1, 1
Target 2       -1, 1        1, -1
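A minimal sketch of the Nash condition for the 2x2 target game on this slide. It assumes the first number in each cell is the row player's payoff and the second the column player's; the function name `is_nash` is made up for this illustration. Checking pure-strategy deviations is enough, because a mixed deviation can never beat the best pure one.

```python
def is_nash(A, B, x, y, tol=1e-9):
    """Return True if mixed strategies (x, y) form a Nash equilibrium.

    A[i][j] / B[i][j]: payoff of the row / column player when row i meets column j.
    """
    rows, cols = range(len(x)), range(len(y))
    u_row = sum(x[i] * y[j] * A[i][j] for i in rows for j in cols)
    u_col = sum(x[i] * y[j] * B[i][j] for i in rows for j in cols)
    # Best payoff each player could get by deviating to a pure strategy.
    best_row = max(sum(y[j] * A[i][j] for j in cols) for i in rows)
    best_col = max(sum(x[i] * B[i][j] for i in rows) for j in cols)
    return best_row <= u_row + tol and best_col <= u_col + tol


# Payoffs from the slide: rows and columns are Target 1, Target 2.
A = [[1, -1], [-1, 1]]   # row player
B = [[-1, 1], [1, -1]]   # column player

print(is_nash(A, B, [0.5, 0.5], [0.5, 0.5]))  # uniform mixing by both players
print(is_nash(A, B, [1.0, 0.0], [1.0, 0.0]))  # both always pick Target 1
```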
Stackelberg Game: Non-Simultaneous Moves
 What is the Nash equilibrium if it were a simultaneous move game?
 What if not simultaneous move game:

    Alex [Leader] commits to a strategy first
    Bob [Follower] optimizes against the leader’s fixed strategy

Payoffs (Alex, Bob):

               Bob
Alex        c        d
a          2,1      4,0
b          1,0      3,2

Nash Equilibrium: <a,c>
What if leader (Alex) commits to “b”? What will be Bob’s response?
Stackelberg Game: Non-Simultaneous Moves
 Leader (Alex) commits to uniform random strategy {.5,.5}
 Follower (Bob) plays d
 Leader commitment payoff: 3.5 > 2 (the leader’s payoff in the Nash equilibrium <a,c>)
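The commitment numbers for the Alex and Bob game can be reproduced with a small sketch. The payoffs come from the slide; the function name `follower_response` and the tie-breaking tolerance are choices made for this illustration. Ties are broken in the leader's favor, as in a strong Stackelberg equilibrium.

```python
# Payoffs from the slide, keyed by (leader action, follower action).
LEADER = {("a", "c"): 2, ("a", "d"): 4, ("b", "c"): 1, ("b", "d"): 3}
FOLLOWER = {("a", "c"): 1, ("a", "d"): 0, ("b", "c"): 0, ("b", "d"): 2}


def follower_response(p_a):
    """Leader plays a with probability p_a and b otherwise.

    Returns the follower's best response and the leader's expected payoff,
    breaking follower ties in favor of the leader.
    """
    mix = {"a": p_a, "b": 1.0 - p_a}
    f_util = {f: sum(mix[l] * FOLLOWER[(l, f)] for l in mix) for f in "cd"}
    l_util = {f: sum(mix[l] * LEADER[(l, f)] for l in mix) for f in "cd"}
    best = max(f_util.values())
    candidates = [f for f in "cd" if abs(f_util[f] - best) < 1e-12]
    choice = max(candidates, key=lambda f: l_util[f])
    return choice, l_util[choice]


print(follower_response(1.0))  # leader commits to a
print(follower_response(0.0))  # leader commits to b
print(follower_response(0.5))  # uniform random commitment from the slide
```

Committing to b alone already makes Bob switch to d; mixing then pulls the leader's payoff further up as long as d remains Bob's best response.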
First mover advantage in Stackelberg Games
 Leader can commit to mixed strategy
 Not play simultaneous move Nash equilibrium
 “Strong Stackelberg equilibrium”

Break ties in favor of the defender
 Leader’s payoff may improve over Nash
Heinrich Freiherr von Stackelberg
1905-1946
21
Security Games
 Two players

Defender

Attacker
 Set of targets: T
 Set of resources: R

Defender assigns resources to protect targets

    Attacker chooses one target to attack
 Payoffs define the reward/penalty for each player for a successful or unsuccessful attack on each target
    Not always zero-sum
    For the defender, an attack on a defended target is better than an attack on the same target when it is undefended
Stackelberg Equilibrium Formulations
Checkpoint at LAX: 8 roads, 3 checkpoints, 8 terminals
56 pure strategies for defender
8 pure strategies for attacker

Payoffs (defender, attacker); rows: roads with checkpoints, columns: terminal attacked

                Terminal #1    Terminal #2    …    Terminal #8
Road 1,2,3      5, -3          2, -3          …    -6, 5
Road 1,2,4      5, -3          2, -3          …    -5, 5
…               -1, 1          …              …    …
Road 6,7,8      …              …              …    3, -8
Algorithm 1: Multiple LPs
[Conitzer & Sandholm 2006]
 Solve for 1 adversary action at a time
[Figure: LAX checkpoint payoff matrix (roads × terminals), annotated with “Defender’s Payoff” and “Adversary’s Payoff”]
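A minimal sketch of the "one adversary action at a time" idea, applied to the 2x2 payoff table from the Strong Stackelberg Equilibrium slide. It is not the general algorithm: with only two defender pure strategies the mixed strategy is a single probability p, so each LP collapses to maximizing a linear function over an interval and can be solved at the endpoints without an LP solver. The function name `solve_multiple_lps_2rows` is made up for this illustration.

```python
def solve_multiple_lps_2rows(D, A):
    """D[i][j], A[i][j]: defender / attacker payoff for defender row i, attacker column j.

    Assumes exactly two defender rows; p is the probability of playing row 0.
    Returns (defender value, p, attacked column).
    """
    n_att = len(D[0])
    best = None
    for j in range(n_att):              # one "LP" per attacker action j
        lo, hi, feasible = 0.0, 1.0, True
        for k in range(n_att):
            if k == j:
                continue
            # Attacker prefers j to k  <=>  const + slope * p >= 0.
            slope = (A[0][j] - A[1][j]) - (A[0][k] - A[1][k])
            const = A[1][j] - A[1][k]
            if slope > 0:
                lo = max(lo, -const / slope)
            elif slope < 0:
                hi = min(hi, -const / slope)
            elif const < 0:
                feasible = False
        if not feasible or lo > hi:
            continue
        for p in (lo, hi):              # linear objective: optimum at an endpoint
            value = p * D[0][j] + (1 - p) * D[1][j]
            if best is None or value > best[0]:
                best = (value, p, j)
    return best


# Rows: defender covers Target #1 / Target #2; columns: attacked target.
D = [[4, -1], [-5, 2]]
A = [[-3, 1], [5, -1]]
value, p, attacked = solve_multiple_lps_2rows(D, A)
print(f"cover Target #1 with probability {p:.2f}; attacker hits column {attacked}")
```

Taking the best value across the per-action problems is what makes the tie-break favor the defender: at the indifference point both attacker actions are feasible, and the one that is better for the defender wins.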
Algorithm 2: DOBSS MILP
 Objective: defender’s expected utility
 Variables: defender’s strategy x, attacker’s strategy q
 Inputs: payoffs
 Stackelberg equilibrium constraints: q is a “best-response” to x
 Requires enumeration of all pure strategies
Efficient Algorithms
Challenges: combinatorial explosions due to:
 Defender strategies: Allocations of resources to targets
    E.g., 100 flights, 10 FAMS
 Adversary types: Adversary strategy combination
 Attacker strategies: Attack paths
    E.g., multiple attack paths to targets in a city
 Basic approaches like Multiple-LPs do not scale
 Idea: exploit problem structure
    Compact representations
    Techniques for large-scale optimization (e.g., strategy generation)
28
Scale Up in Number of Defender Strategies [2009]
Marginals of Mixed Strategies: Target Independence (IRIS)
ARMOR: 10 flights, 3 air marshals; 120 actions (flight combos 1,2,3; 1,2,4; 1,2,5; …; 8,9,10 with probabilities x123, x124, x125, …)
Payoff duplicates: Depends on target covered
\[
\max_{x,q} \; \sum_{i \in X} \sum_{l \in L} \sum_{j \in Q} p^{l} R^{l}_{ij} \, x_i \, q^{l}_{j}
\]
\[
\text{s.t.} \quad \sum_{i \in X} x_i = 1, \qquad \sum_{j \in Q} q^{l}_{j} = 1
\]
\[
0 \le \Big( a^{l} - \sum_{i \in X} C^{l}_{ij} x_i \Big) \le (1 - q^{l}_{j}) M
\]
\[
x_i \in [0 \ldots 1], \qquad q^{l}_{j} \in \{0, 1\}
\]
Compact: 10 actions, one per flight 1, 2, 3, …, 10, with probabilities y1, y2, y3, …, y10
“Marginals”: 10 variables in MIP:
y1 = x123+x124+x125…
y1+y2+y3…=3
Sample y (loses combination info)
Scale Up in Number of Defender Strategies [2009]
Marginals of Mixed Strategies: Target Independence (IRIS)
ARMOR: 10 flights, 3 air marshals
Payoff duplicates: Depends on target covered

ARMOR actions:
Action    Flight combos    Prob
1         1,2,3            x123
2         1,2,4            x124
3         1,2,5            x125
…         …                …
120       8,9,10           …

Payoffs:
             Attack 1    Attack 2    Attack …    Attack 6
1,2,3        5,-10       4,-8        …           -20,9
1,2,4        5,-10       4,-8        …           -20,9
1,3,5        5,-10       -9,5        …           -20,9
…            …           …           …           …

Compact actions:
Action    Flight    Prob
1         1         y1
2         2         y2
3         3         y3
…         …         …
10        10        y10

“Marginals”: 10 variables in MIP
Sample y (loses combination info)
Sampling difficult if constraints on combinations; interacting tours
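The relation between a mixed strategy over flight combinations and its marginals can be sketched as follows. Both mixed strategies below are hypothetical examples chosen for illustration (they are not from the slides), and `marginals` is a made-up helper name.

```python
from itertools import combinations


def marginals(mixed, n_flights):
    """y[k] = total probability of the combos that cover flight k + 1."""
    y = [0.0] * n_flights
    for combo, prob in mixed.items():
        for flight in combo:
            y[flight - 1] += prob
    return y


# All ways to put 3 air marshals on 10 flights: the 120 ARMOR actions.
combos = list(combinations(range(1, 11), 3))

# Hypothetical strategy 1: uniform over all 120 combos.
uniform = {c: 1.0 / len(combos) for c in combos}

# Hypothetical strategy 2: uniform over the 10 cyclic triples (1,2,3), (2,3,4), ..., (10,1,2).
cyclic = {tuple((k + d) % 10 + 1 for d in range(3)): 0.1 for k in range(10)}

y_uniform = marginals(uniform, 10)
y_cyclic = marginals(cyclic, 10)
print(len(combos), round(sum(y_uniform), 6), round(sum(y_cyclic), 6))
```

The two strategies differ in which flights are covered together, yet they yield the same marginal vector. That is the combination information lost when only y is kept, and the reason sampling from y needs care once tours and other constraints on combinations matter.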
ORIGAMI
 Are there any cases of security games that can be solved in polynomial time?
 Yes!
 Restricted class of security games:
    At most one resource per target
    No scheduling restrictions
    All resources are identical
 For this class of security games, ORIGAMI is a polynomial-time algorithm
 Can be used as a heuristic in more complex algorithms
ORIGAMI Example
Four targets, one defender

                          t1               t2               t3               t4
                      Cover  Uncov     Cover  Uncov     Cover  Uncov     Cover  Uncov
Defender’s utility      1      0         3      0         7      0         5      0
Attacker’s utility      0      1         0      2         0      3         0      4
ORIGAMI
Four targets, one resource, zero sum
Attacker payoffs (targets ordered by uncovered payoff): uncovered 4, 3, 2, 1; covered 0, 0, 0, 0
Coverage probability: 0, 0, 0, 0
ORIGAMI
Attack set: set of targets with maximal expected payoff for the attacker
Coverage probability: 0, 0, 0, 0
ORIGAMI
Observation 1: It never benefits the defender to add coverage outside the attack set.
Coverage probability: 0, 0, 0.5, 0
ORIGAMI
Compute coverage necessary to make attacker indifferent between 3 and 4
Coverage probability: 0.25, 0, 0, 0
ORIGAMI
Observation 2: It never benefits the defender to add coverage to a subset of the attack set.
Coverage probability: 0.5, 0, 0, 0
ORIGAMI
Coverage probability: 0.5, 0.33, 0, 0
ORIGAMI
Need more than one resource.
Coverage probability: 0.75, 0.66, 0.5, 0
ORIGAMI
Can still assign 0.17
Coverage probability: 0.5, 0.33, 0, 0
ORIGAMI
Allocate all remaining coverage to flights in the attack set
Fixed ratio necessary for indifference
Coverage probability: 0.54, 0.38, 0.08, 0
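The walk-through above can be reproduced with a simplified sketch of the attack-set idea. It is an illustration under stated assumptions, not the published ORIGAMI code: it assumes every target's covered attacker payoff is strictly below its uncovered payoff, and the function name `origami_sketch` is made up here.

```python
def origami_sketch(unc, cov, m):
    """Grow the attack set and spread m units of coverage over it.

    unc[t], cov[t]: attacker payoff when target t is uncovered / covered.
    Returns (coverage per target, attacker's expected payoff on the attack set).
    """
    n = len(unc)
    order = sorted(range(n), key=lambda t: unc[t], reverse=True)
    attack_set, level = [], None
    for k in range(1, n + 1):
        attack_set = order[:k]
        # The attacker's payoff cannot drop below the largest covered payoff.
        floor = max(cov[t] for t in attack_set)
        nxt = unc[order[k]] if k < n else floor
        level = max(nxt, floor)
        # Coverage needed to push every target in the attack set down to `level`.
        need = sum((unc[t] - level) / (unc[t] - cov[t]) for t in attack_set)
        if need > m:
            # Not enough resources: spend all m while keeping the attacker indifferent.
            a = sum(unc[t] / (unc[t] - cov[t]) for t in attack_set)
            b = sum(1.0 / (unc[t] - cov[t]) for t in attack_set)
            level = (a - m) / b
            break
        if level == floor:
            break  # some target is fully covered; more coverage cannot help
    coverage = [0.0] * n
    for t in attack_set:
        coverage[t] = (unc[t] - level) / (unc[t] - cov[t])
    return coverage, level


# Slide example: attacker payoffs 1, 2, 3, 4 if uncovered, 0 if covered, one resource.
coverage, level = origami_sketch([1, 2, 3, 4], [0, 0, 0, 0], 1)
print([round(c, 2) for c in coverage], round(level, 3))
```

The returned coverage is indexed by target t1 to t4, so it lists the slide's final values in reverse order, since the slides sort targets by decreasing uncovered payoff.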
Challenge
 Need to sample
 Sampling not necessarily possible / optimal in complex
domains
 Dealing with scheduling constraints
42
IRIS: Federal Air Marshals Service [2009]
Scale Up Number of Defender Strategies
 1000 flights, 20 air marshals: 10^41 combinations
    ARMOR out of memory
 Do not enumerate all combinations:
    Branch and price: branch & bound + column generation
[Figure: pure strategies labeled Strategy 1 through Strategy 6]
IRIS: Scale Up Number of Defender Strategies [2009]
Small Support Set for Mixed Strategies
Small support set size:
 Many x_i variables zero
1000 flights, 20 air marshals: 10^41 combinations
\[
\max_{x,q} \; \sum_{i \in X} \sum_{j \in Q} R_{ij} \, x_i \, q_j
\]
\[
\text{s.t.} \quad \sum_{i \in X} x_i = 1, \qquad \sum_{j \in Q} q_j = 1
\]
\[
0 \le \Big( a - \sum_{i \in X} C_{ij} x_i \Big) \le (1 - q_j) M
\]
\[
x_i \in [0 \ldots 1], \qquad q_j \in \{0, 1\}
\]

Prob            Flights     Attack 1    Attack 2    …    Attack 1001
x123=0.0        1,2,3..     5,-10       4,-8        …    -20,9
x124=0.239      1,2,4..     5,-10       4,-8        …    -20,9
x135=0.0        1,3,5..     5,-10       -9,5        …    -20,9
x378=0.123      …           …           …           …    …

10^41 rows
IRIS: Incremental Strategy or Column Generation
Exploit Small Support
Master:

             Attack 1    Attack 2    Attack …    Attack 6
1,2,4        5,-10       4,-8        …           -20,9

Slave (LP Duality Theory): best new pure strategy, found as a minimum cost network flow
[Figure: flow network from Resource through targets (Target 3, Target 7, …) to Sink]
Master with the new strategy added:

             Attack 1    Attack 2    Attack …    Attack 6
1,2,4        5,-10       4,-8        …           -20,9
3,7,8        -8,10       -8,10       …           -8,10

Converge: 500 rows, NOT 10^41
Strategy Generation Revisited
Fully connected Graph
20 intersections, 190 roads
5 resources, 1 target
~ 2 billion defender allocations
6.6 quintillion (10^18) attacker paths
Real Problem:
~30,000 intersections
~70,000 roads
46
Strategy Generation for Both Players
[Diagram: Master; Slave: Defender; Slave: Attacker]
Properties:
■ NP-Hard (reduction from Set-Cover)
■ Sub-modular
Algorithms:
■ Optimal MILP
■ Heuristic algorithm (greedy heuristic)
Strategy Generation for Both Players
[Diagram: Master; Heuristic Slave: Defender with outcomes “Useful: Yes” and “Useful: No”; Slave: Defender; Slave: Attacker]
Summary of Insights
 Compact representation of security games
 Large-scale optimization techniques for scaling up # actions
    Column / cut generation
    Subproblem: network flow
 Compact representation of strategy space as marginals
Some Recent Progress (2015-)
 Dynamic resource allocation [IJCAI’15]
 Optimally monitoring potential terrorists [AAAI’16a]
 Coalitional security games [AAMAS’16]
 Protection externality [AAAI’15,17a]
 Interdict illegal network flow [IJCAI’16a,17a]
 Strategic Secrecy in Security Games [IJCAI’17b]
 Adversarial Machine Learning [IJCAI’17c]
 Mitigate sequential spear phishing [AAAI’16b]
 Protect elections [IJCAI’16b]
 Protect coral reef ecosystems [IJCAI’16c]
 Optimal defense against man-in-the-middle attack [AAAI’17b, IJCAI’17e]
 Efficient container inspection [AAMAS’17]
55
Protecting Large Public Events [AAAI’14]
 Target value changes over time
    Utility of attacking a target decreases with # of protecting resources
 Idea: Dynamically allocate security resources
    Transfer resources at any time
    A resource in transfer is not protecting any target
[Figure: Boston Marathon bombings; four numbered targets (1 to 4) with varying target value v_i(t)]
SCOUT-A: Negligible Transfer Time
 Context: Resources can be transferred quickly
 Find the minimax assignment of resources at each time point
    Infeasible to find the minimax assignment at each time point since time is continuous
Example: 2 targets (T1, T2), 2 resources

Resources    Attacker utility at T1    Attacker utility at T2
0            v1(0)                     v2(0)
1            v1(0) / e^λ               v2(0) / e^λ
2            v1(0) / e^{2λ}            v2(0) / e^{2λ}

[Figure: target values v1(t), v2(t) over time t from 0 to 5; minimax assignment at time 0]
SCOUT-A: Negligible Transfer Time (2)
 ‘Minimax assignment’ does not change continuously
 SCOUT-A computes the time point at which a minimax assignment ‘expires’, then finds the ‘next’ minimax assignment
[Figure: attacker utility over time t from 0 to 5, with curves v1(t), v1(t) / e^λ, v1(t) / e^{2λ}, v2(t), v2(t) / e^λ, v2(t) / e^{2λ}, and the successive assignments of resources to T1 and T2]
SCOUT-C: Nonzero Transfer Time
Key Theorem
 For any game with continuous defender strategy space, we can construct an equivalent game with discrete defender strategy space
Initial Game: transfer at any time in [0, t_e]
Constructed Game: transfer at discretized points in [0, t_e]
The equilibrium of the constructed game is also an equilibrium of the initial game
Mixed Strategy EQ for Dynamic Payoff Games [IJCAI’15]
 Dynamic Payoffs: Targets’ importance changes
[Figure: importance over time for two targets, Bar street and Museum]
 Mixed strategy
    Distribution over pure strategies (Infinite & Continuous support set)
 Mixed strategy: Compact representation
    Coverage functions c_i(t): Probability of protecting target i at time t
Pure strategy 1 (0.5)
Target 1: 6:00 – 12:00
Target 2: 12:00 – 18:00
Pure strategy 2 (0.5)
Target 1: 6:00 – 10:00
Target 2: 10:00 – 18:00
\[
c_1(t) =
\begin{cases}
1, & t \in [6{:}00, 10{:}00) \\
0.5, & t \in [10{:}00, 12{:}00) \\
0, & t \in [12{:}00, 18{:}00]
\end{cases}
\]
Theorem: Coverage functions can be implemented by sampling pure strategies with finite transfers
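The coverage function for the two pure strategies on this slide can be computed directly. This is a minimal sketch: the data layout and the name `coverage` are choices made for this illustration, and time intervals are treated as half-open [start, end), so the closing instant 18:00 itself counts as uncovered here.

```python
# Each entry: (probability, schedule), where a schedule lists (target, start_hour, end_hour).
PURE_STRATEGIES = [
    (0.5, [(1, 6, 12), (2, 12, 18)]),   # Pure strategy 1
    (0.5, [(1, 6, 10), (2, 10, 18)]),   # Pure strategy 2
]


def coverage(target, t):
    """c_target(t): probability that the resource protects `target` at hour t."""
    return sum(
        prob
        for prob, schedule in PURE_STRATEGIES
        for tgt, start, end in schedule
        if tgt == target and start <= t < end
    )


for hour in (8, 11, 13):
    print(hour, coverage(1, hour), coverage(2, hour))
```

At every hour in the window the two coverage values sum to 1, since the single resource is always at one of the two targets in both pure strategies (transfer time is ignored in this example).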
ADEN: Nonzero Transfer Time
 A resource may be in transfer at time t (COCO cannot work)
 Approach: Discretize game by an (ε, δ)-mesh
[Figure: value functions v_i(t) and v_1(t) sampled at mesh points t_k and t_{k+1}, with annotations “≤ ε” and “δ”]
Theorem: Discretizing game by (ε, δ)-mesh leads to a loss of at most ε
[Figure: bridge game graph over targets i and time indices k, k+1, k+a_ij, with source s and “stay” / “transfer” edges; the weight of point (k,i) is the value of target i at time t_k in the bridge game, v'_i(t_k)]
Theorem: A mixed strategy corresponds to an m-unit fractional flow
Complexity: O( te n 2 ) (FPTAS for Lipschitz continuous value functions )
61
Detecting Terrorist Plots [AAAI’16a]
 Paris Shootings on January 7, 2015
    Two Kouachi brothers stormed into the Paris office of Charlie Hebdo and gunned down 12 people
    A few hours later, Amedy Coulibaly killed a policewoman in Montrouge and four hostages at a kosher supermarket in east Paris
    A coordinated plot planned by al-Qaeda in the Arabian Peninsula (AQAP)
[Photos: Saïd Kouachi, Chérif Kouachi, Amedy Coulibaly, Charlie Hebdo]
 Monitoring potential terrorists!
 Terrorist planners (e.g., ISIS):
    Arouse a connected subgroup of the terrorist network
    Conduct surveillance
 Security agencies (e.g., DGSE)
    Decide how to allocate limited security resources to monitor the terrorists
TPD: Utility Function and Equilibrium
 Zero-sum Game:
    If overlap, defender wins. Example: U_a = U_d = 0
    If not, attacker succeeds. Example: U_a = P(A_{42}), U_d = -P(A_{42})
[Figure: example network with nodes 1 to 6]
 Attacker utility of choosing subgraph A:
\[
P(A) = \sum_{v \in A} \Big( t_v + \delta \sum_{u \in N_A(v)} t_u \Big)
\]
where N_A(v) is the neighborhood of v in subgraph A and δ is the extent of positive network externality
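The attacker utility P(A) can be evaluated with a short sketch. The graph, the per-node values t_v, and δ = 0.5 below are hypothetical numbers chosen for illustration, and `subgraph_value` is a made-up name; only the formula itself comes from the slide.

```python
def subgraph_value(nodes, edges, t, delta):
    """P(A) = sum over v in A of ( t[v] + delta * sum of t[u] for neighbors u of v inside A )."""
    nodes = set(nodes)
    # Neighborhoods restricted to the chosen subgraph A.
    adj = {v: set() for v in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    return sum(t[v] + delta * sum(t[u] for u in adj[v]) for v in nodes)


# Hypothetical network: node 2 is connected to nodes 1, 3 and 4.
EDGES = [(1, 2), (2, 3), (2, 4)]
T = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}

print(subgraph_value({2, 4}, EDGES, T, delta=0.5))
print(subgraph_value({1, 2, 3}, EDGES, T, delta=0.5))
```

Because each node's contribution grows with the number of its neighbors that are also in A, larger connected subgroups are worth more than the sum of their individual values. This is the positive network externality that δ scales.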
63
DO-TPD: Overview
1. Start from a small strategy space <S’,A’>
2. Solve the restricted game LP(S’,A’)
3. Defender Oracle: find a new strategy set S+ to improve the solution (betterO-D; if it finds no solution, bestO-D)
4. Attacker Oracle: find a new strategy set A+ to improve the solution (betterO-A; if it finds no solution, bestO-A)
5. A+ or S+ found? Yes: S’=S’ U S+, A’=A’ U A+, and return to step 2. No: global optimality reached, END
Theorems:
• BestO-D/bestO-A is NP-hard;
• BetterO-A/betterO-D guarantees a (1-1/e)-approximation ratio
Coalitional Security Games [AAMAS’16]
al-Qaeda
Other terrorist
groups
“The interconnected nature of terrorist organizations necessitates
that we pursue them across the geographic spectrum to ensure that all
linkages between the strong and the weak organizations are broken,
leaving each of them isolated, exposed, and vulnerable to defeat.”
-CIA
National Strategy for Combating Terrorism
65
CSG: Strategies, Utility and Equilibrium
 Strategies
    Defender: blocking a set of edges
    Attacker: playing coalitional game and forming a coalitional structure
 Utility
    Coalition value: capability of attacking targets (knapsack problem)
[Figure: example with coalitions C1 and C2, v(C1) = 20, v(C2) = 10; blocking costs: 10]
CSG: Branch and Price Algorithm
Theorems:
• CSG problem is MAX SNP-hard
• Our algorithm achieves constant factor approximation
Branch and bound: solving integer program
    LP relaxation at each node gives UB; branch on variables (B12 = 0 / B12 = 1, B13 = 0 / B13 = 1)
    Branching if LB<UB & B* fractional; pruning if LB≥UB or B* integral
Column generation: solving large-scale LP relaxation
    Master problem: LP, solve to optimality
    Interior Point Stabilization: LPs, get interior dual solution
    Slave problem: Bilevel MILP, optimal value r*; solved by LP relaxation, greedy (polynomial-time), or single MILP
    Test r* < 0 (YES / NO): either ADD COLUMN to the master problem, or GET B* & LB
Security Games with Protection Externalities [AAAI’15]
 Protection Externalities (PE)
    One resource protects multiple targets simultaneously
    Many real-world scenarios: ferry ship…
 NP-hard to compute the equilibrium
    Reduction from Set Cover
 A Column Generation based solution algorithm: SPE
    ① N target-defined LPs (t-LP)
    ② Column Generation for t-LP
        A MILP formulation for the slave procedure
        A constant-factor greedy approach to speed up
    ③ An upper bound LP for pruning (u-LP)
[Figure: u-LP 1, u-LP 2, …, u-LP N are used for pruning; t-LP 1, t-LP 2, …, t-LP N are each solved by CG; the result is max_i t-LP i]
SPE on a Plane [AAAI’17a]
 A more realistic setting
    A planar topology
    Resource allocation in a continuous space (not restricted to targets)
 Easier? Harder?
    NP-hard: reduction from Euclidean Disc Cover (can n given points on a plane be covered by m identical disks?)
    Inapproximable: no PTAS unless P=NP
    Approximable for zero-sum games: a PTAS
A PTAS based on Grid Shifting
 Divide the plane with a grid 𝒢1
 Shift 𝒢1 to obtain grids 𝒢2 , … , 𝒢l
 A new pure strategy space S in which each pure strategy fits into one grid (not overlapping grid lines)
 Compute x∗, the optimal defender strategy under S
    (Polynomial time computable: ellipsoid method + dynamic programming for the separation oracle)
 U(x∗) is a (1 − 2/l)-approximation to the optimal solution under the original strategy space
Non-zero-sum games
 Inapproximable in general but solutions with guaranteed quality can be obtained efficiently if:
    The game is quasi-zero-sum (as is in most real scenarios)
    Solutions are restricted to robust ones
Network Flow Interdiction Games [IJCAI’16]
 Drug Smuggling
    Illegal drug trafficking is a worldwide issue
    Checkpoints are operated to prevent drug trafficking
 Challenges
    Limited security resources
    Strategic smugglers
    Exponential-sized strategy spaces
 Solutions
    Network flow interdiction game model
    Column and constraint generation algorithm
[Figure: example flow network with nodes s, a, b, c, d, t and edge capacities 16, 13, 12, 10, 4, 9, 14, 7, 20, 4]
Repeated Network Interdiction Problems [IJCAI’17a]
 Limitation of Existing Models
    Human adversaries are not fully rational
    Defender has little prior knowledge of the adversary
    Interactions between agents are frequent
 Repeated Network Interdiction Game
 Online Learning Framework
    Transformation to Online Linear Optimization
    Exploration: estimating the adversarial flow with an unbiased estimator
    Exploitation: applying FPL on the estimated adversarial flow sequence
    SBGA algorithm
[Figure: example flow network with nodes s, a, b, c, d, t, annotated with “6 units interdicted” and “5 units interdicted”]
SBGA
Input: Estimated flow sequence f^1, f^2, …, f^{t−1}
 Exploration
    Unbiased estimator
 Exploitation
    Follow the perturbed leader (FPL)
    Perturbed by a random noise vector z
Output:
\[
w^{t} = \arg\max_{w} \; w \cdot \big( f^{1} + \cdots + f^{t-1} + z \big)
\]
 Theoretical Guarantees
Theorem: With proper learning parameters, we have: R_T(SBGA) = O(T^{2/3}).
Theorem: With proper learning parameters, we have: δ · OPT − Reward(SBGA) ≤ O(T^{2/3}), where OPT is the utility of the optimal adaptive defender policy and δ is a constant depending on the interdiction probabilities.
[Figure: example flow network annotated with “6 units interdicted” and “5 units interdicted”; labels: greedy algorithm, convex optimization, 1 − 1/e approximation]
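The exploitation step, w^t = argmax_w w · (f^1 + … + f^{t−1} + z), can be sketched as below. This shows only the follow-the-perturbed-leader choice, not SBGA as a whole: the explicit list of candidate interdiction vectors, the uniform noise distribution, and the name `fpl_choice` are all assumptions made for this illustration.

```python
import random


def fpl_choice(actions, past_flows, noise_scale, rng):
    """Pick the action w maximizing w . (f^1 + ... + f^(t-1) + z).

    actions: candidate interdiction vectors w (enumerated explicitly here, which is
             only workable for tiny examples).
    past_flows: estimated flow vectors f^1, ..., f^(t-1), one entry per edge.
    noise_scale: each noise entry z_e is drawn uniformly from [0, noise_scale]
                 (an assumed distribution, not taken from the slides).
    """
    dim = len(actions[0])
    cumulative = [sum(f[e] for f in past_flows) for e in range(dim)]
    z = [rng.uniform(0.0, noise_scale) for _ in range(dim)]
    perturbed = [c + n for c, n in zip(cumulative, z)]
    return max(actions, key=lambda w: sum(wi * pi for wi, pi in zip(w, perturbed)))


# Hypothetical setting: three edges, the defender interdicts exactly one of them.
ACTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
FLOWS = [(6, 0, 5), (6, 4, 0)]          # estimated flows from two earlier rounds
rng = random.Random(0)

print(fpl_choice(ACTIONS, FLOWS, noise_scale=0.0, rng=rng))   # no noise: plain follow-the-leader
print(fpl_choice(ACTIONS, FLOWS, noise_scale=20.0, rng=rng))  # noisy choice, may differ
```

The random perturbation is what keeps the defender from being fully predictable to an adaptive adversary, at the price of sometimes not playing the empirically best action.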
Strategic Secrecy in Security Games [IJCAI’17b]
 Dilemma of Plainclothes
    How to explain the frequent use of plainclothes in practice?
 Strategic Secrecy vs. Commitment
    Strategic secrecy: deceptive
    Commitment: transparent, defender’s private information is revealed
 Perfect Bayesian Equilibrium (PBE)
    Compared with: Nash (simultaneous move) and Stackelberg (leader’s advantage)
 Computing PBE
    Zero sum and general sum: compute PBEs with special structures; support set enumeration; Mixed Integer Linear Programming
 PBE vs SSE
    Theorem: For zero-sum games, PBE is no worse than SSE.
    Theorem: For certain payoff structures, there exists PBE no better than SSE.
Some Other Recent Works
Cyber Security [AAAI’16b,17b, IJCAI’17e]
Label Contamination Attack [IJCAI’17c]
Protect Elections [IJCAI’16b]
Coral Reef [IJCAI’16c], Container [AAMAS’17]
75
Cyber Security for Smart Traffic Control
 Cyber Attacks

E.g., hack the sensors and send fake data to the controller,
remotely hack into the controller and take control.

Example from Hollywood movies (e.g., The Italian Job)

Real-world example
 Israeli students attacked the Waze app with a fake traffic jam in 2014
 STRICT: Secure TRaffIc ConTrol

Identification, defence

Complex system, heterogeneous interaction, dynamic
A game theoretic defense mechanism against data poisoning attacks
 Attacker can poison sensor data
 A verification based online defense mechanism
Optimal escape interdiction on transportation networks [IJCAI’17d]
 A defender-attacker security game model
 Dynamically relocate security resources (e.g., police cars) to interdict attacker
