MOBIHOC05-netintegrity - Network Research Lab

A Secure Ad-hoc Routing Approach
using Localized Self-healing Communities
Jiejun Kong, *Xiaoyan Hong, Yunjung Yi,
Joon-Sang Park, *Jun Liu,
Mario Gerla
WAM Laboratory
Computer Science Department
University of California, Los Angeles
{jkong,yjyi,jspark,gerla}@cs.ucla.edu
*Computer Science Department
University of Alabama, Tuscaloosa
{jliu,hxy}@cs.ua.edu
Problem Statement

RREQ flooding attack by non-cooperative members
(selfish or intruded member nodes)

Direct RREQ floods
– Non-cooperative members continuously generate RREQ
– RREQ rate limiting & packet suppression needed
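
As a minimal sketch of what "rate limiting plus packet suppression" could look like at a forwarding node, the Python below keeps a sliding window of accepted RREQs per originator and drops duplicates of floods it has already forwarded. The class name, the `max_rreq` budget, and the `window` length are illustrative assumptions, not values taken from the slides or from DSR/AODV.

```python
from collections import defaultdict, deque


class RreqFilter:
    """Per-originator RREQ rate limit plus duplicate suppression (illustrative)."""

    def __init__(self, max_rreq=3, window=1.0):
        self.max_rreq = max_rreq            # assumed budget: RREQs allowed per window
        self.window = window                # assumed window length in seconds
        self.seen = set()                   # (originator, rreq_id) already forwarded
        self.history = defaultdict(deque)   # originator -> times of accepted RREQs

    def should_forward(self, originator, rreq_id, now):
        key = (originator, rreq_id)
        if key in self.seen:
            return False                    # duplicate of a flood already forwarded
        times = self.history[originator]
        while times and now - times[0] >= self.window:
            times.popleft()                 # forget RREQs outside the sliding window
        if len(times) >= self.max_rreq:
            return False                    # originator exceeded its RREQ budget
        times.append(now)
        self.seen.add(key)
        return True
```

A filter like this only addresses direct floods; it cannot tell whether a fresh RREQ was provoked indirectly by RREP or DATA loss.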

Indirect RREQ floods
– RREP & DATA packet loss
• Caused by rushing attack etc. [Hu et al.,WiSe’03]
– Indirectly trigger more RREQ floods
• Don’t blame the RREQ initiator

Excessive floods deplete network resources
Indirect Attack Example
[Figure: network diagram with source and dest, showing the RREQ and RREP flows]

RREQ forwarding
– Rushing attackers disobey delay (MAC/routing/queuing) requirements
and, with higher probability, are placed on the RREP / DATA path
– Can trigger more RREQ floods initiated by other good nodes

RREP & DATA packet loss is common in MANET
– Hard to differentiate attackers from non-attackers;
network dynamics? non-cooperative behaviors?
Outline

Related work

Community-based secure routing approach
– Strictly localized
– “Self-healing community” substitutes “single node”

Our analytic model
– Asymptotic network security model
– Stochastic model for mobile networks

Empirical simulation verification

Summary
Related Secure Routing Approaches

Cryptographic protections [TESLA in Ariadne, PKI in ARAN]
– Cannot stop non-cooperative network members;
They have required credentials / keys

Network-based protections
– Straight-forward RREQ rate limit [DSR, AODV]
• Long RREQ interval causes non-trivial routing performance
degradation
– Multi-path secure routing [Awerbuch,WiSe’02] [Haas,WiSe’03]
• Not localized, incurs global overhead, expensive
• Node-disjoint multi-path preferred, but challenging
– Rushing Attack Prevention (RAP) [Hu,WiSe’03]
• RREQ forwarding delayed and randomized to counter rushing
• Causes large route acquisition delay; less likely to find optimal
path
Our design

Goal: minimize # of allowed RREQ floods
– Ideally, 1 initial on-demand RREQ flood for each
e2e connection
– Maintain comparable routing performance

Solution:
– Build multi-node communities to counter non-cooperative packet loss
– Design applies to wide range of ad hoc routing
protocols & various ad hoc networks
Community: 2-hop scenario
Community


Area defined by intersection of 3 consecutive transmissions
Node redundancy is common in MANET
– Not unusually high, need 1 “good” node inside the community area

Community leadership is determined by contribution
– Leader steps down (being taken over)
if not doing its job (doesn’t forward within a timeout Tforw)
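
The take-over rule can be sketched as a small state machine run by each non-leader community member: it remembers packets it overheard being handed to the leader and forwards them itself if the leader's own forwarding is not overheard within Tforw. The class and method names are hypothetical, and contention among several members that time out together is not modeled here.

```python
class CommunityMember:
    """Tracks packets given to the leader and takes over after t_forw (illustrative)."""

    def __init__(self, node_id, t_forw):
        self.node_id = node_id
        self.t_forw = t_forw      # forwarding timeout Tforw, in seconds
        self.pending = {}         # packet_id -> time the hand-off was overheard
        self.is_leader = False

    def overhear_to_leader(self, packet_id, now):
        # A packet was sent to the current leader; start the timeout.
        self.pending.setdefault(packet_id, now)

    def overhear_leader_forward(self, packet_id):
        # The leader did its job; nothing to take over for this packet.
        self.pending.pop(packet_id, None)

    def tick(self, now):
        """Return the packet ids this member should now forward itself."""
        expired = [p for p, t in self.pending.items() if now - t >= self.t_forw]
        for p in expired:
            del self.pending[p]
        if expired:
            self.is_leader = True  # leadership follows contribution
        return expired
```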
Community: multi-hop scenario
Communities
source

dest
The concept of “self-healing community” is
applicable to multi-hop routing
Community Based Security (CBS)



End-to-end communication between ad hoc terminals
Community-to-community forwarding (not node-to-node)
Challenge: adversary knows CBS prior to its attack
– It would prevent the network from forming communities
– Network mobility etc. will disrupt CBS
On demand initial config

Communities formed during RREP
– Simple heuristics: promiscuously overheard 3
consecutive (ACKs of) RREP packets
→ set community membership flag for the
connection

Goal revisited: reduce the need of RREQ
floods
– In spite of non-cooperative behavior
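
One way to read the membership heuristic is sketched below: a node records the hop counts of the RREP transmissions it overhears for a connection and sets its membership flag once three consecutive hops have been heard. Using the RREP hop count as the position marker, and the names `RrepOverhearing` and `member_of`, are assumptions made for illustration.

```python
class RrepOverhearing:
    """Sets a per-connection membership flag after 3 consecutive RREP hops are overheard."""

    def __init__(self):
        self.heard = {}       # connection id -> set of overheard RREP hop counts
        self.member_of = {}   # connection id -> hop count of the community's middle node

    def overhear(self, conn, hop_count):
        hops = self.heard.setdefault(conn, set())
        hops.add(hop_count)
        # Check every window of three consecutive hops that contains hop_count.
        for mid in (hop_count - 1, hop_count, hop_count + 1):
            if {mid - 1, mid, mid + 1} <= hops:
                self.member_of[conn] = mid   # membership flag for this connection
                return True
        return conn in self.member_of
```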
On demand initial config around V
[Figure: community around V, formed upon hearing RREP; nodes U (V’s upstream), V, V1, V2, E; RREQ and RREP (E→V) shown]

(Potentially non-cooperative) V’s
community must be formed at RREP
– Otherwise V can drop the RREP and the attack succeeds
– V1 and V2 need to know V’s “upstream”
ACK-based config
[Figure, two panels: path source, B, C, D, E, dest with candidate forwarders C’ and C”
(1) Communities if C forwards a correct RREP
(2) Communities if C’ and C” are not in transmission range & C’ wins]
Proactive re-config

Each community loses shape due to network
dynamics (mobility etc.)

End-to-end proactive probing to maintain the shape
– PROBE unicast + take-over
– PROBE_REP unicast + take-over
– Just like RREP

Again: reduce the need of RREQ floods
– In spite of random mobility & non-cooperative behavior
Re-config: 2-hop scenario
[Figure: nodes S, oldF, newF, D and node D's roaming trace; the old community becomes stale due to random node mobility etc., the PROBE gets no ACK (X), and a newly re-configured community forms]
Packet formats: (PROBE, upstream, …), (PROBE_REP, hop_count, …)
Re-config: multi-hop scenario
[Figure: PROBE travels from source toward dest and PROBE_REP returns; labels: X, ACK]
Optimization
– Probing message can be piggybacked in data packets
– Probing interval Tprobe adapted to network dynamics
Simple heuristics: Slow Increase Fast Decrease
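
A minimal sketch of a "Slow Increase Fast Decrease" rule for Tprobe follows, assuming additive increase while communities stay stable and multiplicative decrease when a probe reveals a change. The `step`, `factor`, and bounds are hypothetical tuning values, not taken from the slides.

```python
def adapt_probe_interval(t_probe, community_changed,
                         step=0.5, factor=0.5, t_min=0.5, t_max=10.0):
    """Return the next probing interval (seconds) under slow-increase / fast-decrease."""
    if community_changed:
        # Fast decrease: probe more often as soon as the community shape changes.
        return max(t_min, t_probe * factor)
    # Slow increase: back off gradually while the community stays intact.
    return min(t_max, t_probe + step)
```

For instance, starting from 2.0 s, a stable round lengthens the interval to 2.5 s, while a detected take-over halves it to 1.0 s.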
Control flow & Data flow

Control flows’ job
– Config communities: RREP
– Reconfig communities: PROBE, PROBE_REP
(& data packets piggybacked with probe info)
– Unicast + take-over

DATA
– DATA packets
– Unicast + make-up (not take-over)
[community setup unchanged]
Outline

Other countermeasures

Community-based routing approach
– Strictly localized w/ clearly-defined per-hop operation
– “Self-healing community” substitutes “single node”

Our analytic model
– Asymptotic network security model
– Stochastic model for mobile networks

Empirical simulation verification

Summary
Notion: Security as a “landslide” game

Played by the guard and the adversary
– Proposal can be found as early as Shannon’s 1949 paper
– Not a 50%-50% chance game, which is too good for the
adversary

The notion has been used in modern crypto since
1970s
– Based on NP-complexity
– The guard wins the game with 1 - negligible probability
– The adversary wins the game with negligible probability
– The asymptotic notion of “negligible” applies to one-way
function (encryption, one-way hash), pseudorandom
generator, zero-knowledge proof, ……
AND this time ……
Our Asymptotic Network Security Model


Concept: the probability of security breach decreases
exponentially toward 0 when network metric increases
linearly / polynomially
Consistent with computational cryptography’s asymptotic
notion of “negligible / sub-polynomial”
Definition: A function μ: N → R is negligible if, for every
positive integer c and all sufficiently large x (i.e., there
exists N_c > 0 such that for all x > N_c),
$\mu(x) < \frac{1}{x^c}$
For example, an exponentially decaying function is negligible by definition
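
To make the definition concrete, the sketch below searches numerically for the threshold N_c of a candidate function (called `mu` in the code) for a given exponent c. It is a finite check up to `x_max`, an assumed cut-off, so it illustrates the definition and does not prove negligibility.

```python
def threshold(mu, c, x_max=1000):
    """Largest x <= x_max where mu(x) < x**-c fails (0 if it never fails)."""
    n_c = 0
    for x in range(1, x_max + 1):
        if not mu(x) < x ** -c:
            n_c = x
    return n_c


def exp_decay(x):
    return 2.0 ** -x      # exponentially decaying candidate


def poly_decay(x):
    return x ** -2        # only polynomially decaying candidate
```

With these candidates, `threshold(exp_decay, 3)` settles at a small N_c, whereas `threshold(poly_decay, 3)` stays pinned at `x_max`, since 1/x^2 never drops below 1/x^3.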
x is key length in computational crypto
x is network metric (e.g., # of nodes) in network security
The Asymptotic Cryptography Model
[Figure: probability of security breach (y-axis) vs. # of key bits (key length, x-axis); the “negligible” line (sub-polynomial line) with Insecure, ambiguous, and Secure regions; 128 bits marked]
See Lenstra’s analysis for proper key length
→ Security can be achieved by a polynomial-bounded guard against a polynomial-bounded adversary
(given adversary’s brute-force computational power)
• There are approximately $2^{268}$ atoms in the entire universe
Our Asymptotic Network Security Model
[Figure: probability of network security breach (y-axis) vs. network metric (e.g., # of nodes, network scale; x-axis); the “negligible” line (sub-polynomial line) and the “exponential” (memory-less) line, with Insecure, ambiguous, and Secure regions]
Conforming to the classic notion of security used in modern
cryptography! We’ve used the same security notion
Mobile network model

Divides the network into large number n of very
small tiles (i.e., possible “positions”)
– A node’s presence probability p at each tile is small
→ Follows a spatial binomial distribution B(n,p)
– When n is large and p is small, B(n,p) is approximately a
spatial Poisson distribution with rate ρ1
– If there are N mobile nodes roaming i.i.d.,
ρN = N·ρ1
– The probability of exactly k nodes in an area A’:
$\Pr[x = k] = \frac{\left(\iint_{A'} \rho_N \, dA\right)^k}{k!} \, e^{-\iint_{A'} \rho_N \, dA}$
[Figure: ρ1 in Random Way Point model [Bettstetter et al.], a = 1000]
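
As a worked example of the Poisson step, the sketch below evaluates the probability of exactly k nodes in an area under the simplifying assumption that the single-node rate is constant over that area, so the expected count is just N times the rate times the area. The function name and the numbers in the usage note are illustrative.

```python
import math


def prob_k_nodes(k, n_nodes, rho1, area):
    """Poisson probability of exactly k nodes in `area`, assuming a constant rate rho1."""
    lam = n_nodes * rho1 * area   # expected number of nodes in the area
    return lam ** k * math.exp(-lam) / math.factorial(k)
```

For 100 nodes roaming a 1000 x 1000 field with a uniform rate of 1e-6 per unit area, an area of 30000 holds 3 nodes on average, and the chance that it is empty is e^-3, roughly 5%.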
Community area Aheal
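[Figure: three consecutive nodes A, B, C; (left) maximal community, (right) minimal community]
(left) maximal community
– 2-hop RREP nodes are (1 + ε)·R away
– Area approaching $\left(\frac{2\pi}{3} - \frac{\sqrt{3}}{2}\right) R^2$

(right) minimal community
– 2-hop RREP nodes are (2 - ε)·R away
– Area approaching 0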

Real world scenarios randomly distribute between these two
extremes
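
The two extremes can be checked with the standard lens-area formula for two overlapping disks. This sketch assumes the community area equals the overlap of the transmission disks (radius R) of the two nodes that are 2 hops apart, with the middle node midway between them so that its own disk covers the whole overlap. That geometric simplification is an assumption of this example.

```python
import math


def lens_area(d, r):
    """Area shared by two radius-r disks whose centres are d apart (0 <= d <= 2r)."""
    return 2 * r * r * math.acos(d / (2 * r)) - (d / 2) * math.sqrt(4 * r * r - d * d)
```

Under this assumption, letting d tend to R gives (2π/3 - √3/2)·R², about 1.23·R², and letting d tend to 2R gives 0; intermediate separations fall between the two.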
Modeling adversarial presence

q : percentage of non-cooperative network
members (e.g., probability of node selfishness & intrusion)
→ 3 random variables
– x : number of nodes in the forwarding community area
– y : number of cooperative nodes
– z : number of non-cooperative nodes
Effectiveness of CBS routing

Per-hop failure prob. of community-to-community
routing is negligible with respect to network scale N

Per-hop success prob. of node-to-node ad hoc
routing schemes is negligible (under rushing attack)

Tremendous gain EG := 1 / negligible, approaching +∞
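
The first claim can be illustrated with a simplified calculation. Assume the node count in a community area is Poisson with mean N times the single-node rate times the area, that each node is non-cooperative independently with probability q, and that a hop fails only when no cooperative node is present. These modeling choices and the function names are assumptions of this sketch, not the slides' exact derivation.

```python
import math


def poisson_pmf(k, lam):
    return lam ** k * math.exp(-lam) / math.factorial(k)


def failure_prob_sum(n_nodes, rho1, a_heal, q, k_max=100):
    """P(no cooperative node) by summing over the node count in the community."""
    lam = n_nodes * rho1 * a_heal
    return sum(poisson_pmf(k, lam) * q ** k for k in range(k_max + 1))


def failure_prob_closed(n_nodes, rho1, a_heal, q):
    """Same probability in closed form: exp(-(1 - q) * lam)."""
    return math.exp(-(1.0 - q) * n_nodes * rho1 * a_heal)
```

In this simplified model the failure probability decays exponentially in N: doubling the number of nodes squares it, which matches the asymptotic notion of "negligible" used above.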
Community Based Security
[Figure: P_community and P_regular plotted against non-cooperative ratio q and network scale N]
In summary, in mobile networks haunted by
non-cooperative behavior, community-based security has tremendous gain
QualNet simulation verification

Performance metrics
– Data delivery fraction, end-to-end latency, control
overhead
– # of RREQ

x-axis parameters
– Non-cooperative ratio q
– Mobility (Random Way Point Model, speed min=max)

Protocol comparison
– AODV: standard AODV
– RAP-AODV: Rushing Attack Prevention (WiSe’03)
– CBS-AODV: Community Based Security
Performance Gap


CBS-AODV’s performance only drops slightly with more
non-cooperative behavior
Tremendous EG justifies the big gap between CBS-AODV
and others
Mobility’s impact
Less RREQ


In CBS-AODV, # of RREQ triggered is less sensitive to non-cooperative ratio q
Enforcing RREQ rate limit is more practical in CBS-AODV
Summary

Conventional node-to-node routing is vulnerable to
routing disruptions
– Excessive but protocol-compliant RREQ floods
– Rushing attack + RREP / DATA packet loss
→ The new community-to-community secure routing is
our answer
– Analytic study supports the community design
– Empirical simulation study confirms the analytic results
– General design

Open challenges
– Better estimation of forwarding window Tforw & probing
interval Tprobe
– Secure and efficient key management between two communities
Backup slides follow
ρ1

Inspired by Bettstetter et al.’s work
– For any mobility model (random walk, random way point),
Bettstetter et al. have shown that
ρ1 is computable
– For example, in the random way point model
in a square network area of size a×a defined by -a/2 ≤ x ≤ a/2
and -a/2 ≤ y ≤ a/2
– ρ1 is “location dependent”, yet computable in NS2 &
QualNet given any area A’ (using finite element method)
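
To show how a location-dependent rate can be integrated over an area, the sketch below assumes the commonly cited polynomial approximation of the random way point spatial density on a square of side a, f(x, y) ≈ (36/a^6)(x² - a²/4)(y² - a²/4). The slide's own formula did not survive extraction, so this density is an assumption of the example. Because the approximation is a product of polynomials, its integral over an axis-aligned rectangle has a closed form.

```python
def rwp_mass(a, x0, x1, y0, y1):
    """Probability that one node lies in [x0, x1] x [y0, y1] under the assumed density."""

    def antiderivative(u):
        # Integral of (u**2 - a**2 / 4) with respect to u.
        return u ** 3 / 3 - a * a * u / 4

    gx = antiderivative(x1) - antiderivative(x0)
    gy = antiderivative(y1) - antiderivative(y0)
    return 36 / a ** 6 * gx * gy
```

With a = 1000, a 100 x 100 patch at the centre carries more than twice the mass a uniform model would assign (0.01), while a same-sized patch in a corner carries far less, which is the "location dependent" behavior noted above.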
Delivery fraction & Control overhead

CBS-AODV’s performance only drops slightly with
more non-cooperative behavior
→ Tremendous EG justifies the big gap (of delivery
fraction & total control overhead) between CBS-AODV
and others
Latency

Route acquisition latency
monotonically increases with q

AODV’s avg. data packet
latency drops due to short
routes
Mobility’s impact

CBS variants have better delivery fraction
– The cost of CBS-AODV,cons_flood is too high
RREQ limit control


In CBS-AODV, # of RREQ triggered is less sensitive to non-cooperative ratio q
Enforcing RREQ rate limit is more practical in CBS-AODV
Protocol Details

Packet format
– (RREQ, upstream_node, ……)
– (RREP, hop_count, ……)
– In DSR or AODV, some of the extra fields can be spared
Protocol Details

Unicast control packets & their ACKs
Protocol Details

Unicast control flows config/re-config communities
– RREP, PROBE, PROBE_REP packets & data packets piggybacked with probe info
– Unicast + take-over

Data flows
– DATA packets
– Unicast + make-up (not take-over)