Coexistence with Malicious Nodes: A Game Theoretic Approach

Wenjing Wang†, Mainak Chatterjee† and Kevin Kwiat‡
† Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816
‡ Air Force Research Laboratory, Information Directorate, Rome, NY 13441
Emails: {wenjing, mainak}@eecs.ucf.edu, [email protected]. This research was sponsored by the Air Force Office of Scientific Research (AFOSR) under the federal grant no. FA9550-07-1-0023 and an AT&T Graduate Fellowship in Modeling and Simulation. Approved for Public Release; distribution unlimited; 88ABW-2008-1164, 02DEC08.
978-1-4244-4177-8/09/$25.00 ©2009 IEEE
Authorized licensed use limited to: ROME AFB. Downloaded on July 21, 2009 at 09:43 from IEEE Xplore. Restrictions apply.

Abstract— In this paper, we use game theory to study the interactions between a malicious node and a regular node in wireless networks with unreliable channels. Since malicious nodes do not reveal their identities to others, it is crucial for the regular nodes to detect them through monitoring and observation. We model the malicious node detection process as a Bayesian game with imperfect information and show that a mixed strategy perfect Bayesian Nash Equilibrium (also a sequential equilibrium) is attainable. While the equilibrium in the detection game ensures the identification of the malicious nodes, we argue that it might not be profitable to isolate the malicious nodes upon detection. As a matter of fact, malicious nodes and regular nodes can coexist as long as the destruction they bring is less than the contribution they make. To show how we can utilize the malicious nodes, a post-detection game between the malicious and regular nodes is formalized. Solution of this game shows the existence of a subgame perfect Nash Equilibrium and the conditions that achieve the equilibrium. Simulation results and their discussions are also provided to illustrate the properties of the derived equilibria.

I. INTRODUCTION

Game theory [5], [17] has been successfully applied to solve various problems in wireless networks, including cooperation enforcement [3], [4], [7], [15], [18], routing protocols [6], [16], [20], [21] and other system design issues [11], [14]. The underlying assumption in most of the existing approaches regards the entities in the network (also called nodes) as selfish and rational. The selfish nodes, governed by their utility functions, only care about their own payoffs and choose corresponding strategies to maximize them. Usually, the payoffs are the benefits a node can derive from other nodes or the network. However, it is possible that there are some nodes whose objective is to cause harm and disorder to the network. These nodes, referred to as malicious nodes, do not reveal their identities while disrupting network services. The objective of the malicious nodes is to maximize the damage before they are detected and isolated. They are also rational, and their payoff is determined by the amount of damage they cause to the network.

In order to minimize the impact of the malicious nodes, a detection mechanism needs to be in place. Thus, a regular node should monitor its surroundings and distinguish a malicious node from a regular one. However, the detection process has challenges. First, monitoring can be costly. To identify the malice, a regular node has to listen to the channel and/or process the information sent by the nodes being monitored. The listening and processing consume the resources of a regular node. Hence, an "always on" monitoring scheme is not efficient even if plausible. Second, the malicious node can disguise itself. To reduce the probability of being detected, a malicious node can behave like a regular node and choose longer intervals to attack the network. Third, the randomness and unreliability of the wireless channel bring more uncertainty to the monitoring and detection process.

Nevertheless, identifying the malicious nodes is not the end of the story. Although in most cases a malicious node is isolated as soon as it is detected, there might be situations where malicious nodes can be kept and made use of. The most straightforward reason for the coexistence is that a malicious node has no idea whether it has been identified or not, and it will continue to operate like a regular node to avoid detection. During this time, i.e., when the malicious node cooperates in disguise, it can be exploited for normal network operations. This "involuntary" help from the malicious node may be valuable, especially when the network resource is limited. As a matter of fact, from the perspective of the malicious nodes, coexistence gives them a longer lifetime in the network and the opportunity to launch future attacks. On the other hand, the regular nodes have a criterion to evaluate the benefit derived from the malicious nodes. The criterion also determines when to terminate the coexistence and isolate the malicious nodes.

Recently, much work has been done that investigates the interactions between the regular and malicious nodes using game theory. Kodialam et al. formally propose a game theoretic framework to model how a service provider detects an intruder [8]. However, their assumptions of a zero-sum game and complete, perfect knowledge have limitations. Agah et al. study the non-zero-sum intrusion detection game in [1]; their results infer the optimal strategies in a one-stage static game with complete information. In [13], Liu et al. propose a Bayesian hybrid detection approach to detect intrusion in wireless ad hoc networks. They design an energy efficient detection procedure while improving the overall detection power. [12] models the intention and strategies of a malicious attacker through an incentive-based approach. The importance of the topology on the payoffs of the malicious nodes is investigated in [19]. An interesting flee option for the malicious node is proposed in [10]. In that analysis, a malicious node decides to flee when it believes it is too risky to stay in the network. While the approach focuses on how the flee action affects the result of the game, it does not consider the noise in observation.

Our focus in this research is to use game theory to model and analyze the interactions between a malicious node and a regular node. In particular, we formalize the interactions into two cascaded games. The first game, namely the malicious node detection game, is a Bayesian game with imperfect information. The information is hidden because the malicious node can disguise itself as a regular node, and the actions are hidden due to the noise and imperfect observation. The second game, called the post-detection game, is played when the regular node knows confidently that its opponent is a malicious node. In the latter game, the regular node observes and evaluates the actions of the malicious node, and decides whether to keep it or isolate it. For both games, we show the existence of equilibria and derive the conditions that achieve them. We also provide a simulation study to support the efficiency of the equilibria. The main contributions in this paper can be categorized into two parts.
• We model the malicious node detection game under unreliable channels as a Bayesian game with imperfect monitoring and show that a mixed strategy perfect Bayesian Nash Equilibrium is attainable. The strategy profile is also shown to give a sequential equilibrium solution. Results show how the equilibrium strategy profiles are affected by parameters like channel noise, successful attack rate, successful detection rate, attack gain, detection gain, false alarm rate, etc.
• We propose the notion of coexistence after detection in order to utilize the malicious node. A coexistence index is designed to evaluate the helpfulness of a malicious node. We derive the conditions under which a subgame perfect Nash Equilibrium is achieved. Through simulation, we also show how the malicious node can be used to improve the network throughput and extend the network lifetime.

The rest of the paper is organized as follows. In Section II, we introduce and solve the Bayesian game of malicious node detection. Section III presents the post-detection game and discusses how malicious and regular nodes can coexist after detection. Simulation results are presented in Section IV that illustrate our findings. The last section concludes the paper.

II. DETECTING MALICIOUS NODES UNDER UNRELIABLE CHANNELS

A. Network Model

We consider a wireless network consisting of a fixed number of nodes. The type of a node can be either Regular or Malicious. For a regular node, its actions are rational and are governed by an underlying utility function. A rational action may not be cooperative if such cooperation is not profitable. A regular node is selfish (i.e., acts towards its own interest); however, it never brings malice to the network. On the other hand, a malicious node aims to hamper, disturb, and even attack the network. Although the actions of a malicious node are also determined by certain utility functions, such functions are designed to bring damage to the network.

Though there are two types of nodes, the identity (type) of a malicious node is not directly revealed to others. Instead, the types can only be estimated or conjectured through observing actions. To identify the attacks and malicious nodes in the network, a regular node can monitor the actions of others. However, such monitoring is costly (e.g., it consumes the receiver's own resources) and a node cannot afford to monitor all the time. Moreover, the observations might not be accurate because of noise, e.g., wireless channel loss. Thus, the regular nodes do not monitor the network all the time, and during those times attacks cannot be identified.

To simplify the analysis, our research focuses on the packet forwarding process. We assume that node i, or the sender node, has a packet to send to node j, or the receiver node. If the sender node is regular, it only takes the action "Forward". If the sender node is malicious, it can choose to "Attack" with a risk of being identified, or "Forward" (not attack) to disguise. We further assume that time is divided into slots and nodes take their actions within each slot.

B. Game Model

To abstract the interactions among the nodes, we consider a two-player game played by the sender node i and the receiver node j. The types of these nodes, θi and θj, are private information. Since the type of each player is hidden, and the observation is not accurate, it is a Bayesian game with imperfect information [17].

To model the process of detecting the malicious nodes in the network, we apply a special category of Bayesian game called the signaling game. A signaling game is played between a sender and a receiver. The sender has a certain type and a set M of available messages to be sent. Based on the knowledge of its own type, the sender chooses a message from M and sends it to the receiver. However, the receiver does not know the type of the sender and can only observe the message but not the type. Through observation, the receiver then takes an action in response to the message it observed. In the malicious node detection game, the sender, node i, can be either regular (θi = 0) or malicious (θi = 1). The receiver, node j, is always a regular node, i.e., θj = 0.

The action profiles ai available to node i are based on its type. For θi = 0, ai = {Forward}. For θi = 1, ai = {Attack, Forward}. The receiver node j has the option to monitor whether node i is attacking or not, thus aj = {Monitor, Idle}.

To further construct the game, we define the following values. Let gA be the payoff of a malicious node if it successfully attacks. The cost associated with such an attack is cA. For the receiver node j, the cost of monitoring is cM and 0 if it is idle. Hence, for the action profile (ai, aj) = (Attack, Idle), the net utility for a successfully attacking node i is gA − cA, and the loss for node j is −gA due to the attack. Similarly, if the action profile is (ai, aj) = (Attack, Monitor), the attacking malicious node i loses gA + cA, and the net gain for node j is gA − cM. However, if a malicious node chooses not to attack, the cost to forward a packet is cF, which is the same cost to a regular sender node.
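The payoff structure just described can be captured in a few lines; a minimal sketch, with gA, cA, cM, cF set to illustrative values since the model leaves them as free parameters:

```python
# Stage-game payoffs (sender node i, receiver node j) from the model:
# gA: attack gain/loss, cA: attack cost, cM: monitoring cost, cF: forwarding cost.
# The numeric values are illustrative only.
gA, cA, cM, cF = 1.0, 0.2, 0.3, 0.1

def payoff(theta_i, a_i, a_j):
    """Return (u_i, u_j) for sender type theta_i (1 = malicious, 0 = regular)."""
    if theta_i == 1 and a_i == "Attack":
        if a_j == "Monitor":          # attack is caught by the monitoring receiver
            return (-gA - cA, gA - cM)
        return (gA - cA, -gA)         # attack succeeds unobserved
    # Forward: the only action of a regular sender, or a disguising malicious one
    return (-cF, -cM if a_j == "Monitor" else 0.0)

print(payoff(1, "Attack", "Idle"))    # malicious sender attacks, receiver idle
```

For instance, payoff(1, "Attack", "Monitor") returns (−gA − cA, gA − cM), matching the (Attack, Monitor) entry of Table I(a).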
Based on the types of node i and node j, the payoff matrices are presented in Table I.

In addition, in our model, we introduce pe as the channel loss rate. The channel unreliability implies that monitoring can be accurate with probability 1 − pe. We also denote γ as the attack success rate.

TABLE I
PAYOFF MATRIX OF THE TWO-PLAYER MALICIOUS NODE DETECTION GAME.

(a) θi = 1, malicious sender (entries are (node i, node j) payoffs):

                | Monitor               | Idle
    Attack      | (−gA − cA, gA − cM)   | (gA − cA, −gA)
    Forward     | (−cF, −cM)            | (−cF, 0)

(b) θi = 0, regular sender:

                | Monitor               | Idle
    Forward     | (−cF, −cM)            | (−cF, 0)

C. Equilibrium analysis for the stage game

We begin our analysis of the malicious node detection game from the extensive form of the static Bayesian game, as illustrated in Figure 1. We consider the type determination of node i, where θi = 1 happens with probability φ. To solve this game, we are interested in finding the possible Bayesian Nash Equilibria (BNE). In a static Bayesian game, a BNE is a Nash Equilibrium given the beliefs of both nodes. In our case, node i knows for sure that for node j, θj = 0; however, node j's belief about node i is that θi = 1 with probability φ.

Fig. 1. Stage malicious node detection game tree.

First, let us consider pure strategies only. Based on θi, the pure strategies available for node i are σi = {(Attack if θi = 1, Forward if θi = 0), Forward ∀ θi}. For node j, the strategy set is σj = {Monitor, Idle}. To find the BNE, we let σi and σj play against each other and derive the conditions under which neither node can increase its utility by unilaterally changing its strategy.

LEMMA 1: In the malicious node detection game, there is a malice belief threshold φ0 such that no pure strategy BNE exists if φ > φ0.

Proof: We start by eliminating a trivial pure strategy pair (Forward ∀ θi, Monitor). From Table I(a), we know that both nodes can improve their payoffs by deviating from this strategy pair. We further analyze the following two cases.

Case 1: σi = (Attack if θi = 1, Forward if θi = 0). For node j, if σj = Monitor, the expected payoff is

    uj(Monitor) = (gA − cM)φ(1 − pe) − φpe[γ(gA + cM) + (1 − γ)cM] − (1 − φ)cM    (1)

where the terms represent monitoring the attack successfully, failing to monitor the attack, and node i being regular, respectively. If σj = Idle, the expected payoff is

    uj(Idle) = −φγgA    (2)

If (2) > (1), the dominant strategy for node j is Idle. Correspondingly, for node i, the best response would be (Attack if θi = 1, Forward if θi = 0). Thus (σi, σj) = {(Attack if θi = 1, Forward if θi = 0), Idle} is a BNE under the condition that φ < cM/[(1 − pe)(1 + γ)gA]. If (2) < (1), i.e., φ > cM/[(1 − pe)(1 + γ)gA], the dominant strategy for node j is Monitor; however, the best response to Monitor for node i is Forward ∀ θi. Hence (σi, σj) = {(Attack if θi = 1, Forward if θi = 0), Monitor} is not a BNE under the condition that φ > cM/[(1 − pe)(1 + γ)gA].

Case 2: (σi, σj) = {Forward ∀ θi, Idle}. If node j chooses not to monitor, the best response for node i is to Attack if θi = 1. This leads to the previous case when φ < cM/[(1 − pe)(1 + γ)gA]. Therefore, there is no BNE if (σi, σj) = {Forward ∀ θi, Idle}.

To sum up, a pure strategy BNE exists if and only if φ < cM/[(1 − pe)(1 + γ)gA]. The equilibrium strategy profile is (σi, σj) = {(Attack if θi = 1, Forward if θi = 0), Idle}. In other words, we can find φ0 = cM/[(1 − pe)(1 + γ)gA] such that no pure strategy BNE exists if φ > φ0.

Although a pure strategy BNE exists, it is not practical because the equilibrium requires node j to be Idle at all times, and hence the malicious nodes cannot be detected. It is also called a Pooling Equilibrium [17], in which the receiver has no clue about the sender's type. Therefore, it is desirable to seek a mixed-strategy BNE, and such a BNE exists when φ > φ0.

Let us denote p as the probability with which node i of type θi = 1 plays Attack and q as the probability with which node j plays Monitor. To find the mixed strategy BNE of this game, we need to find the values of p and q such that neither node i nor node j can increase its payoff by altering its actions. For the mixed strategy played by node i, the payoff of node j playing Monitor is

    uj(Monitor) = φp[γ(gA − cM)(1 − pe) + (1 − γ)(1 − pe)(gA − cM) − (1 − γ)pecM − γpe(gA + cM)] − (1 − p)φcM − (1 − φ)cM
                = φp[gA − gApe(1 + γ)] − cM.    (3)

If node j plays Idle,

    uj(Idle) = −pγφgA.    (4)

Thus, in the mixed strategy BNE, uj(Monitor) = uj(Idle), which gives p = cM/[φgA(1 + γ)(1 − pe)]. Similarly, when node j mixes between Monitor and Idle, q is obtained by equating node i's payoffs for Attack and Forward.
We first show the existence of a mixed strategy equilibrium and then argue the infeasibility of a pure strategy equilibrium. Consider an arbitrary stage k of the game; we denote p(k) as the probability that node i of type θi = 1 plays Attack, and q(k) as the probability that node j plays Monitor. In the equilibrium, ui(k)(Attack) = ui(k)(Forward) and uj(k)(Monitor) = uj(k)(Idle). In particular,

    ui(k)(ai(k) = Attack | aj(k) = Monitor) = −(gA + cA)(1 − pe)q(k) + (gA − cA)γ(1 − q(k)) + (gA − cA)γq(k)pe − cA(1 − γ)peq(k) − cA(1 − q(k))(1 − γ)    (13)

    ui(k)(ai(k) = Forward | aj(k) = Monitor) = −cF.    (14)

    uj(k)(aj(k) = Monitor | ai(k) = Attack) = µj(k)(θi = 1)p(k)[γ(gA − cM)(1 − pe) + (1 − γ)(1 − pe)(gA − cM) − (1 − γ)pecM − γpe(gA + cM)] − (1 − p(k))µj(k)(θi = 1)cM − µj(k)(θi = 0)cM    (15)

    uj(k)(aj(k) = Idle | ai(k) = Attack) = −p(k)gAγµj(k)(θi = 1).    (16)

The solutions to the above equations are

    p(k) = cM / [µj(k)(θi = 1)gA(1 + γ)(1 − pe)]    (17)

    q(k) = (gAγ − cA + cF) / [gA(1 − pe)(1 + γ)].    (18)

What p(k) and q(k) suggest is an equilibrium profile (σi(k), σj(k)). This profile shows sequential rationality [5], [17]; that is, each node's strategy is optimal whenever it has to move, given its belief and the other node's strategy. In other words, for any alternative strategies σi′(k) and σj′(k),

    ui((σi(k), σj(k)) | θi, ai(k)(t), µj(k)(θi)) ≥ ui((σi′(k), σj(k)) | θi, ai(k)(t), µj(k)(θi))    (19)

    uj((σi(k), σj(k)) | θi, âi(k)(t), µj(k)(θi)) ≥ uj((σi(k), σj′(k)) | θi, âi(k)(t), µj(k)(θi))    (20)

Besides sequential rationality, a PBE also demands that the belief system satisfies the Bayesian conditions [5].

DEFINITION 1: ([5], pp. 331-332) The Bayesian conditions defined for PBE are
B(i): Posterior beliefs are independent. For history h(t), µi(θ−i | θi, h(t)) = ∏j≠i µi(θj | h(t)).
B(ii): Bayes' rule is used to update beliefs whenever possible.
B(iii): Nodes do not signal what they do not know.
B(iv): Posterior beliefs are consistent for all nodes with a common joint distribution on θ given h(t).

Our proposed belief system satisfies the Bayesian conditions. B(i) is satisfied because θj = 0 all the time. Eqn. (7) is derived from Bayes' rule, and hence B(ii) is also satisfied. B(iii) is fulfilled because node i's signal is determined by its action, and if ai(k) = âi(k), then µj(k)(θi | ai(k), hj) = µj(k)(θi | âi(k), hj). B(iv) is trivial in our game because no third player exists.

The analysis on the Bayesian conditions and sequential rationality serves as the proof of the following theorem.

THEOREM 1: The dynamic malicious node detection game has a perfect Bayesian equilibrium that can be attained with the strategy profile (σi(k), σj(k)) = (p(k), q(k)).

Remark 1: The infeasibility of a pure strategy PBE is proved as follows. If node i attacks, the best response for node j is Monitor, which makes it non-profitable for node i to play Attack. If node i plays Forward, p(k) = 0, and the best response for node j is Idle (i.e., q(k) = 0). However, sequential rationality requires q(k) ≥ (gAγ − cA + cF)/[gA(1 − pe)(1 + γ)], which leads to a contradiction. Therefore, no pure strategy PBE exists in the dynamic malicious node detection game. It is noted that the infeasibility of the pure strategy PBE in the dynamic setting should not be confused with the existence of a pure strategy BNE in a static game, because the pure strategy BNE in a static game is always an artifact.

Remark 2: The proved PBE can be further refined to a Sequential Equilibrium [9]. In the sequential equilibrium, the Bayesian conditions are extended to belief sensibility and consistency. Belief sensibility requires that the information sets can be reached with positive probabilities (µ) given the strategy profile σ. Consistency demands that an assessment (σ, µ) should be a limit point of a sequence of mixed strategies and associated sensible beliefs, i.e., (σ, µ) = limn→∞(σn, µn). In our game, belief sensibility is satisfied because our proposed belief system updates the beliefs according to Bayes' rule and assigns a positive probability to each of the information sets. Theorem 8.2 in [5] states that in incomplete information multi-stage games, if neither player has more than two types, the Bayesian conditions are equivalent to the belief consistency requirement. In our game, θi ∈ {0, 1} and θj = 0, and hence consistency is fulfilled. Together with sequential rationality, the PBE in our game is also a sequential equilibrium. Since every finite extensive-form game has at least one sequential equilibrium, which is a refinement of PBE, this also implies the existence of a PBE in our game.

III. POST-DETECTION GAME AND COEXISTENCE

In the previous section, we discussed how to update node j's belief system based on Bayes' rule. It is natural that through observation, although imperfect at every stage game, node j can accumulate a better estimation of θi. Eventually, after repeated monitoring, there will be a stage at which node j can predict with confidence whether node i is regular or malicious.

A. The post-detection game

Traditionally speaking, after node j has identified node i as a malicious node, it will try to report and isolate node i immediately to prevent future attacks. However, there are also situations where "isolation" may not be a good choice.
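How node j comes to "predict with confidence" can be sketched with a simple Bayes-rule update of its belief µ = Pr(θi = 1). The observation likelihoods below are assumptions made for illustration (including a false-alarm probability eps); this is a sketch, not the paper's exact update rule:

```python
# A minimal sketch of Bayesian belief accumulation about the sender's type.
# Assumed likelihoods (illustrative, not the paper's Eqn. (7)): a malicious
# node attacks w.p. p and the attack is seen w.p. (1 - pe); a regular node
# triggers a false "attack" observation w.p. eps.
p, pe, eps = 0.4, 0.1, 0.05

def update(mu, observed_attack):
    """One Bayes-rule update of mu = Pr(theta_i = 1) after a monitored slot."""
    like_mal = p * (1 - pe) if observed_attack else 1 - p * (1 - pe)
    like_reg = eps if observed_attack else 1 - eps
    return mu * like_mal / (mu * like_mal + (1 - mu) * like_reg)

mu = 0.5                          # prior belief about node i
for obs in [True, False, True]:   # a run of noisy observations
    mu = update(mu, obs)
print(round(mu, 3))
```

Each observed attack pushes the belief sharply towards θi = 1, while a quiet slot pulls it back only slightly, which is why repeated monitoring eventually yields a confident prediction.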
Let us consider a wireless network which operates on a limited resource budget. In order to prolong the lifetime of the network, every regular node has to be economical about packet forwarding. Hence, if a malicious node can be used to handle some of the traffic, it is beneficial not to isolate it.

However, there is a trade-off between how much benefit a malicious node can bring and what damage it can do. We denote nF and nA as the number of successful forwarding actions and the number of attacks taken by a malicious node. Recall that the cost of forwarding is cF and the loss due to an attack on the network is gA. Thus, for a regular node, if it observes that the total saving due to the forwarding a malicious node contributes (nFcF) is greater than the total cost due to its attacks (nAgA), then keeping that node in the network is profitable.

To further analyze the conditions under which a malicious node can be kept and coexist with the regular ones, we formally define the post-detection game. The game has two players: node i and node j; both nodes know the type of their opponent, i.e., node j knows that node i is malicious but has not taken any action to isolate it. Thus, θi = 1, θj = 0. The actions available to node i are ai = {Attack, Forward}, while the actions for node j are aj = {Monitor, Idle}. When node j monitors, it keeps a record of what node i has done since the beginning of the game. It also calculates a coexistence index Ci = n̂FcF − n̂AgA for node i, where n̂F is the observed number of forwarding actions and n̂A is the observed number of attacks. If Ci falls under a certain threshold τ, node j will isolate node i and terminate the post-detection game because keeping node i is no longer beneficial. If Ci ≥ τ, the game will be played in a repeated manner. The payoff matrix for the post-detection game is the same as that of the detection game for θi = 1, as was shown in Table I(a).

B. Searching for a coexistence equilibrium

Let us explore the strategies that both nodes can take to reach the equilibrium of coexistence. To avoid confusion, we denote p∗ and q∗ as the probabilities that node i plays Attack and node j plays Monitor, respectively. It is noted that these probabilities are different from the ones we obtained in Section II-D.

We first derive the Nash Equilibrium using indifference conditions. Suppose the post-detection game is played at the tth repetition, i.e., subgame t. The expected payoff for player j playing Monitor is

    uj(t)(Monitor) = {p∗[γ(gA − cM)(1 − pe) + (1 − γ)(1 − pe)(gA − cM) − (1 − γ)pecM − γpe(gA + cM)]} Pr(Ci ≥ τ) + (gA − cM)p∗(1 − pe) Pr(Ci < τ) − (1 − p∗)cM
                   = [gA(1 − pe + γpe) − cM]p∗ Pr(Ci ≥ τ) + (gA − cM)p∗(1 − pe) Pr(Ci < τ) − (1 − p∗)cM.    (21)

If node j plays Idle, the expected payoff is always

    uj(t)(Idle) = −p∗γgA.    (22)

Thus, the indifference condition requires uj(t)(Monitor) = uj(t)(Idle), and hence p∗ is obtained as in (23) on the next page. Similarly, we can apply the indifference condition to node i as:

    ui(t)(Attack) = q∗{−(gA + cA)(1 − pe) Pr(Ci < τ) + (1 − pe)[(gA − cA)γ − cA(1 − γ)] Pr(Ci ≥ τ) + (gA − cA)γpe − cA(1 − γ)pe} − (1 − q∗)[cA(1 − γ) − (gA − cA)γ]
                  = q∗{−(gA + cA)(1 − pe) Pr(Ci < τ) + (gAγ − cA)[(1 − pe) Pr(Ci ≥ τ) + pe]} + (1 − q∗)(gAγ − cA).    (24)

    ui(t)(Forward) = −cF.    (25)

Therefore, q∗ can be expressed as in (26) on the next page.

The problem is then reduced to obtaining the probability distribution of Ci. Let us assume that at the beginning of the post-detection game Ci = c0 ≥ τ. For the sake of discussion, we also assume that node j is constantly monitoring. Hence, if we consider l subgames, Ci is updated in each of the subgames.

We denote a random variable y = Ci = c0 + n̂FcF − n̂AgA. Since the mixed strategy profile requires node i to choose Attack with probability p∗, n̂F and n̂A are binomially distributed as (C(n, k) denotes the binomial coefficient):

    Pr(n̂F = N̂F) = C(l, N̂F)[(1 − p∗)(1 − pe)]^N̂F [1 − (1 − p∗)(1 − pe)]^(l − N̂F)    (27)

    Pr(n̂A = N̂A) = C(l, N̂A)[p∗(1 − pe)]^N̂A [1 − p∗(1 − pe)]^(l − N̂A)    (28)

Since y = c0 + n̂FcF − n̂AgA = c0 + n̂FcF − (l − n̂F)gA = (cF + gA)n̂F − lgA + c0 and l, cF, gA, c0 are constants, to get the distribution of y, we first get the distribution of w = y + lgA − c0.

We use the probability generating function (pgf). For a discrete random variable X, its pgf is defined as

    GX(z) = E[z^X] = Σx=0..∞ z^x Pr(X = x).    (29)

The pgf for w is

    GW(z) = E[z^W] = E[z^((cF + gA)n̂F)]
          = Σn̂F=0..l z^((cF + gA)n̂F) C(l, n̂F)[(1 − p∗)(1 − pe)]^n̂F [1 − (1 − p∗)(1 − pe)]^(l − n̂F)
          = {[1 − (1 − p∗)(1 − pe)] + (1 − p∗)(1 − pe)z^(cF + gA)}^l.    (30)

Let f(n)(x) = ∂^n f(x)/∂x^n. Then

    Pr(w = k) = GW(k)(0)/k!.    (31)
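Equivalently to extracting pgf coefficients as in (31), the distribution of Ci can be computed by summing the binomial pmf in (27) directly; a sketch with illustrative parameters:

```python
from math import comb

# Distribution of the coexistence index Ci = c0 + nF*cF - nA*gA after l
# monitored subgames, following (27) with nA = l - nF as in the text.
# All parameter values are illustrative.
l, c0, cF, gA, tau = 20, 5.0, 0.1, 1.0, 0.0
p_star, pe = 0.2, 0.1
pf = (1 - p_star) * (1 - pe)       # per-slot prob. of an observed forward

def pr_nF(k):
    """Binomial pmf of the observed number of forwards, Eqn. (27)."""
    return comb(l, k) * pf**k * (1 - pf)**(l - k)

# Pr(Ci >= tau): sum the pmf over the qualifying nF values, mirroring the
# pgf-coefficient extraction in (31)-(32).
pr_ge_tau = sum(pr_nF(k) for k in range(l + 1)
                if c0 + k * cF - (l - k) * gA >= tau)
print(round(pr_ge_tau, 4))
```

This probability is exactly the Pr(Ci ≥ τ) term that enters the equilibrium expressions (23) and (26).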
    p∗ = cM / {[gA(1 − pe + γpe) − cM] Pr(Ci ≥ τ) + (gA − cM)(1 − pe) Pr(Ci < τ) + cM + γgA}    (23)

    q∗ = (cA − gAγ − cF) / {−(gA + cA)(1 − pe) Pr(Ci < τ) + (gAγ − cA)(1 − pe)[Pr(Ci ≥ τ) − 1]}    (26)

The probability terms in (23) and (26) are given by

    Pr(Ci ≥ τ) = Pr(w ≥ lgA + τ − c0) = Σn≥lgA+τ−c0 GW(n)(0)/n!    (32)

    Pr(Ci < τ) = 1 − Σn≥lgA+τ−c0 GW(n)(0)/n!    (33)

To relax the assumption of node j's constant monitoring, the current stage t for the analysis is t = ⌈l/q∗⌉. Therefore, we have obtained the equilibrium strategy parameters p∗ and q∗ for every subgame.

So far, we have shown that for the mixed strategy profile, attaining a Nash Equilibrium is feasible. As a matter of fact, every game has a mixed strategy Nash Equilibrium. To further refine the equilibrium, we apply the One-Shot Deviation Property to derive the condition for a subgame perfect Nash Equilibrium. The property states:

DEFINITION 2: One-Shot Deviation Property (OSDP) [17]: No player can increase her payoff by changing her action at the start of any subgame in which she is the first-mover, given the other player's strategies and the rest of her own strategy.

We take node j as an example and assume the repeated game has no discount. In our previous equilibrium analysis using the indifference condition, we proved that deviation from p∗ or q∗ will not increase the payoffs. Hence, in the following derivation, we show that the deviation strategy is related to Ci.

From (21) and (22), we can express the expected payoff for node j as:

    Uj = Σt=0..T Uj(t) = Σt=0..T {q∗{[gA(1 − pe + γpe) − cM]p∗ Pr(Ci ≥ τ) + (gA − cM)p∗(1 − pe) Pr(Ci < τ) − (1 − p∗)cM} − (1 − q∗)p∗γgA}.    (34)

Suppose node j deviates at the rth stage, r ≤ T. The deviation can be either of the following two cases.

Case 1: Isolate node i while Ci ≥ τ. In this case, if node i attacks and is successfully observed, it will be isolated. The expected payoff at this stage for node j is

    Uj,dev,1(r) = {q∗{p∗(1 − pe)(gA − cM) − p∗γpe(gA + cM) − [p∗(1 − γ)pe + (1 − p∗)]cM} − (1 − q∗)p∗γgA} Pr(Ci ≥ τ)    (35)

Case 2: Keep node i while Ci < τ. Since node j only deviates for one stage, node i will be isolated in the next stage. The expected payoff for node j at this stage is the same as above except for the last probability term:

    Uj,dev,2(r) = {q∗{p∗(1 − pe)(gA − cM) − p∗γpe(gA + cM) − [p∗(1 − γ)pe + (1 − p∗)]cM} − (1 − q∗)p∗γgA} Pr(Ci < τ) + q∗[γgA Pr(Ci < τ) + pecM Pr(Ci ≥ τ)]    (36)

In this way, the total expected payoff for node j under deviation is

    Uj,dev = Σt=0..r−1 Uj(t) + Uj,dev,1(r) + Uj,dev,2(r) + Σt=r+1..T Uj(t)    (37)

The OSDP requires Uj,dev ≤ Uj. After algebraic manipulation, we have

    gAγ(q∗pe + 1) + q∗pe(γcM + 1 − γ) ≥ (1 − q∗)γgA    (38)

or

    gAγ[pe + 1 − Pr(Ci < τ)] ≥ pe[cM Pr(Ci ≥ τ) + γ − 1 − γcM].    (39)

To sum up, for the equilibrium of the post-detection game, we state the following theorem.

THEOREM 2: The post-detection game has a mixed strategy Nash Equilibrium when node i attacks with probability p∗ and node j monitors with probability q∗. This strategy is also subgame perfect if gAγ[pe + 1 − Pr(Ci < τ)] ≥ pe[cM Pr(Ci ≥ τ) + γ − 1 − γcM].

C. Convergence of the coexistence equilibrium

The post-detection game described above ends when Ci < τ. Since Pr(Ci < τ) > 0, the game is of finite length. In this subsection, we derive the expected length (number of stages) of the game.

We focus on the random variable Ci. As mentioned earlier, Ci = c0 + n̂FcF − n̂AgA. Again, we assume node j is constantly monitoring. After one stage game, the probability that n̂F increases by one is (1 − p∗)(1 − pe), and the probability that n̂A increases by one is p∗(1 − pe). Thus, we model the evolution of Ci as a random process similar to a one-dimensional random walk, where the value of Ci increases by cF with probability (1 − p∗)(1 − pe) and decreases by gA with probability p∗(1 − pe). The 1 − pe term comes from the unreliability of the channel. Obtaining the expected length of the post-detection game is then equivalent to calculating the expected first hitting time of the random process with the absorbing boundary Ci = τ.

THEOREM 3: Let m = gA/cF and s = (c0 − τ)/cF. The expected length of the post-detection game is

    E[length] = Ση>0 η [C(η, n̂F) − Σd=0..n̂F−1 C((s + d)/m + d, d) C(η − (s + d)/m − d, n̂F − d)] / 2^η

where C(n, k) denotes the binomial coefficient.
Proof: Let η be a random variable representing the first hitting time. We assume that time is divided into slots and each slot represents a stage game. It is easy to see that η = n̂F + n̂A. At every slot, the random process has two possible evolution directions, i.e., n̂F + 1 or n̂A + 1; therefore, over η slots there are 2^η possible realizations.
We now calculate how many paths hit the boundary exactly at the ηth slot, using the following notation. Let m = gA/cF and s = (c0 − τ)/cF, and assume that m and s are integers. Figure 2 interprets the random process as movement on a grid. Consider a random walk starting from the bottom-left point: if n̂F increases, move one block right; if n̂A increases, move m blocks up. Since each block is a square with side length cF, the width of the grid is n̂F cF, the height is n̂A gA, and the diagonal line represents Ci = τ. Each walk consists of η moves and must end on or beyond the upper-rightmost corner. We are interested in the number of monotonic paths that fall wholly under the diagonal line, because each of those paths is a realization of the random process that hits the boundary for the first time at the ηth slot.
Since counting the number of realizations under the diagonal line directly is difficult, we instead count the realizations that do cross the line. Let M be the number of realizations crossing the line; the number of realizations under the line is then $C_{\eta}^{\hat{n}_F} - M$, where $C_{\eta}^{\hat{n}_F}$ is the total number of possible realizations on the grid. Consider a sample realization crossing the line, as shown in Figure 2, and let d be the number of horizontal steps taken in the path before hitting the diagonal line. To hit the line, at least (s + d)/m vertical steps must be taken, covering a total height of (d + s)cF. The total number of such paths is $C^{d}_{\frac{s+d}{m}+d}$. After hitting the line, the rest of the path consists of the remaining n̂F − d horizontal steps, and the total number of moves left is η − (s + d)/m − d. So, the total number of paths that cross the diagonal line is
$$M = \sum_{d=0}^{\hat{n}_F-1} C^{d}_{\frac{s+d}{m}+d}\; C^{\hat{n}_F-d}_{\eta-\frac{s+d}{m}-d}.$$
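For small instances, the count of first-hitting realizations used in this proof can be checked by brute force. The following sketch (our own, with hypothetical names) enumerates all orderings of the observed events and counts those whose coexistence index first drops below τ on the last move:

```python
from itertools import combinations

def first_hit_count(n_hat_F, n_hat_A, c0, tau, cF, gA):
    """Count realizations (orderings of n_hat_F observed-Forward and
    n_hat_A observed-Attack events) for which the coexistence index
    Ci = c0 + f*cF - a*gA first drops below tau on the final move."""
    eta = n_hat_F + n_hat_A
    count = 0
    # choose which of the eta slots are Forward (rightward) moves
    for fwd_slots in combinations(range(eta), n_hat_F):
        fwd = set(fwd_slots)
        Ci, hit_at = c0, None
        for t in range(eta):
            Ci += cF if t in fwd else -gA
            if Ci < tau:
                hit_at = t
                break
        if hit_at == eta - 1:   # boundary reached exactly on the last move
            count += 1
    return count
```

Enumerating all $C_{\eta}^{\hat{n}_F}$ orderings is exponential, so this is only a verification aid, not a substitute for the closed form.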
To sum up, out of the 2^η realizations, $C_{\eta}^{\hat{n}_F} - \sum_{d=0}^{\hat{n}_F-1} C^{d}_{\frac{s+d}{m}+d}\, C^{\hat{n}_F-d}_{\eta-\frac{s+d}{m}-d}$ realizations hit the diagonal line for the first time at the ηth move. The probability of the game length being η is then this count divided by 2^η. Finally, we can express the expected length of the post-detection game as
$$E[\text{length}] = \sum_{\eta>0} \eta\,\frac{C_{\eta}^{\hat{n}_F} - \sum_{d=0}^{\hat{n}_F-1} C^{d}_{\frac{s+d}{m}+d}\; C^{\hat{n}_F-d}_{\eta-\frac{s+d}{m}-d}}{2^{\eta}}. \qquad (40)$$
Fig. 2. Realizations of the random walk.
IV. SIMULATIONS
In this section, we study the properties of the perfect Bayesian Nash equilibrium in the malicious node detection game and of the subgame perfect Nash equilibrium in the post-detection game through simulations. In our simulator, two players play the games repeatedly; the payoffs and strategy profiles for each of the subgames are recorded to analyze the properties of the equilibria.
A. Malicious node detection game
We first present the simulation results for the malicious node detection game. Figure 3 shows how the monitoring probability in the PBE strategy increases with the malicious node's attack success rate: the equilibrium requires node j to increase its monitoring frequency as the attack success rate increases. Also, as the channel becomes more unreliable, node j must play Monitor more frequently.
Fig. 3. Equilibrium strategy q vs. the attack success rate in the malicious node detection game.
Figure 4 compares the convergence of node j's belief system under different attack gains. The plots are obtained with pe = 0.01, γ = 0.95 and α = 0.01. Figure 4(a) shows how the belief system forms a correct belief about the type of node i when only Attack is observed; the convergence of the belief system under the PBE is illustrated in Figure 4(b). The plots suggest that the lower the attack gain, the quicker the belief system converges. This property can be explained as follows. A smaller attack gain requires node i to attack more often in order to earn more payoff, and increasing the attack frequency also increases the risk of being successfully observed. With more observations, the belief is updated more frequently and more accurately. The belief system converges more slowly in Figure 4(b) than in Figure 4(a) because in the PBE, instead of constantly monitoring, node j monitors only with probability q.
A more complete study of the convergence of the belief system is shown in Figure 5. The plots in Figure 5(a) indicate that the larger the disguise cost cF/cA, the less time it takes to converge. This is because, with a larger disguise cost, it is unprofitable for node i to disguise itself by forwarding packets; instead, it will launch more attacks, increasing its chances of being identified. Figure 5(b) shows a quicker-converging belief
system for a smaller detection gain. Figures 5(c) and 5(d) relate the convergence to the errors and uncertainties in the system. As expected, with fewer errors and uncertainties (i.e., low channel loss, high attack success rate and low false alarm rate), the belief system converges quickly.
Fig. 4. Belief system update with different attack gains: (a) constant monitoring; (b) PBE.
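The belief updates behind Figures 4 and 5 follow Bayes' rule. A minimal sketch of one update step (the observation likelihoods below are our illustrative assumptions, not the paper's exact expressions):

```python
def update_belief(mu, observation, p, pe, alpha):
    """One Bayesian update of node j's belief mu that node i is
    malicious (theta_i = 1). Assumed observation model (illustrative):
    a malicious node's attack is observed with probability p*(1-pe);
    a regular node appears to attack only via a false alarm (alpha)."""
    if observation == "Attack":
        like_mal = p * (1 - pe)   # malicious: attacks w.p. p, seen w.p. 1-pe
        like_reg = alpha          # regular: observed Attack is a false alarm
    else:  # "Forward" observed
        like_mal = 1 - p * (1 - pe)
        like_reg = 1 - alpha
    num = mu * like_mal
    return num / (num + (1 - mu) * like_reg)
```

Iterating this map over a stream of observations reproduces the qualitative behavior in the plots: observed attacks push the belief up quickly, while forwarded packets erode it slowly.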
Finally, the parameters affecting the PBE attack probability p are investigated in Figure 6. The attack gain is a very important factor in determining the value of p, as shown in Figure 6(a): a larger attack gain means more payoff from each attack, which implies that fewer attacks are needed; hence p should be smaller. Figures 6(b) and 6(c) indicate that node i should attack less frequently over a reliable channel, as every attack is more likely to be successful. However, as suggested in Figure 6(d), if the false alarm rate of the regular node is high, the malicious node can take advantage of it and attack more often.
Fig. 7. Effects of parameters on the SPNE strategy p∗: (a) detection gain gA/cM; (b) channel unreliability; (c) coexistence index.
B. Post-detection game
After the belief system of node j converges (the belief that θi = 1 reaches at least 0.99), we can safely conclude that node j has detected the malicious node, and the post-detection game starts. For continuity, at the beginning of the post-detection game, node i sticks to its PBE strategy.
Figure 7 presents how the strategy profile p∗ evolves from the PBE to the SPNE strategy. The plots make clear that in the SPNE, node i should decrease its attack probability to avoid isolation. Figure 7(a) shows that a larger detection gain corresponds to a smaller attack rate; thus, in equilibrium, node j's payoff will not increase due to a large detection gain. Figure 7(b) shows that if the channel is lossy, node i should attack more often; the reason is that the more unreliable the channel, the less probable it is that node j accurately observes an attack. The plots in Figure 7(c) are obtained with a detection gain of 5; this figure shows that the equilibrium is not sensitive to the initial value and threshold of the coexistence index Ci.
The expected length of the post-detection game is shown in Figure 8. First, the figure shows that the fewer the errors in the system (i.e., less channel loss and more successful attacks), the longer the post-detection game can be played. Second, the length of the game grows with the attack gain. This interesting phenomenon can be explained as follows: a larger attack gain enables the malicious node to attack less while keeping its payoff high, so the malicious node will more often play as a regular node to avoid isolation. This increases the time for the regular and
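The decision rule that ends the post-detection game can be sketched as one line of bookkeeping per stage (our own illustration; names are hypothetical):

```python
def coexistence_step(Ci, observation, cF, gA, tau):
    """Update the coexistence index after one observed stage and decide
    whether to keep coexisting with the detected malicious node.
    Returns the new index and a keep/isolate flag."""
    if observation == "Forward":
        Ci += cF   # involuntary help: credit the forwarding benefit
    elif observation == "Attack":
        Ci -= gA   # debit the damage from an observed attack
    return Ci, Ci >= tau   # keep node i only while Ci stays above tau
```

Lost observations (neither Forward nor Attack seen) leave the index unchanged, matching the 1 − pe terms in the random-walk model.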
Fig. 8. Expected length of the post-detection game.
malicious nodes to coexist. This property can be used to extend the lifetime of the network.
Last but not least, we show how the network throughput can benefit from coexistence. Observations similar to those for the game length can be made here. With a larger attack gain, the malicious node decreases its attack rate and does more packet forwarding as a regular node; therefore, the malicious node can be utilized to increase the throughput more as the attack gain grows. The throughput gain clearly illustrates that malicious and regular nodes can coexist, and that the coexistence equilibria improve the throughput of the network.
V. CONCLUSIONS
In this paper, we apply game theory to study the coexistence of malicious and regular nodes in a wireless network with unreliable channels. We formulate a malicious node detection game and a post-detection game played by the regular and malicious nodes. While both games are of the imperfect information type, we show that the former has a mixed-strategy perfect Bayesian Nash equilibrium and provide a solution to achieve that equilibrium. For the latter game, a coexistence index is proposed. We also prove that while the coexistence index is kept above a threshold, the post-detection game has a subgame perfect Nash equilibrium
which is also the coexistence equilibrium for malicious and regular nodes. Simulations are provided to illustrate the properties of the equilibria. In particular, we show how system parameters such as attack gain, attack success rate, detection gain and channel loss affect the convergence of the games and the equilibrium strategies. The simulation results also show that the coexistence equilibrium helps to extend the length of the games and improves the throughput of the network.
Fig. 5. Effects of parameters on belief system update: (a) disguise cost cF/cA; (b) detection gain gA/cM; (c) channel unreliability and attack success rate; (d) false alarm rate.
Fig. 6. Effects of parameters on the PBE strategy p: (a) attack gain gA/cA; (b) channel unreliability; (c) attack success rate; (d) false alarm rate.
Fig. 9. Throughput gain.