A Tag Encoding Scheme against Pollution Attack to Linear Network

A Tag Encoding Scheme against Pollution Attack to Linear Network
Coding
Abstract
Network coding allows intermediate nodes to encode data packets to improve network
throughput and robustness. However, it increases the propagation speed of polluted data packets
if a malicious node injects fake data packets into the network, which degrades the bandwidth
efficiency greatly and leads to incorrect decoding at sinks. In this paper, insights on new
mathematical relations in linear network coding are presented and a key pre distribution-based
tag encoding scheme KEPTE is proposed, which enables all intermediate nodes and sinks to
detect the correctness of the received data packets. Furthermore, the security of KEPTE with
regard to pollution attack and tag pollution attack is quantitatively analyzed. The performance of
KEPTE is competitive in terms of: 1) low computational complexity; 2) the ability that all
intermediate nodes and sinks detect pollution attack; 3) the ability that all intermediate nodes and
sinks detect tag pollution attack; and 4) high fault-tolerance ability. To the best of our
knowledge, the existing key pre distribution-based schemes aiming at pollution detection can
only achieve at most three points as described above. Finally, discussions on the application of
KEPTE to practical network coding are also presented.
Introduction
Network coding, where an intermediate node encodes incoming packets before
forwarding, has been theoretically proven to maximize network throughput and enhance network
robustness. It has received extensive attentions and been applied to various computer network
systems, such as multicast networks, wireless networks, and P2P systems. However, when there
is a malicious node in a network and the malicious node injects fake data packets into its
downstream nodes, the fake data packets will be encoded together with correct data packets by
the downstream nodes and the outputs of the downstream nodes will be polluted and fake. The
pollution propagates in the network quickly with the transmission of polluted data packets, which
not only leads to incorrect decoding at sinks, but also wastes network resources. So it is crucial
to prevent pollution attacks in practical applications of network coding. The existing approaches
against pollution attack to network coding can be categorized into three classes, error correction,
malicious node localization, and pollution detection. Error correction-based approaches, only
allow a small portion of packets to be polluted, and focus on correcting error only at sinks. The
pollution will inevitably propagate in networks with t he error correction approaches.
Some schemes are proposed to locate malicious nodes and make those nodes unable to
further inject polluted data packets. Those schemes either assume that there exists a powerful
controller that knows the entire topology of a network or assume a clock synchronization of all
nodes in a network. Those schemes are of limited practicality when multiple malicious nodes
exist. The existing schemes of pollution detection, which are mainly based on key delay
distribution, public key cryptography (PKC), and key pre distribution, focus on detecting and
filtering fake or polluted data packets at intermediate nodes or sinks directly and can prevent
pollution propagation efficiently. The schemes based on key delay distribution require a clock
synchronization of all the nodes in a network. So it is difficult to implement them in an
adversarial distributed environment. The implementation of the schemes based on PKC or key
pre distribution are relatively simple. However, the PKC-based schemes in require a large field,
for example, the field is F397, which implies that the computational complexities of PKC-based
schemes are very high. The schemes are based on key pre distribution. The computational
complexities of the schemes are low, but they experience tag pollution attack that leads to
numerous correct data packets being discarded. With the scheme the correctness of a data packet
will be verified after several hops (usually at least 3 hops). The MacSig scheme requires a field
of size 128 bits with very high computational complexity. In addition, even if the schemes in
tolerate less nodes to be compromised, very large redundancy is needed to append to each data
packet. For example, if the most recent MacSig scheme allows 15 nodes to be compromised with
a probability 99.5 percent, the size of redundancy appended to an IP data packet of 1,400 bytes
will be 5,144 bytes. This paper will present some insights on mathematical relations in linear
network coding and propose a key pre distribution-based tag encoding (KEPTE) scheme to
detect polluted data packets. The basic idea of KEPTE is as follows: A source s uses N keys to
generate N tags for each data packet. Each node g except the source holds a unique pair of keys
(Zg,Vg), where Zg,Vg, and the N keys held by s satisfy a certain relation.
Literature Review
Cooperative Defense Against Pollution Attacks in Network Coding Using SpaceMac
A novel homomorphic MAC scheme called SpaceMac, which allows an intermediate
node to verify if its received packets belong to a specific subspace, even if the subspace is
expanding over time. Then, we use SpaceMac as a building block to design a cooperative
scheme that provides complete defense against pollution attacks: (i) it can detect polluted packets
early at intermediate nodes and (ii) it can identify the exact location of all, even colluding,
attackers, thus making it possible to eliminate them. Our scheme is cooperative: parents and
children of any node cooperate to detect any corrupted packets sent by the node, and nodes in the
network cooperate with a central controller to identify the exact location of all attackers. We
implement SpaceMac in both C/C++ and Java as a library, and we make the library available
online. Our evaluation on both a PC and an Android device shows that (i) SpaceMac’s
algorithms can be computed quickly (28s in C/C++) and efficiently (64 KB in C/C++), and (ii)
our cooperative defense scheme has both low computation (190s in C/C++) and low
communication (2%) overhead, significantly less than other comparable state-of-the-art schemes.
The construction of SpaceMac is new and significantly more efficient than our old
construction. This construction is an improvement of the homomorphic MAC construction,
HomMac. The key difference between SpaceMac and HomMac is the security property that
SpaceMac brings: SpaceMac allows for signing spaces that expand over time, while HomMac
only allows for signing fixed spaces. This is directly reflected in the difference between the
security games of SpaceMac and HomMac, and consequently the difference between the
constructions. SpaceMac improves and generalizes of HomMac. Therefore, SpaceMac can also
be used to sign fixed spaces as well, e.g., in the place of HomMac in the detection scheme.
However, it is not possible to use HomMac to support either our detection scheme or our
locating scheme because any intermediate node does not know about the entire source space (to
use Sign of HomMac) at the time of tag generation.
A (q,n,m) homomorphic MAC scheme is defined by three probabilistic, polynomial-time
algorithms: Mac, Combine, and Verify. The Mac algorithm generates a tag for a given vector;
the Combine algorithm computes a tag for a linear combination of some given vectors; and the
Verify algorithm verifies whether a tag is a valid tag of a given vector.
Identifying Malicious Nodes in Network-Coding-Based Peer-to-Peer Streaming Networks
A novel scheme (MIS) to limit network-coding pollution attacks by identifying malicious
nodes. MIS can fully satisfy the requirements of P2P live streaming systems. It has high
computational efficiency, small space overhead, and the capability of handling a large number of
corrupted blocks and malicious nodes, and does not require repeatedly pre-distributing
verification information. MIS is block-based and can repaidly identify malicious nodes, ensuring
that the system quickly recovers from pollution attacks.
Researchers show that network coding can greatly improve the quality of service in P2P
live streaming systems (e.g., IPTV). However, network coding is vulnerable to pollution attacks
where malicious nodes inject into the network bogus data blocks that will be combined with
other legitimate blocks at downstream nodes, leading to incapability of decoding the original
blocks and degradation of network performance. In this paper, we propose a novel approach to
limiting pollution attacks by identifying malicious nodes. In our scheme, the malicious nodes can
be rapidly identified and isolated, so that the system can quickly recover from pollution attacks.
Our scheme can fully satisfy the requirements of live streaming systems, and achieves much
higher efficiency than previous schemes. Each node in our scheme only needs to perform several
hash computations for an incoming block, incurring very small computational latency in the
range of several microseconds. The space overhead added to each block is only 20 bytes. The
verification information given to each node is independent of the streaming content and thus
does not need to be redistributed. The simulation results based on real PPLive channel overlays
show that the process of identifying malicious nodes only takes a few seconds even in the
presence of a large number of malicious nodes.
The correctness of MIS is based on the condition that no node can lie when reporting a
suspicious node that has sent a corrupted block, and an evidence associated with the corrupted
block is necessary to demonstrate to the servers that the reported node has really sent the block.
We design a non-repudiation transmission protocol to achieve this. We let X be the suspicious
node, and Y be the reporting node.
In network-coding pollution attacks, malicious nodes send bogus blocks to their
downstream neighbors. An innocent node that receives a bogus block is infected, and the blocks
it produces with this bogus block are also corrupted and may further infect its downstream peers.
The contaminated links and infected nodes form a connected component (or several components
if multiple malicious nodes exist) in the network. Our goal is to track the origin of corrupted
blocks. To achieve this, we first propose an approach to detecting the existence of malicious
nodes.
If the decoding result matches the specific formats of video streams, and any node having
an inconsistent decoding result will send an alert to the servers to trigger the process of
identifying malicious nodes. From the standpoint of the whole overlay, the existence of
malicious nodes can be detected with very high probability. After receiving an alert, the servers
compute a checksum based on the original blocks, and disseminate it to the nodes using the
streaming overlay. The checksum will help the nodes identify which neighbor has sent it a
corrupted block. The checksum, node I (or K) finds out that the block received from node F (or
J) is corrupted. Note that, at this point, one cannot confirm that the discovered nodes ( F, J) are
malicious, since they may be innocent and receive corrupted blocks from their upstream
neighbors. For instance, J is an innocent node and the corrupted block it produces is due to the
bogus block it has received from F; whereas, node F discovered by I is indeed a malicious node.
To this end, the discovered nodes (F, J) are temporarily treated assuspicious nodes, and are
reported (by I, K) to the servers. Then, the reporting nodes (I,K) further forward the checksum to
their suspected upstream neighbors (I,K).
If a suspected node is truely innocent (e.g., J), then with the received checksum it will
identify at least one corrupted block it has received from its upstream neighbors (i.e., F), and
correspondingly it will report its suspected neighbors (F) to the servers. On the contrary, if a
suspected node is malicious (e.g., F), it cannot find a suspicious neighbor that sent it a corrupted
block. Therefore, we let the servers judge a suspicious node based on whether it can report
another suspicious node. The correctness of the above process of identifying malicious nodes
relies on the condition that no one can lie when reporting a suspicious node. To be concrete, any
malicious node cannot disparage an innocent node that does not send a corrupted block, or
cannot deny having sent a corrupted block (when being suspected). For example, F cannot
disparate C, or deny having sent a bogus block to I.
Homomorphic MACs: MAC-based Integrity for Network Coding
A homomorphic MAC suitable for networks using network coding. The homomorphic
MAC can be converted to a broadcast homomorphic MAC using cover free set systems. The
resulting broadcast MAC is collusion resistant up to a pre-determined collusion bound c. The tag
size grows quadratically with c. Network coding has been shown to improve the capacity and
robustness in networks. However, since intermediate nodes modify packets en-route, integrity of
data cannot be checked using traditional MACs and checksums. In addition, network coded
systems are vulnerable to pollution attacks where a single malicious node can flood the network
with bad packets and prevent the receiver from decoding the packets correctly. Signature
schemes have been proposed to thwart such attacks, but they tend to be too slow for online perpacket integrity. They also force network coding coefficients to be picked from a large held
which causes the size of the network coding header to be large. Here we propose a homomorphic
MAC which allows checking the integrity of network coded data. Our homomorphic MAC is
designed as a drop-in replacement for traditional MACs (such as HMAC) in systems using
network coding.
The tag on a vector v is a single element in F q, there is a homomorphic MAC adversary
that can break the MAC (i.e. win the MAC security game) with probability 1=q. When q= 28 the
MAC can be broken with probability 1=256. Security can be improved by computing multiple
MACs per vector. For example, with 8 tags per vector security becomes 1=q8. The resulting tag
is 8 bytes long. The proof of Theorem 2 easily extends to prove these bounds for HomMac using
multiple tags. We note, however, that a homomorphic MAC with security 1=256 may be
sufficient for the network coding application. The reason is that the homomorphic MAC is only
used by recipients to drop malformed received vectors. The sender can, in addition, compute a
regular MAC (such as HMAC) on the transmitted message prior to encoding it using network
coding. Recipients, after decoding a matrix of vectors with valid homomorphic MACs, will
further validate the HMAC on the decoded message and drop the message if its HMAC is
invalid. Hence, success in defeating the homomorphic MAC does not mean that a rogue message
is accepted by recipients. It only means that recipients may need to do a little more work to
properly decode the message (by trying various m-subsets of the received vectors with a valid
homomorphic MAC). As mentioned above, this issue can be avoided by increasing the security
of the homomorphic MAC by computing multiple tags per vector.
An Efficient Signature-based Scheme for Securing Network Coding against Pollution
Attacks
An efficient signature-based scheme against pollution attacks for securing linear network
coding. Our scheme utilizes a novel homomorphic signature function and allows a source to
delegate its signing authority to forwarders, which means that the forwarders can generate the
signatures for their output messages without contacting the source. This property allows the
forwarders to verify received messages, but prevent them from creating the valid signatures for
polluted or forged ones. Thus, the pollution attacks can be efficiently and effectively filtered out
at the forwarders. Our scheme does not need any extra secure channels, and can support source
authentication and batch verification. Experimental results show that our scheme can improve
verification efficiency up to 10 times compared to some existing one. In addition, we presented
an alternate lightweight scheme which utilizes a simpler signature function. This scheme is much
faster and more suitable for resource-constrained networks such as wireless sensor networks.
However, it introduces a trade-off between computation efficiency and security.
Network coding provides the possibility to maximize network throughput and receives
various applications in traditional computer networks, wireless sensor networks and peer-to-peer
systems. However, the applications built on top of network coding are vulnerable to pollution
attacks, in which the compromised forwarders can inject polluted or forged messages into
networks. Existing schemes addressing pollution attacks either require an extra secure channel or
incur high computation overhead. In this paper, we propose an efficient signature-based scheme
to detect and filter pollution attacks for the applications adopting linear network coding
techniques. Our scheme exploits a novel homomorphic signature function to enable the source
to delegate its signing authority to forwarders, that is, the forwarders can generate the signatures
for their output messages without contacting the source. This nice property allows the forwarders
to verify the received messages, but prohibit them from creating the valid signatures for polluted
or forged ones. Our scheme does not need any extra secure channels, and can provide source
authentication and batch verification. Experimental results show that it can improve computation
efficiency up to ten times compared to some existing one. In addition, we present an alternate
lightweight scheme based on a much simpler linear signature function. This alternate scheme
provides a tradeoff between computation efficiency and security.
A Homomorphic Signature Scheme, the source generates the signature for each message
using RSA private key and appends the signature to the corresponding message. Other nodes
(including both forwarders and sinks) verify the received messages using the source’s public key.
Our scheme is based on a homomorphic signature function, which guarantees that any encoded
message’s signature can be composed from those of input messages (that are used to generate the
encoded message). Thus, a forwarder can generate the valid signatures for its output messages
even without knowing the source’s private key, as long as every node appends the signatures to
its output messages. The nice property of our scheme is that the forwarders have no ability to
create the valid signature for any polluted or forged message.
The problem of securing network coding systems against the pollution attacks. Our goal
is to design an efficient scheme for filtering pollution attacks in network coding systems,
especially in those resource-constrained networks such as wireless sensor networks that adopt
network coding. The scheme should enable the forwarders to detect and filter polluted messages
as early as possible and must be highly efficient for the use in resource-constrained networks in
terms of computation. In addition, it does not require any extra secure channels.
The Framework of Hashing or Signature Schemes, e first propose a framework of the
hashing or signature schemes include our scheme and existing ones that address the pollution
attacks for linear network coding systems. In this framework, we roughly divide the schemes into
three phases. Parameter setup phase: In this phase, the source chooses security parameters,
calculates its private and public keys, and determines the hash or signature function. Hash or
signature calculation phase: In this phase, the source calculates the hashes or signatures for its
messages. These hashes or signatures may be securely transmitted to forwarders and sinks, or
directly attached to the messages Message verification phase: In this phase, the forwarders and
the sinks verify the received messages. Verification is based on the encoding vectors embedded
into the messages, the hashes or signatures of the messages, and the source’s public keys. If
verification succeeds, the received messages will be accepted and used for further encoding or
decoding. Otherwise, they are discarded. The first phase can be done offline, but the other two
must be executed online. So, the efficiency of schemes is mainly determined by the computation
overhead occurring in the second and third phases.
An Authentication Code against Pollution Attacks in Network Coding
An unconditionally secure authentication scheme that provides multicast linear network
coding with message integrity protection and source authentication. The resulting scheme offers
robustness against pollution attacks from outsiders and from k−1 insiders. Our solution allows
the source to generate authentication tags for up to M messages with the same key and the
intermediate nodes to verify the authentication tags of the packets received an d thus to detect
and discard the malicious packets that fail the verification. The performance analysis showed that
our scheme offers good put gains that tend towards 1 with increasing corrupted packets in the
network. Our scheme can be used to authenticate with the same key a file download of at most
18 N2 MBytes when one tagged message is carried over N IP packets.
Systems exploiting network coding to increase their throughput suffer greatly from
pollution attacks which consist of injecting malicious packets in the network. The pollution
attacks are amplified by the network coding process, resulting in a greater damage than under
traditional routing. It we address this issue by designing an unconditionally secure authentication
code suitable for multicast network coding. The proposed scheme is robust against pollution
attacks from outsiders, as well as coalitions of malicious insiders. Intermediate nodes can verify
the integrity and origin of the packets received without having to decode, and thus detect and
discard the malicious messages in-transit that fail the verification. This way, the pollution is
canceled out before reaching the destinations. We analyze the performance of the scheme in
terms of both multicast throughput and good put, and show the good put gains. We also discuss
applications to file distribution network coding authentication codes, let us start by recalling the
setting for classical authentication schemes model for unconditionally secure authentication
where one transmitter communicates to multiple receivers who cannot all be trusted. In this
scenario, the transmitter first appends a tag to a common message which is then broadcasted to
all the receivers, who can separately verify the authenticity of the tagged message using their
own private secret key. There is among the receivers a group of malicious receivers, who use
their secret key and all the previous messages to construct fake messages. A (k, V) multi-receiver
authentication system refers to a scheme where V receivers are present, among which at most
k−1 can cheat. The malicious nodes can perform either an impersonation attack, if they try to
construct a valid tagged message without having seen any transmitted message before, or a
substitution attack, if they first listen to at least one tagged message before trying to fake a tag in
such a way that the receiver will accept the tagged message. Perfect protection is obtained if the
best chance of success in the attack is 1/|T| where |T| is the size of tag space, namely, the attacker
cannot do better than make a guess, and pick randomly one tag. The scheme has been
generalized to the case where the same key can be used to authenticate up to M messages. The
network coding scenario that we consider a multicast setting, where one source wants to send a
set of messages to T destinations. In order to propose a definition of network coding
authentication scheme, let us first understand the main differences with respect to the classical
multi-receiver scenario: 1. The source does not broadcast the same message on all its outgoing
edges, but sends different linear combinations of the n messages x1, . . . ,xn, which means that
the key used by the source to sign the messages will be used more than once, actually at least as
many times as there are outgoing edges from the source. 2. We are interested in a more general
network scenario, where intermediate nodes play a role. In particular, it is relevant in the context
of pollution attacks that not only destination nodes but also intermediate nodes may check the
authenticity of the packets. We call such nodes in the network verifying nodes. This set may
include part or all of the destination nodes D1, . . . , DT. This makes a big difference in network
coding, since while the destination nodes do have a transfer matrix to recover the message sent,
this is not the case of regular intermediate nodes, which must perform the authentication check
without being able a priori to decode.
A Source S, which wants to multicast n messages to T destinations D1, . . . , DT. We will
denote the set of messages by s1, . . . ,sn to refer to the actual data to be sent, while we keep the
notation x1, . . . ,xn∈FNq for the whole packets, including the authentication tag. Each message s
I is of length l,si= (si,1, . . . , si,l), so that while each symbols i,j belongs to Fq, we can see the
whole message as part of Flq≃Fql. We also assume a set of nodes R1, . . . , RV which can verify
the authentication. A priori, this set can include the destinations, but can also be larger .
Typically we will assume that V >> T, in the context of pollution attacks.
RIPPLE Authentication for Network Coding
An efficient in-network authentication scheme that is well-matched to distributed random
linear network codes. Operating in tandem, they provide a practical and low-complexity scheme
for achieving rate- optimal throughput even in the presence of a disruptive adversary in the
network. The low complexity of RIPPLE’s authentication scheme arises from using symmetric
key authentication together with time-asymmetry, i.e., keys are transmitted after their
corresponding messages. Hence we achieve security akin to that of public-key verification
schemes, without their undesirable properties (prohibitive computational complexity for
moderate key sizes). We show that RIPPLE is robust to arbitrary collusion among adversaries. It
is also resilient against the subtle tag pollution attack discussed in this work, where in an
adversary injected packet with a single corrupted tag may cause nodes in prior schemes to drop
numerous packets allowing routers to randomly mix the information content in packets before
forwarding them, network coding can maximize network throughput in a distributed manner with
low complexity. However, such mixing also renders the transmission vulnerable to pollution
attacks, where a malicious node injects corrupted packets into the information flow. In a worst
case scenario, a single corrupted packet can end up corrupting all the information reaching a
destination. It use RIPPLE, a symmetric key based in-network scheme for network coding
authentication. RIPPLE allows a node to efficiently detect corrupted packets and encode only the
authenticated ones. Despite using symmetric key based homomorphic Message Authentication
Code (MAC) algorithms, RIPPLE achieves asymmetry by delayed disclosure of the MAC keys.
Our work is the first symmetric key based solution to allow arbitrary collusion among
adversaries. It is also the first to consider tag pollution attacks, where a single corrupted MAC
tag can cause numerous packets to fail authentication farther down the stream, effectively
emulating a successful pollution attack. Under our settings, the goals of in-network
authentication of network coding are as follows: In-network source authentication. Any node in
the network can verify that the received data originates from the source and was not modified enroute. Arbitrary collusion resistant. An honest node can verify the authenticity of the received
data even in the presence of arbitrary colluding adversaries. Light-weight operation. Forwarders
need little extra power to process message authentication. With these goals in mind, we present
our RIPPLE scheme for network coding authentication in the next two sections. The main set of
ideas behind our scheme are: 1) designing homomorphic MACs to allow in-network tag
regeneration for coded messages, 2) utilizing symmetric key cryptography for light-weight innetwork message authentication, and 3) using time to create asymmetry in message
authentication; RIPPLE has two main components. The first is a homomorphic MAC for
broadcast authentication. The second is the RIPPLE transmission protocol for message
distribution and authentication. Our homomorphic MAC is information theoretically secure,
whereas the security of the RIPPLE transmission protocol relies on computational hardness A
homomorphic MAC allows to linearly combine any sequence of given (vector, tag) pairs. That
is, given a sequence of pairs {(Mi,ti)}mi=1, where M1,M2,...,Mm∈Fn+mq, anyone can create a
valid tag t for the vector y=Pmi=1αiMi for any α1,α2,...,αm∈Fq. Loosely speaking, the security
requirement is that, even under an adaptively chosen message attack, creating a valid tag t for a
vector outside the linear span of the original messages is only possible with negligible
probability. Syntax. We define a (q,n,m) homomorphic MAC via fourprobabilistic, polynomial
time (PPT) algorithms, (Generate,MAC,Verify,Combine): Generate is a PPT algorithm that
randomly samples a key K from a key space K. MAC is a PPT algorithm that takes as input a
secret key K, a vector M and outputs a tag t for M. The MAC algorithm authenticates a vector
space spanned by M1,M2,...,Mm∈Fn+mq by running MAC(K,Mi) for i=1,...,m. This produces a
tag for each of the basis vectors. To authenticate a message (whose packets form the subspace
spanned by M1,M2,...,Mm) the sender transmits the pairs (Mi,ti).
In RIPPLE Transmission Protocol, a sender S schedules packet transmission based on
two coordinates, space and time. The network space is hierarchically organized. A node is called
a level j node if it is at most j hops away from S. We define the children of S to be level-1 nodes
and assume a maximum of L levels in a network. Time is divided into uniform intervals. S sends
zero or multiple packets during an interval. We refer to packets sent during interval I as interval-I
packets. We name our network coding authentication scheme RIPPLE for packets moving in the
network from level to level, in a wavelike fashion. A batch of packets reaches the nodes at a
level, pauses for key disclosure, verification, coding, and finally flows to the next level. For ease
of understanding, we first describe the RIPPLE scheme at a high level, and then break it into four
stages and elaborate on each of them. S broadcasts a batch of packets in each time interval and a
batch moves from level to level driven by delayed key disclosures. Specifically, assume that S
transmits a batch of packets to its children during interval I. Each packet carries L tags, each for
verification by nodes at a different level. A tag is produced with a key generated specifically for
a particular interval and level. S reveals the key for interval I and level 1 after a fixed amount of
delay to ensure that all level-1 nodes have received the interval-I packets. Upon receiving the
key, a level-1 node verifies all the buffered interval-I packets by checking their level-1 tags and
encodes only the authenticated packets. Due to the homomorphic property of our MAC, the
level-1 nodes are able to produce new tags for the encoded packets for the remaining levels.
Driven by one key disclosure, this batch of packets moves to level-2 nodes. Based on a
predetermined key disclosure schedule, S distributes the keys for levels 2 through L in an
increasing order of levels. Every key disclosure drives the batch one level forward. Eventually,
this batch of packets reaches all the receivers. A. Sender Setup We now describe our RIPPLE
transmission protocol in four stages. Recall that time is divided into intervals of a uniform
length. We use the following notations: T0 denotes the starting time of a multicast transmission,
N the maximum number of intervals in a multicast session, W the maximum number of packets
sent in an interval, Ti the starting time of interval i, and δ the interval length. For the last three
variables, we have the following equation Ti = T0+iδ,∀1≤i≤N. In the setup stage, S determines
the network level L and network delay D. 1) Determining Network Level L: For each node
v∈V−{S}, its level Lv(v) on a coding graph G=(V,E) is defined as the length (i.e., hop count) of
the longest path from S to v. Network level L is thus defined as L, max v∈V−{S}Lv(v). For
example, the level of node d on the butterfly coding graph, and the network level is 5. Given a
coding graph 4, finding the length of the longest path from S to a node v∈V can be transformed
to finding the shortest path between S and v by changing the signs of the weights on the edges.
Hence, by using existing shortest path algorithms such as the Dijkstra’s algorithm, the length of
the longest path can be computed in O(|V|log|V|+|E|) time when no adversary exists. In the
presence of adversaries, the length of the longest path can be computed by using recently
proposed passive network tomography schemes for network coding based multicast. Bounding
Network Delay D: We define network delay D as the sum of these two terms: RTT(S,v), the
maximum round trip time between S and a node v∈V−{S}, and Tp, the bound on a node’s
processing time to authenticate up to W messages of an interval, encode authenticated messages,
calculate new tags for the encoded messages, and forward the newly generated packets to the
next level. More precisely D is thus an upper bound on the maximum time for messages of an
interval to move one level forward. We emphasize that RIPPLE only requires S to know the
upper bounds of L and D to operate. However, knowledge of accurate values of L and D reduces
both the communication and computational overhead. 3) One-way Key Chain: For each level of
nodes, we generate a different sequence of random values and use them in reverse order to derive
the MAC keys each for a different time interval. We use a pseudo-random function F to generate
the sequences. For each sequence, we choose a different random number r0 as the base value and
generate the sequence recursively as ri=F(ri−1) where 1≤i≤N(recall that N is the maximal
number of intervals in a broadcast session). We refer to the sequence as the one-way key chain.
To authenticate the values in a one-way key chain, S signs the last value r N with a common
public key signature scheme such as RSA or DSA. We refer to the signed r N a commitment to
the key chain. Anyone can verify the authenticity of rN with S ’s public key. Given an
authenticated rN and an index i, anyone can check if a value r is the ith value in the one-way key
chain by checking if r N=FN−i(r), where Fn(x) denotes n consecutive applications of F. By
convention, F0(x)=x.
Existing System

The existing approaches against pollution attack to network coding can be categorized
into three classes, error correction, malicious node localization, and pollution detection.

Error correction-based approaches only allow a small portion of packets to be polluted,
and focus on correcting error only at sinks. The pollution will inevitably propagate in
networks with the error correction approaches.

Some schemes are proposed to locate malicious nodes and make those nodes unable to
further inject polluted data packets.

The existing schemes of pollution detection, which are mainly based on key delay
distribution, public key cryptography (PKC), and key pre distribution, focus on detecting
and filtering fake or polluted data packets at intermediate nodes or sinks directly and can
prevent pollution propagation efficiently.
 Disadvantages: The schemes in based on key delay distribution require a clock
synchronization of all the nodes in a network. So it is difficult to implement them in an
adversarial distributed environment.
Proposed System

This paper proposed a key pre distribution-based tag encoding scheme KEPTE, which
enables all intermediate nodes and sinks to detect the correctness of the received data
packets.

Furthermore, the security of KEPTE with regard to pollution attack and tag pollution
attack is quantitatively analyzed.

Before presenting KEPTE, this paper define three basic algorithms, namely Sign,
Combine, and Verify.

The algorithm Sign computes N tags for each of the n data packets P1,P2,... ,Pn, where N
is a security parameter.

Given multiple data packets, each with N tags, Combine produces N tags for a linear
combination of those multiple data packets.

Verify checks the correctness of a data packet with its N tags.
 Advantages: The advantage of KEPTE is 1) low computational complexity; 2) the ability
that all intermediate nodes and sinks detect pollution attack; 3) the ability that all
intermediate nodes and sinks detect tag pollution attack; and 4) high fault-tolerance
ability.
Hardware Requirements
 Processor Type
: Pentium IV
 Speed
: 2.4 GHZ
 RAM
: 256 MB
 Hard disk
: 20 GB
 Keyboard
: 101/102 Standard Keys
 Mouse
: Scroll Mouse
Software Requirements
•
Operating System
: Windows 7
•
Programming Package
: Net Beans IDE 7.3.1
• Coding Language
: JDK 1.7
System Architecture:
KDC
Key Distribution
2 Secret Vector
N Secret Vectors
Source
Tag
Generation
Intermediate
node or Sink
Packet Transmission
Encoding
Verification
Attack
Detection
Modules
1. Network Formation
2. Key Distribution
3. Tag Generation & Encoding
4. Verification
5. Attack Detection
Network Formation

This network has one sink node and many sensor node. Source node send a file to sink
through many intermediate nodes.

Network coding, where an intermediate node encodes incoming packets before
forwarding, has been theoretically proven to maximize network throughput and enhance
network robustness.
Key Distribution

KDC is a common tool for key distribution. If KDC is not available, key distribution
based on PKC can replace the function of KDC. By PKC, the source distributes secret
information, which is also secret keys, to each node g except the source in a network.

KDC generate private and public key pairs for each file blocks and announce the public
key, both source and intermediate node g.

It generate secret vector from private keys. This secret vector used for tag generation.
Tag Generation & Encoding

The source s is to multicast a file of n data blocks B1,B2,...,Bn. These data blocks are
encoding by using secret vectors.

The source uses the algorithm Sign(X1,...,XN,Pi) to generate N tags tPi,1,...,tPi,N.

The algorithm Sign computes N tags for each of the n data packets P1,P2,... ,Pn, where N
is a security parameter.
Verification

An intermediate node g receives a data packet W with its N tags tW,1,...,tW,N, a node g
checks the correctness of W with algorithm Verify and its secret vectors Zg, Vg.

If its output is true, g judges W being correct; otherwise, g judges W being fake or
polluted, and discards W.
Attack Detection

Here KEPTE detect two types of attacks. One is Pollution Attack. Another one is Tag
Pollution Attack.

A Malicious node can inject fake data packets into its output links. The objective of
pollution attack is to make intermediate nodes or sinks unable to detect error data
packets, which not only leads to incorrect decoding at sinks but also makes polluted data
packets be transmitted in a network, leading to bandwidth waste. But KEPTE Detect
Pollution Attack.

A malicious node can modify the verification tags of a correct data packet, and injects the
correct data packet with modified tags to its output links, which is called tag pollution
attack. The objective of tag pollution attack is to get correct data packets be judged as
wrong and be discarded by intermediate nodes or sinks, which wastes bandwidth greatly.
But all intermediate nodes and sinks detect tag pollution attack using KEPTE.
Data Flow Diagram
Tag Generation
Attack Detection
Verification
Encoding
Key Distribution
Tag Generation
Verification
Attack Detection
References

A. Le and A. Markopoulou, “ Cooperative Defense against Pollution Attacks in Network
Coding Using SpaceMac,” IEEE J. Selected Areas in Comm. on Cooperative Networking
Challenges and Applications, vol. 30, no. 2, pp. 442-449, Feb. 2012.

Q. Wang, L. Vu, K. Nahrstedt, and H. Khurana, “Identifying Malicious Nodes in
Network-Coding-Based Peer-to-Peer Streaming Networks,” Proc. IEEE INFOCOM,
Mar. 2010.

S. Agrawal and D. Boneh, “Homomorphic MACs: MAC-Based Integrity for Network
Coding,” Proc. Int’l Conf. Applied Cryptography and Network Security, June 2009.

Z. Yu, Y. Wei, B. Ramkumar, and Y. Guan, “An Efficient Scheme for Securing XOR
Network Coding against Pollution Attacks,” Proc. IEEE INFOCOM, Apr. 2009.

F. Oggier and H. Fat hi, “ An Authentication Code against Pollution Attacks in Network
Coding,” IEEE/ACM Trans. Networking, vol. 19, no. 6, pp. 1587-1596, Mar. 2011.

Y. Li, H. Yao, M. Chen , S. Jaggi, and A. Rosen, “RIPPLE Authentication for Network
Coding,” Proc. IEEE INFOCOM, Mar. 2010.