TorrentTrust: A Trust-Based, Decentralized Object Reputation Network

Ian Sibner
Evelyn Yeung
David Xu
Quanze Chen
{isibner, eyeung, davix, cquanze}@seas.upenn.edu
Faculty Advisor: Andreas Haeberlen
April 25, 2016
Abstract
In this paper, we describe TorrentTrust, a decentralized object reputation system for peer-to-peer networks. Torrents are a popular target for spammers and hackers: they offer an easy way to coax users into downloading and installing a piece of malware, profitable for the hacker, disguised as another file. Thus, determining the authenticity of a torrent has long been an issue. Many trackers use upvote/downvote systems, or allow users to verify a torrent, but bad actors can easily verify their own malicious content. Also, these systems are totally centralized, creating a single point of attack for adversaries.
We researched a system called Credence (Walsh and Sirer, 2005), which was used to rank objects on the Gnutella network, and extended it to provide stronger security. The resulting system, TorrentTrust, verifies torrents based on trust relationships between users in a totally decentralized way, making it much more difficult for bad actors to promote malicious content.
TorrentTrust is a layer on top of the BitTorrent filesharing network where users can determine the authenticity of content through voting and establishing trust with other users. Although similar to Credence, we show through simulation and analysis that it is more resistant to certain network attacks.
1 Introduction
The rising popularity of peer-to-peer networks has caused a growth in content, including malicious content and spam. Determining the trustworthiness of content in these networks is increasingly difficult. Content could be mislabeled, fraudulent, or nonexistent, or it could contain viruses, spyware, and worse.
It is difficult to build a reliable trust system on top of a distributed network. Some notable existing solutions include centralized forums such as The Pirate Bay where people can vote and rank content. However, this defeats some of the benefits of peer-to-peer networks such as anonymity and fault-tolerance to attacks. When these centralized sites are attacked, outages severely affect network usability. Even with these centralized forums, human inspection through reading comments is still required to make decisions about the trustworthiness of objects.
We aim to solve these problems with TorrentTrust, a distributed object reputation network for P2P networks, specifically BitTorrent. Users over time will be able to determine the authenticity of objects in the network using a combination of correlating voting histories with other users and building a network of trusted users. This improves upon existing solutions in two major ways. By removing any centralized aspect of the system, the system is less prone to certain types of attacks. Also, trust rankings may be more accurate by incorporating the concept of a trusted user network.
This system will benefit users who use it actively, since they will be able to better protect themselves from potential malware and spam, and will not have to worry about an outage of a central authority.
2 Background
TorrentTrust is based on the Credence system (Walsh and Sirer, 2005), an object reputation system for the Gnutella filesharing network. In Credence, users endorse content by voting positively or negatively on one or more of its properties (for example, its file type). In order to evaluate an unknown object, a client A first gathers all votes about that object, and then calculates the weighted sum of all such votes to determine the object score. The weight of each vote is a measure of past voting correlation between A and the client B which cast the vote. Specifically, if A_+ and B_+ are the fractions of positive votes for A and B, respectively, and (AB)_{++} is the fraction where both voted positively, then
$$ r_{AB} = \frac{(AB)_{++} - A_+ B_+}{\sqrt{A_+ (1 - A_+) B_+ (1 - B_+)}} $$
is the coefficient of correlation, which is used to weight the actual vote, which is in {−1, +1}.
Credence relies on a central certificate authority to allow users into the network, since otherwise, a proliferation of malicious users would compromise the network. In building TorrentTrust, we have attempted to do away
with this requirement in order to make the system fully
decentralized. Rather than relying solely on users’ voting
history correlation, we incorporate measures of the users’
trust as well. This helps mitigate the issue of potentially
having many malicious users in the network; creating a
user is easy with no central server, but it is still difficult
to gain the trust of a legitimate user.
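As an illustration of the Credence weighting described in this section, the following is a minimal Python sketch and not the Credence implementation. It assumes that the fractions A_+, B_+ and (AB)_{++} are computed over the objects both clients have voted on, and that a zero denominator is treated as zero correlation; the function names `credence_correlation` and `credence_score` and the dictionary layout are hypothetical.

```python
from math import sqrt


def credence_correlation(votes_a, votes_b):
    """Correlation r_AB between two voting histories.

    votes_a, votes_b: dicts mapping an object id to a vote in {-1, +1}.
    Assumption: fractions are taken over objects both clients voted on.
    Returns 0.0 if there is no overlap or the denominator is zero.
    """
    shared = votes_a.keys() & votes_b.keys()
    if not shared:
        return 0.0
    n = len(shared)
    a_pos = sum(1 for o in shared if votes_a[o] > 0) / n
    b_pos = sum(1 for o in shared if votes_b[o] > 0) / n
    both_pos = sum(1 for o in shared if votes_a[o] > 0 and votes_b[o] > 0) / n
    denom = sqrt(a_pos * (1 - a_pos) * b_pos * (1 - b_pos))
    if denom == 0:
        return 0.0
    return (both_pos - a_pos * b_pos) / denom


def credence_score(my_votes, votes_on_object, histories):
    """Weighted sum of the votes cast on one object.

    votes_on_object: dict mapping a voter id to that voter's vote in {-1, +1}.
    histories: dict mapping a voter id to that voter's full voting history.
    """
    return sum(
        credence_correlation(my_votes, histories[voter]) * vote
        for voter, vote in votes_on_object.items()
    )
```

In this sketch a voter who always disagrees with the evaluating client gets a correlation near −1, so a negative vote from that voter raises the score rather than lowering it.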
3 Object Scoring and Ranking
TorrentTrust utilizes a score to rank the credibility and relevance of an object under a specified search query. The concept of using such scores to rank objects is inspired by the Credence protocol.
3.1 Scoring Algorithm
In TorrentTrust, the score S of an object is given by
$$ S(Q, h, u) = \sum_{q \in Q} \sum_{v_v \in V(h)} C(u, v)\, T(u, v)\, A(q, v_v) \qquad (1) $$
where Q represents a query made up of a set of claims about the object being queried for, h is a unique hash identifying an object in the network, and u is the unique identity (key) of a user. We define V(h) to be the set of all votes on the object h, where each element v_v describes a vote cast by user v.
We define C(u, v) to be a correlation coefficient between users u and v based on their voting histories. T(u, v) is defined as a trust coefficient between users u and v. These factors are defined in the following sections. A(q, v_v) is defined as the compatibility of the claim q with the vote v_v.
3.1.1 Correlation Coefficient
The correlation coefficient C(u, v) describes the voting correlation between two peers u and v. Its value is derived from the voting histories of the two peers and can be positive or negative.
Since voting correlation is reciprocal, a well-behaved correlation coefficient must satisfy C(u, v) = C(v, u) ∀ u, v.
A simple way to define the correlation between two peers is to represent each peer by a sparse vector that concatenates their votes. Since votes are in the set {−1, 0, 1} (representing bad content, no knowledge, and good content respectively), each voting history can be represented by an n-dimensional vector with each dimension being a vote on a piece of content. These vectors are extremely sparse in practice since most users will not have seen most objects in the network.
With this we can then use cosine similarity to calculate the correlation coefficient between users through
$$ C(u, v) = \frac{\vec{u} \cdot \vec{v}}{\lVert \vec{u} \rVert \, \lVert \vec{v} \rVert} \qquad (2) $$
One observation we may make is that if two users show no common voting history, a zero correlation coefficient would be derived. This makes the bootstrapping of new peers difficult, as a zero correlation causes votes to be discarded entirely. Similar to Credence, we adopt a default correlation coefficient for new users so they will be able to participate in the network, although perhaps more cautiously. We choose this value to be below the average correlation between users existing in the network. This gives new users an incentive to vote faithfully in order to get more accurate results.
3.1.2 Trust Coefficient
The trust coefficient T(u, v) quantifies how much user u trusts user v. All coefficients are in the interval [0, 1], so that T(u, v) acts as a scaling factor in the score calculation and can reduce the impact of untrusted users to zero.
One simple trust coefficient, which we denote T_k, is based on the presence of a path of length at most k between u and v. If such a path exists, then T_k = 1; otherwise, T_k = 0. This has the advantage of being very easy to calculate with a simple breadth-first search, and of giving users an additional parameter that they may modify for their needs; however, it fails to take into account the overall shape of the graph.
Another possible trust metric is Eigenvector Centrality. By examining the graph structure, Eigenvector Centrality measures the influence of a node in a network, much like the famed PageRank algorithm. The underlying idea is transitive trust: highly trusted nodes contribute more trust to their close neighbors. It defends against possibly malicious nodes that are not well connected to the network. This provides two key advantages over the first metric. First, it is more informative, since trust is no longer binary but a scale. Second, it provides more graph coverage: there is now some trust value associated with far more people in the network, which leads to more objects being classified. The key disadvantage is that this measure is agnostic to the calling user, which causes some information loss.
A user can select which one to use based on their level of paranoia.
3.1.3 Agreement Indicator
The agreement indicator function A(c, v) indicates whether a claim c is compatible with a vote v (e.g. a claim that the object is 'GOOD' is compatible with a vote claiming the object is 'GOOD' and is incompatible with a vote that claims the object is 'BAD'). This is a base value that decides whether the vote itself contributes positively or negatively to the query and does not depend on trust or correlation metrics. Agreement indicator functions must meet the following requirements:
• When the vote v does not contain the property that claim c is making a claim on, then the indicator is
A(c, v) = 0. We say that the vote v is unrelated to the claim c.
• When the vote v makes a claim such that v ⇒ c, then A(c, v) > 0. We say that the vote v is compatible with the claim c.
• When the vote v makes a claim such that v ⇒ ¬c, then A(c, v) ≤ 0. We say that the vote v is incompatible with the claim c.
In the TorrentTrust protocol, agreement of claims is calculated by each individual peer, and thus each peer may choose their own notion of agreement so long as it follows the requirements outlined above. Our definition of agreement is
$$ A(c, v) = \begin{cases} 1 & (v \Rightarrow c) \\ -1 & (v \Rightarrow \neg c) \\ 0 & (\text{otherwise}) \end{cases} \qquad (3) $$
This implies that unrelated claims do not contribute to the overall score and that a vote initially contributes positively if it is positive and vice versa. For example, a positive agreement paired with positive trust and correlation values contributes positively to the score.
In our implementation we only have one property, namely a claim on whether the file is legitimate. However, it is simple to extend this to support multiple claims.
3.1.4 Ranking of Scored Results
The score produced by the TorrentTrust scoring algorithm can be used to evaluate both the relative and absolute reputation of items.
In the simplest case, TorrentTrust is used to evaluate a list of results. We can use our proposed scoring algorithm to rank the items in the list so that high reputation objects are displayed earlier. This gives users a good evaluation of the relative reputation of a group of results (e.g. produced by searching on a torrent index).
The scores themselves may also be used to evaluate absolute reputation, where the user may choose to hide items below a certain cutoff. For the binary case where we only make a single claim of trustworthiness or not, we can use a cutoff of 0, where content below this value is deemed to be malicious.
4 Implementation
At a high level, TorrentTrust operates on top of a distributed hash table (DHT). Votes for objects and pseudonymous identities are stored in the DHT. The underlying DHT implementation is TomP2P (Bocek, 2009), which allows multiple values to be associated with a single key. There are two abstraction layers developed in this project: 1) a data access object (DAO) interface used by the client for “high-level” operations, such as retrieving users and vote histories; and 2) a distributed API layer, used by the DAO layer, which provides persistence through a familiar associative map-like interface.
4.1 Client methods
Users of TorrentTrust are identified throughout the network by the hash of their public key. Users have the ability to cast votes, which are a list of assertions about a particular object, identified by its content hash. An assertion is a claim about a particular property of an object. A user’s public key hash is associated with a collection of votes and a collection of public key hashes identifying other users which are trusted by that user. An object hash is associated with the collection of users that have voted on the object. A tuple of an object hash and public key hash is associated with the voting history of that user.
All mutating operations on the network are cryptographically signed to assert the identity of the user that originated the request. DHT messages are verified on each node to ensure validity.
4.2 Distributed methods
The interfaces for the distributed API layer encapsulate operations on the DHT. Usage is similar to a standard associative map interface, except everything is written to be completely non-blocking. This layer also resolves collisions in the key space.
Many computations in TorrentTrust require knowledge of a large part of the peer network graph, so the performance of querying the network becomes a significant consideration, especially when searching for trusted users. The distributed API layer is written to be agnostic to caching and other cross-cutting concerns, and so presents a uniform interface to access keys and multi-values. In our testing, we found that it was quite straightforward to replace the DHT implementation with a centralized database implementation.
4.3 Local client
We chose to present the TorrentTrust APIs (those defined in subsection 4.1) through a locally running Jetty web server. This server serves both API endpoints that allow interaction with the TorrentTrust client methods and
also provides a simple user interface for users to perform operations using the TorrentTrust system.
The local client we implemented supports various features external to the basic protocol of TorrentTrust, such as the ability to keep track of multiple identities and management of the cryptographic keys. This makes it very simple and secure to develop alternate software that takes advantage of the TorrentTrust system (such as future integrations with BitTorrent clients).
5 Security
5.1 Security Properties
Because TorrentTrust lacks a central authority, some attacks to which Credence was vulnerable are not effective on TorrentTrust. In particular, there is no risk of a hacker gaining access to the central authority, or otherwise coercing the central authority into granting accounts to fake users.
However, we still need to worry about other possible attacks. We can break these down into three major types: attacks on the correlation metric, attacks on the trust network, and attacks on the underlying DHT itself. We examine each of these separately.
5.1.1 Resistance to Voting Spam
Since TorrentTrust borrows the correlation coefficient approach of Credence, it enjoys the same resistance to spammers, that is, users with random voting histories. In expectation, spammers have the same number of similar and dissimilar votes, so the correlation coefficient tends to be close to zero. Also similarly to Credence, we find that trolls (users that always vote unfaithfully) actually tend to help honest users make better decisions, because the correlation metric is close to −1 (meaning their votes are counted with the opposite sign). We tested these assumptions in our simulations and achieved results similar to Credence.
5.1.2 Resistance to Malicious Users
The most effective way to exploit the correlation coefficient appears to be as one part of an attack on the trust network. If an attacker wants to promote a new piece of malicious content, then by voting faithfully on other pieces of content they can push their correlation with honest users very close to 1. This is a serious vulnerability of the Credence system, as our simulations show. It is quite simple for an attacker to create multiple key pairs and then use each one to vote up a piece of content which is actually malicious (a virus). Since there will initially be no votes on this malicious object, these fake users are essentially Trojan horses, posing as legitimate users by voting up legitimate content in addition to the virus. While not fully immune, TorrentTrust is resilient to this type of attack because the trust rating for each user is independent of the voting history. It is easy to create many fake accounts, but it is significantly more difficult for a fake account to convince a legitimate user to add their key to their trusted key set. If a fake account is outside of the user’s network, their trust coefficient will be zero, and it will not matter how correlated their votes are: they will not contribute to the trust score of a queried object.
In reality, it may be possible for an attacker to convince a legitimate user to “trust” a Trojan horse account. However, there are several factors which can mitigate the damage from such a breach. First, trust metrics that limit the breadth of the trusted network (such as T_k, discussed in subsubsection 3.1.2) can limit the number of users that are affected by the breach to only the immediate network of the fooled user.
There are other suitable choices of trust metric that may defend against this attack. While we study the BFS approach in depth, we also considered various measures of centrality. Even if a good user is fooled into trusting a malicious user, a malicious user shouldn’t be very central in the network. There will not be very much information flow through them, particularly if they are only connected to their own malicious clique.
5.1.3 Attacks on the underlying DHT
Rather than trying to subvert the system itself, malicious nodes could potentially try to subvert the DHT. We mitigate this by signing all information in the DHT with the corresponding owner’s key.
In our implementation, votes, user profiles, and voting histories are signed with the 2048 bit RSA private key of the voter, who keeps this key private. Thus, it is computationally infeasible for any node in a DHT to falsely report votes on an object. Our client validates incoming data synchronizations to the DHT, which means that it is not possible to promote an invalidly signed object and have it be accepted by legitimate copies of our implementation. We augment this by setting redundancy in the DHT so that attackers would not be able to control a certain section of the keyspace. This guarantees a certain level of availability and prevents attacks where an attacker gains control over a certain range of hashes and is not able to provide valid signatures (clients would not be able to get meaningful responses, as invalid responses are dropped).
Malicious nodes may attempt a replay attack by reporting a validly signed old voting record for a user (for
example, the empty set, indicating that user never voted).
To combat this, we use a combination of replication (the
same information is stored at several nodes) and sequence
numbers (a user’s voting record includes a monotonically
increasing integer, which is incremented each time their
vote set is updated). A client can always trust the vote
set with the highest sequence number, which makes it
harder to carry out this type of attack.
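The replay defense described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the TorrentTrust client code: the `VoteRecord` type, its field names, and `freshest_valid_record` are hypothetical, and signature verification is passed in as a caller-supplied predicate, since the actual RSA check is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class VoteRecord:
    """One replica of a user's vote set as fetched from a DHT node (hypothetical layout)."""
    owner: str        # hash of the voter's public key
    sequence: int     # incremented each time the vote set is updated
    votes: tuple      # e.g. (("object-hash", 1), ("other-hash", -1))
    signature: bytes  # signature over the fields above


def freshest_valid_record(
    replicas: Iterable[VoteRecord],
    is_valid: Callable[[VoteRecord], bool],
) -> Optional[VoteRecord]:
    """Drop replicas whose signature fails, then keep the highest sequence number."""
    valid = [r for r in replicas if is_valid(r)]
    if not valid:
        return None
    return max(valid, key=lambda r: r.sequence)
```

Under these assumptions, a node that replays an older, validly signed record only succeeds if no replica with a higher sequence number reaches the client, which is why the text pairs sequence numbers with replication.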
The DHT is still vulnerable to Sybil attacks, as are
all DHT implementations. Judging from the success of
BitTorrent (which is also based on a DHT), these types
of attacks are relatively rare in practice, particularly as
the number of honest nodes in the DHT grows large. Also,
following the logic above, it would still be impossible for
an attacker to forge votes; the worst they could do in a
Sybil attack would be to report empty or outdated voting
histories.
6 Evaluation
6.1 Simulation Setup
Before implementing the full distributed system, a simulation bench was created to analyze our algorithms. We created a model to reflect what the network might look like.
We create a number of nodes in the network, classified either as good voters, spammers, or malicious. Good voters vote faithfully, which means their votes should correlate strongly and reflect the truthfulness of content. Spammers vote randomly, and malicious nodes vote faithfully except on a specific piece of bad content (the virus).
We have two setups used to run simulations. The first is created by generating k clusters of n good nodes where each node in a cluster is connected to at least one other node in a cluster. We then generate malicious cliques which each select a piece of content as their virus and vote accordingly. In the simulation, we can control the connectivity of the malicious cliques to good users.
The second setup uses a social network graph generator called GDBench (Angles, Prat-Perez, and Dominguez-Sal, 2013). The generator creates a graph following a power law, which has been shown to be an accurate representation of typical social networks. Once we import the graph, we follow a similar procedure to the first setup. We generate a certain number of malicious cliques and randomly connect them to the good network. Each clique has its own virus.
These setups are mainly to evaluate the trust metrics. As such, we didn’t focus too much on spammers, because the correlation coefficient should protect users from people who vote randomly. The quality of the trust metric will be the main factor in stopping malicious users who attempt to infiltrate the network. The two metrics we focused on were the BFS approach with varying depth, and eigenvector centrality.
The simulations produce some key metrics for evaluating our performance. These metrics are:
• Coverage - The percentage of all content that a node can classify at all.
• Error rate - The percentage of all content classified incorrectly.
• False positive rate - The percentage of malicious content classified as good.
• False negative rate - The percentage of good content classified as malicious.
Where appropriate, we also examine aggregate trust metrics to determine whether honest nodes are trusting malicious ones.
6.2 Baseline Simulation
In order to establish a baseline for other simulations, we first performed a simulation of the Credence system with 1,000 honest users and 1,000 malicious users. A user is considered to trust a piece of content if the trust score S > 0. The results are summarized in the table below.
Scenario        % of Virus misclassification    Content Unclassified
BFS depth 1     0.0%                            98.8%
BFS depth 2     0.0%                            92.9%
BFS depth 3     0.0%                            76.8%
BFS depth 5     0.0%                            46.4%
BFS depth 7     0.0%                            44%
BFS depth 12    0.0%                            43%
BFS depth ∞     99.3%                           17.1%
Even with no connections between the honest users and the malicious clique, 99.3% of users classified the virus as “good”. This is due to the fact that Credence is equivalent to TorrentTrust if T(u, v) = 1 ∀ u, v, i.e., every user trusts every other user completely. Note that this isn’t exactly an extension of BFS since the network is not necessarily connected, but this models trusting everyone in the network. This illustrates the fact that this attack is a serious vulnerability for Credence.
Also as a baseline, we tested T_k, the metric that trusts all users within k steps of the querying user, in this no-malicious-connections environment. As expected, the presence of the trust metric protects honest users from the malicious cliques; however, the coverage decreases since there are fewer nodes in consideration.
6.3 Trojan Horses
We now set out to test our system’s response when legitimate users were tricked into trusting malicious users (“Trojan horses”). We modified the simulation to add a
variable number of random edges between the malicious cluster and the network of legitimate users. We then collected data on our four metrics for various trust metrics: the k-length path metric for k = 1, 2, 3, 5 and Eigentrust. The results are shown in the following graph:
We can see that there is a tradeoff between coverage and error rate, at least for the k-length path metric. This makes sense, since limiting the actors you consider makes it less likely that any one of them will have seen a piece of content, but also makes it less likely that one of the trusted actors is actually a malicious node. It does not look like a linear tradeoff, which means a user might want to be informed about the shape of the curve when selecting their depth preference.
One interesting result is the relationship between the data when the malicious cliques are connected with 10 edges versus 20 edges. There does not seem to be a significant difference, despite there being double the connections. This may indicate that this metric is robust against infiltration, even as the malicious users are better able to connect themselves.
We also ran experiments with varying amounts of malicious users in the network. We still seem to be
Eigentrust did not perform as well as we had hoped. This is very likely due to the information lost by not taking the calling user into account in the metric. When examining the numbers, there was no significant score difference between content we knew to be good and viruses.
7 Ethical Considerations
As with most security-related projects, there are some ethical questions related to TorrentTrust. On the one hand, there are many positive properties related to user safety. Users who use TorrentTrust are less likely to download a virus, as our simulation shows; this is clearly a positive development. Likewise, there are positive aspects related to privacy: users never need to associate an email or other personally identifying information with their public keys, and they can have as many different identities as they please, making it difficult to deanonymize the network. TorrentTrust stores much less potentially-sensitive user data than, say, private trackers (which do require an email/password). With data security in the forefront of the news today, these are clearly benefits.
However, despite the benefits, it would be remiss not to examine potential drawbacks. One thorny issue which remains is legality: torrents are often used to download media illegally, and some of the same aspects of the system that we discussed above (privacy and decentralization) may make it more difficult for law enforcement authorities to track down copyright infringement.
Overall, TorrentTrust is a tool like any other, and should be treated as such. It seems clear that there are legitimate uses for torrents (e.g. Linux images, LaTeX distributions), and it seems reasonable that users ought to be able to access these legitimate torrents safely. Thus, we believe that the positive ethical aspects of TorrentTrust outweigh the potential drawbacks.
8 Conclusion and Future Work
In conclusion, we see that including the trust metric provides better protection against attackers targeting the trust network structure, at the cost of a lower ability to classify items. We believe this to be a reasonable tradeoff, since in many real world cases the benefits of better protection outweigh the cost in coverage.
Our implementation also suggests that a system need not be centralized in order to provide object rankings. TorrentTrust relies solely on a DHT, but through the use of cryptography, we can achieve all the required security properties of the system. While there were a few quirks we needed to work around due to our choice of TomP2P as the backing implementation, we think that the generic API it provides is representative of functionality offered by a wide variety of DHT implementations and that our system is easily adaptable to any specific implementation.
In the future, it would be interesting to experiment with a few more trust metrics. By examining more complex trust metrics, we may be able to achieve a better tradeoff between the content classification rate and Trojan horse success rates. There are further opportunities for future work with respect to ease-of-use; building the TorrentTrust client into an existing popular BitTorrent client would make it easy to get started with the system. Finally, we would like to create a mobile app to add friends on the go, since TorrentTrust is not very useful without a sizable trust neighborhood and it is much harder to get someone’s entire public key than to simply scan a code on their phone.
References
Angles, Renzo, Arnau Prat-Perez, and David Dominguez-Sal (2013). “Benchmarking database systems for social network applications”. In: First International Workshop on Graph Data Management Experiences and Systems. ACM.
Bocek, Thomas (2009). TomP2P-A Distributed Multi Map.
Cornelli, Fabrizio et al. (2002). “Choosing reputable servents in a P2P network”. In: Proceedings of the 11th international conference on World Wide Web. ACM, pp. 376–386.
Walsh, Kevin and Emin Gun Sirer (2005). Thwarting p2p pollution using object reputation. Tech. rep. Cornell University.