
On Trusting Introductions from a Reputable Source:
A Utility-Maximizing Probabilistic Approach†
Richard Al-Bayaty∗, Patrick Caldwell and O. Patrick Kreidl
School of Engineering, University of North Florida, Jacksonville, Florida U.S.A.
Email: {dick.al-bayaty,patrick.caldwell,patrick.kreidl}@unf.edu
Abstract—In many network environments, a node seeks to
expand the number of connections to other nodes who can
make valuable transactions possible. In the possible presence
of misbehaving nodes, who make harmful transactions possible,
each (behaving) node must use discretion on whether to accept a
prospective connection. We study this problem when the mechanism for expansion is an introduction-based reputation protocol
[1]: a prospective connection between two nodes is offered only
by way of introduction via a third party already connected to
both nodes, and each node’s accept/decline decision is driven by a
privately-managed pool of reputations. Our probabilistic analysis
exposes a delicate interplay between the policies for managing
established and prospective connections as well as assumptions on
adversary behavior and information decentralization. Our results
are illustrated by examples.
I. INTRODUCTION
Many aspects of human society involve individuals forming reputations of others via a "word-of-mouth" mechanism, striving to preserve the value of transactions between behaving parties while curbing the harm of transactions with misbehaving parties [2]–[5]. Familiar modern-day Internet-based instantiations of such mechanisms include eBay's "feedback forum," the "Web-of-Trust" browser plug-in and "Angie's List." While such mechanisms may transform their peer-to-peer ratings into a collection of reputations differently, they all depend upon a central trusted authority (i) to manage the pool of reputations across all established connections and (ii) for any prospective connection to inform each party about the other's current reputation. The intent is to use this reputation-based signaling of trustworthiness to discourage interactions with parties that repeatedly misbehave and, in the long run, positively affect the quality-of-service for behaving parties.
In networks for which a central reputation authority becomes too costly or altogether infeasible, a recently proposed
decentralized trust management scheme (described in [1] for
secure Internet packet routing) is a so-called introduction-based approach. Fundamental to such an approach is that every
party maintains full discretion over whether to continue or
close any of its established connections, while a prospective
connection between two parties, or nodes, is offered only by
way of introduction via a third party (i.e., the introducer).
† Work supported by the Air Force Research Laboratory (AFRL) under
contract FA8750-10-C-0178. The views expressed are those of the authors
and do not reflect the official policy or position of the Department of Defense
or the U.S. Government.
∗ Richard Al-Bayaty is now in the Department of Electrical and Computer
Engineering at the University of Florida. Email: [email protected]
Fig. 1. A Prospective Connection in an Introduction-Based Protocol [1]
Each node can accept or decline the introduction (based on its trust
in the introducer), and only if both accept will the prospective
connection be established for the two nodes to transact or
spawn new introductions (see Fig. 1). Note that, depending on
the state of all nodes’ connections, forming a new connection
may require multiple consecutive introductions; moreover, it
is also assumed that the network initializes with every node
having at least one a-priori connection in place.
The utility of each connection with a behaving node is the
sum-reward of (always non-harmful) transactions, but each
connection with a misbehaving node yields negative utility due
to the risk of harmful transaction (and because non-harmful
transactions with misbehaving nodes have zero reward). At
the heart of the problem for each node is that the true
behavior of any other node cannot be known with certainty
but is rather summarized through its reputation. However, as
a consequence of there being no central reputation authority,
every node must manage its own private pool of reputations
to drive the different decisions (e.g., whether to forcibly
close an established connection, whether to accept an offered introduction), including how to evolve those reputations
based on evidence from only its active connections. These
decisions local to each node are prescribed by a so-called
policy, which for the encouraging simulation results reported
in [1] was selected somewhat ad hoc, i.e., a set of heuristic reputation management rules with parameters tuned via a time-consuming simulation-based search. This paper describes a
model-based optimization approach to this policy selection
problem, specifically characterizing for each node
• the continue vs. close decision for any established connection as a variant of the Sequential-Probability-Ratio-Test (SPRT) solution to the Wald problem [6], [7];1 and
• the accept vs. decline decision for any offered introduction as a specific reputation initialization rule on the prospective connection.
1 This result was first presented in [8] and more fully developed in [9].
The primary contribution here is how the above solution components tie together at each node to promote a constructive interplay between the two decision processes, striving to maximize each node's total utility over all connections.
II. MAIN MODELS AND SUMMARY OF RESULTS
In the full context of the introduction-based protocol [1],
the model considered in this paper takes the perspective local
to one node and focuses on the decision processes from when
an introduction is offered over an a-priori connection to when
the resulting connection (if accepted) is closed. The implementation of these two decision processes is illustrated in Fig. 2,
indicating the interplay of the accept vs. decline decision about
a prospective connection with the continue vs. close decisions
about established connections. Subsection II-A and Subsection II-B characterize these two decision processes within
a utility-maximizing probabilistic framework, drawing from
previously-published work [8], [9] for the continue vs. close
decision.2 The common thread in all subsections is upholding
a correspondence between evolving a reputation and revising
the probability of misbehavior conditioned on new evidence,
which is formalized in the following definition (from [9]).
Definition 1 (Remote Node's Reputation): Given a probabilistic model that jointly defines a (hidden) binary-valued state variable X^i, indicating whether remote node i is misbehaving, and a vector Y_n of random variables to be observed across all connections up to and including time period n, the evolving reputation of node i is
\[
R_n^i = \log \frac{p_n^i}{1 - p_n^i}
\quad \Longleftrightarrow \quad
p_n^i = \frac{\exp\left(R_n^i\right)}{1 + \exp\left(R_n^i\right)}
\]
where p_n^i denotes the conditional probability that node i is behaving given the evidence Y_n = y_n realized through time period n, i.e., in probabilistic terms, the evolving reputation is in correspondence with the posterior behaving probability
\[
p_n^i = P\left[\, X^i = 0 \mid Y_n = y_n \,\right]. \tag{1}
\]
Observe that as the behaving probability approaches unity (zero), reputation approaches positive (negative) infinity. (The superscript notation can be suppressed when the remote node in question is clear from context.)
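To make the correspondence in Definition 1 concrete, the following minimal Python sketch converts between the two representations; it is our own illustration (the function names are not from [1] or [9]).

```python
import math

def reputation_to_prob(r: float) -> float:
    """Behaving probability p = exp(R) / (1 + exp(R)), per Definition 1."""
    return 1.0 / (1.0 + math.exp(-r))

def prob_to_reputation(p: float) -> float:
    """Reputation R = log(p / (1 - p)), i.e., the log-odds of behaving."""
    return math.log(p / (1.0 - p))

# Reputation 0 corresponds to maximal uncertainty (p = 0.5); large positive
# (negative) reputations drive p toward 1 (0), as observed in the text.
assert abs(reputation_to_prob(0.0) - 0.5) < 1e-12
assert abs(prob_to_reputation(0.9) - math.log(9.0)) < 1e-12
```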
Fig. 2. The accept vs. decline decision process and its relation to the continue vs. close decision process in an introduction-based reputation protocol [1].
2 Fig. 2 also makes reference to "reputation coupling" and "status feedback," essential parts of the introduction-based protocol [1], but their analysis is outside of this paper's scope and a topic for future work.
A. Continue vs. Close Decision Process
The continue vs. close decision process takes the perspective of one node over the lifetime of a connection to another node; two such instances are illustrated in Fig. 2, one for the a-priori connection to remote node A over which an offered introduction to remote node B is accepted. In the simulations described in [1], along any established connection an (imperfect) detector reports a "yes" or "no" to the question of whether each incoming transaction is a harmful attack and, in turn, the remote node's reputation is decremented by RDEC or incremented by RINC, respectively. The updated reputation Rn (omitting the superscript notation in Definition 1) is then compared to a threshold RTHR to answer the question of whether the remote node is misbehaving, closing or continuing the connection if the answer is "yes" or "no," respectively.
As fully developed in [9], this decision process can be cast as a variant of the well-studied Wald problem [6] with the following model assumptions (see the simulation sketch after this list):
• a misbehaving node attacks in any particular transaction with probability qT (and does not attack with probability 1 − qT), independently of the attack sequence over all previous transactions;
• every transaction with a behaving node (event X = 0) incurs a positive reward v, while with a misbehaving node (event X = 1) every non-attack transaction incurs zero reward and every attack transaction incurs a positive cost c; and
• the detector classifies each transaction as "benign" (event Z = 0) or "harmful" (event Z = 1), misclassifying a non-attack transaction as harmful with false-positive rate qFP and an attack transaction as benign with a false-negative rate qFN, independently of the error sequence over all previous transactions.
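As a concrete reading of these assumptions, the sketch below (our own illustration; the names and the choice of p0 are not from [1] or [9]) samples the reward and detector report for a single transaction.

```python
import random

def simulate_transaction(x, q_T=0.1, q_FP=0.01, q_FN=0.1, v=1.0, c=200.0):
    """Return (reward, detector report z) for one transaction with a remote
    node of hidden type x (0 = behaving, 1 = misbehaving)."""
    if x == 0:
        attack, reward = False, v           # behaving nodes never attack
    else:
        attack = random.random() < q_T      # misbehaving node attacks w.p. q_T
        reward = -c if attack else 0.0      # attacks cost c; non-attacks earn nothing
    if attack:
        z = 0 if random.random() < q_FN else 1   # missed detection w.p. q_FN
    else:
        z = 1 if random.random() < q_FP else 0   # false alarm w.p. q_FP
    return reward, z

# One draw, with the hidden behavior itself sampled from an assumed p0 = 0.5:
x = 0 if random.random() < 0.5 else 1
print(simulate_transaction(x))
```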
Under this probabilistic model, the detector's per-transaction error probabilities are
\[
P[Z = z \mid X = x] =
\begin{cases}
1 - q_{FP}, & \text{if } x = 0 \text{ and } z = 0 \\
q_{FP}, & \text{if } x = 0 \text{ and } z = 1 \\
(1 - q_T)(1 - q_{FP}) + q_T\, q_{FN}, & \text{if } x = 1 \text{ and } z = 0 \\
(1 - q_T)\, q_{FP} + q_T (1 - q_{FN}), & \text{if } x = 1 \text{ and } z = 1
\end{cases}
\tag{2}
\]
and the evidence Y_n through time period n includes the observed sequence (z_1, z_2, ..., z_n). The associated behaving probability in (1) of Definition 1, starting from a given initial value p_0 = P[X = 0], obeys the recursion
\[
p_n =
\begin{cases}
\dfrac{p_{n-1}\,(1 - q_{FP})}{f(p_{n-1}, 0)}, & \text{if } z_n = 0 \\[2ex]
\dfrac{p_{n-1}\, q_{FP}}{f(p_{n-1}, 1)}, & \text{if } z_n = 1
\end{cases}
\qquad n = 1, 2, \ldots
\]
with the denominator in each case following from (2),
\[
f(p, z) = p\, P[Z = z \mid X = 0] + (1 - p)\, P[Z = z \mid X = 1].
\]
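The recursion above is an ordinary Bayes update; rewritten in reputation (log-odds) units it amounts to adding the report's log-likelihood ratio, which is one way to read the closed-form increment RINC and decrement RDEC mentioned next. The sketch below is our own transcription of (2) and the recursion, not the implementation of [1] or [9].

```python
import math

def likelihoods(z, q_T, q_FP, q_FN):
    """Return (P[Z=z|X=0], P[Z=z|X=1]) from the detector model in (2)."""
    if z == 0:
        return 1.0 - q_FP, (1.0 - q_T) * (1.0 - q_FP) + q_T * q_FN
    return q_FP, (1.0 - q_T) * q_FP + q_T * (1.0 - q_FN)

def update_prob(p_prev, z, q_T, q_FP, q_FN):
    """One step of the recursion for the behaving probability p_n."""
    l0, l1 = likelihoods(z, q_T, q_FP, q_FN)
    return p_prev * l0 / (p_prev * l0 + (1.0 - p_prev) * l1)   # divide by f(p, z)

def update_reputation(r_prev, z, q_T, q_FP, q_FN):
    """Equivalent update in reputation units: add the log-likelihood ratio
    (positive for a benign report, negative for a harmful report)."""
    l0, l1 = likelihoods(z, q_T, q_FP, q_FN)
    return r_prev + math.log(l0 / l1)

# The two forms agree: the log-odds of the updated probability equals the
# incremented reputation (up to floating-point error).
p, r, args = 0.6, math.log(0.6 / 0.4), (1, 0.1, 0.01, 0.1)
p_new = update_prob(p, *args)
assert abs(math.log(p_new / (1 - p_new)) - update_reputation(r, *args)) < 1e-9
```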
As discussed in [9], the associated utility-maximizing policy
is a special case of Wald’s Sequential Probability-Ratio-Test
(SPRT) solution combined with Definition 1 to translate the
policy parameters into units of reputation. While the increment
RINC and decrement RDEC are determined in closed-form, the
threshold RTHR is determined by solving a dynamic program,
which requires an iterative computation that also yields the
optimal utility function U∗(p0) (i.e., the infinite-horizon expected total discounted cost, involving a discount parameter δ).
This precise characterization of a single connection’s utility as
a function of initial reputation R0 ⇔ p0 is useful for the other
decision processes within the protocol, as will be exemplified
in the following subsection.
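The dynamic program itself is not spelled out above, so the following value-iteration sketch should be read as our own reconstruction under standard assumptions rather than the exact computation of [9]: the belief p is discretized, the expected one-step reward is taken as p·v − (1 − p)·qT·c from the model assumptions, closing yields zero utility thereafter, and continuing discounts the value of the Bayes-updated belief by δ. Its output approximates U∗(p0) and a probability threshold (equivalently a reputation threshold via Definition 1).

```python
import numpy as np

def likelihoods(z, q_T, q_FP, q_FN):
    if z == 0:
        return 1.0 - q_FP, (1.0 - q_T) * (1.0 - q_FP) + q_T * q_FN
    return q_FP, (1.0 - q_T) * q_FP + q_T * (1.0 - q_FN)

def value_iteration(q_T=0.1, q_FP=0.01, q_FN=0.1, v=1.0, c=200.0,
                    delta=0.995, grid=2001, tol=1e-6, max_iter=50000):
    """Approximate U*(p) and the continue/close threshold on a belief grid."""
    p = np.linspace(0.0, 1.0, grid)
    reward = p * v - (1.0 - p) * q_T * c            # expected one-step reward
    U = np.zeros(grid)
    for _ in range(max_iter):
        cont = reward.copy()
        for z in (0, 1):
            l0, l1 = likelihoods(z, q_T, q_FP, q_FN)
            f = p * l0 + (1.0 - p) * l1             # P[Z = z] under belief p
            p_next = p * l0 / np.maximum(f, 1e-300) # Bayes-updated belief
            cont += delta * f * np.interp(p_next, p, U)
        U_new = np.maximum(0.0, cont)               # close (worth 0) vs. continue
        if np.max(np.abs(U_new - U)) < tol:
            return p, U_new, p[np.argmax(U_new > 0)]
        U = U_new
    return p, U, p[np.argmax(U > 0)]

p_grid, U_star, p_thr = value_iteration()
print("approximate threshold: p_THR = %.3f, R_THR = %.2f"
      % (p_thr, np.log(p_thr / (1 - p_thr))))
```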
We close this subsection by analyzing a specific example:
Fig. 3 plots the utility of a modeled connection as a function
of initial reputation with each colored curve corresponding to
a different policy. The two baseline policies are the (green)
“never-close” policy, showing maximum utility when connected to a highly reputable node but then quickly losing utility
(below the axes shown) as reputation decreases, and the (red)
“always-close” policy, always showing zero utility. Any reasonable policy for closing the connection as a function of the
observed sensor reports should (i) perform no better (worse)
than the never-close (always-close) policy when the remote
node is truly behaving and (ii) perform no better (worse) than
the always-close (never-close) policy when the remote node
is truly misbehaving. Of course, the most interesting policy
comparisons are in the realistic case of significant uncertainty
about whether the remote node is misbehaving (i.e., when R0
is near zero). The (magenta) “close-on-1-alert” policy is a
heuristic that essentially neglects sensor uncertainty; its gap
from the never-close utility for a highly-reputable node is the
impact of the sensor’s nonzero false-positive rate, while its
gap from the always-close utility for a highly-disreputable
node is the impact of the false-negative rate. The (black)
optimal threshold policy, which appropriately accounts for
sensor uncertainty, not only closes the gaps observed in the
heuristic close-on-1-alert policy but also achieves higher utility
than the heuristic policy over all initial reputations.
Fig. 3. A comparative analysis of the optimal continue vs. close policy. Model Parameters: (qT, qFP, qFN, v, c, δ) = (0.1, 0.01, 0.1, 1, 200, 0.995).
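For readers who prefer simulation to the dynamic program, the sketch below estimates the discounted utility of the never-close, close-on-1-alert, and reputation-threshold policies by straightforward Monte Carlo (the always-close baseline is identically zero). It is our own illustration, and the threshold value used is arbitrary rather than the optimal RTHR underlying Fig. 3.

```python
import math, random

Q_T, Q_FP, Q_FN, V, C, DELTA = 0.1, 0.01, 0.1, 1.0, 200.0, 0.995
R_INC = math.log((1 - Q_FP) / ((1 - Q_T) * (1 - Q_FP) + Q_T * Q_FN))
R_DEC = math.log(((1 - Q_T) * Q_FP + Q_T * (1 - Q_FN)) / Q_FP)

def discounted_utility(keep, p0, horizon=2000, trials=500):
    """Monte Carlo estimate of a closing policy's discounted utility from
    initial behaving probability p0; keep(reputation, n_alerts) -> bool."""
    total = 0.0
    for _ in range(trials):
        x = 0 if random.random() < p0 else 1            # hidden behavior
        r, alerts, disc = math.log(p0 / (1 - p0)), 0, 1.0
        for _ in range(horizon):
            if not keep(r, alerts):
                break                                    # closed: zero utility afterward
            attack = (x == 1) and (random.random() < Q_T)
            reward = V if x == 0 else (-C if attack else 0.0)
            z = (0 if random.random() < Q_FN else 1) if attack \
                else (1 if random.random() < Q_FP else 0)
            total += disc * reward
            alerts += z
            r += R_INC if z == 0 else -R_DEC             # reputation update
            disc *= DELTA
    return total / trials

policies = {
    "never-close": lambda r, a: True,
    "close-on-1-alert": lambda r, a: a == 0,
    "threshold (R_THR = 2, illustrative)": lambda r, a: r > 2.0,
}
for name, keep in policies.items():
    print(name, [round(discounted_utility(keep, p0), 1) for p0 in (0.5, 0.9, 0.99)])
```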
B. Accept vs. Decline Decision Process
Having characterized the utility-maximizing continue
vs. close policy of any modeled connection as a threshold
rule on reputation, deciding whether to accept or decline an
offered introduction can be equated with deciding whether to
continue or close the prospective connection, or the connection
that would be established if the offered introduction is in
fact accepted. In other words, upon modeling the prospective
connection and computing its associated SPRT solution, the
offered connection should be accepted only if the initial reputation is above the optimal threshold. Thus, the key additional
degree-of-freedom in this accept vs. decline decision process
is the reputation initialization rule, which ought to reflect the
reputation of the (already-connected) introducer.
Definition 1 is applied to the following probabilistic model
assumptions, stated for the scenario depicted in Fig. 2 in which
an introduction to Node B is offered by Node A:
• every offer from a misbehaving introducer (xA = 1) is to a presumed misbehaving transactor with probability qI (and to a presumed behaving transactor with probability 1 − qI), independently of the transactor types across all previous offers, while every offer from a behaving introducer (xA = 0) is to a presumed behaving transactor;
• every offer is accompanied by the introducer's suggested initial reputation R̄B ⇔ p̄B for the prospective transactor, which is always selected truthfully by a behaving introducer but can be selected arbitrarily by a misbehaving introducer; and
• the current reputation RA ⇔ pA of the (already connected) introducer is locally available.
Applying the law of total probability, we can express the initialization rule by
\[
p_0^B = \bar{p}^B \left[\, p^A + \left(1 - p^A\right)\left(1 - q_I\right) \,\right],
\]
which in terms of reputations via Definition 1 corresponds to
\[
R_0^B = \log \frac{\exp\left(R^A + \bar{R}^B\right) + (1 - q_I) \exp\left(\bar{R}^B\right)}{1 + \exp\left(R^A\right) + q_I \exp\left(\bar{R}^B\right)}. \tag{3}
\]
Clearly this setup couples the reputation of remote node B to
that of its introducer A, but a precise characterization of the
impact of future evidence on both reputations is beyond this
paper’s scope and a topic for future work.
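A direct transcription of initialization rule (3) and the resulting accept vs. decline test follows; this is our own sketch, and the threshold passed in stands for whatever the SPRT solution of the prospective connection prescribes (the value 0 below is purely illustrative).

```python
import math

def initial_reputation(r_A, r_B_bar, q_I):
    """Initialize the introducee's reputation R_0^B from the introducer's
    current reputation R^A and suggested reputation R_bar^B, per (3)."""
    num = math.exp(r_A + r_B_bar) + (1.0 - q_I) * math.exp(r_B_bar)
    den = 1.0 + math.exp(r_A) + q_I * math.exp(r_B_bar)
    return math.log(num / den)

def accept_introduction(r_A, r_B_bar, q_I, r_thr):
    """Accept only if the initialized reputation clears the continue vs. close
    threshold of the prospective connection."""
    return initial_reputation(r_A, r_B_bar, q_I) > r_thr

# A reputable introducee (R_bar^B = 3) offered by a trusted versus a distrusted
# introducer, under aggressive (q_I = 0.8) and patient (q_I = 0.2) adversaries:
print(accept_introduction(r_A=4.0,  r_B_bar=3.0, q_I=0.8, r_thr=0.0))  # True
print(accept_introduction(r_A=-4.0, r_B_bar=3.0, q_I=0.8, r_thr=0.0))  # False
print(accept_introduction(r_A=-4.0, r_B_bar=3.0, q_I=0.2, r_thr=0.0))  # True
```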
Fig. 4. An analysis of the optimal accept vs. decline policy, exhibiting (a) an "accept-by-default" posture assuming a patient misbehaving introducer (qI = 0.2) or (b) a "decline-by-default" posture assuming an aggressive misbehaving introducer (qI = 0.8). Model Parameters: (qT, qFP, qFN, v, c, δ) = (0.1, 0.01, 0.1, 1, 200, 0.995).
Fig. 4 illustrates the above accept vs. decline decision process for the same connection model analyzed in Fig. 3, considering two different values for attack probability qI of the misbehaving introducer. Fig. 4(a) illustrates the case of a patient misbehaving introducer with qI = 0.2, meaning roughly that one out of five introductions is to a presumed misbehaving node, while Fig. 4(b) assumes an aggressive introducer with qI = 0.8, or roughly that four out of five
introductions are to presumed misbehaving nodes. The surface
of each 3-D plot shows the initial reputation R0B for remote
node B given by (3) as a function of the locally-available
introducer’s reputation RA and the supplied reputation R̄B for
the introducee. Also shown on each plot is the optimal accept
vs. decline threshold RTHR in the prospective connection,
which when compared against the surface defines a contour
that partitions the 2-D space of reputation pairs (RA , R̄B ) into
an accept region and a decline region. We see that when the
introducer is very trustworthy, the offer is accepted or declined
primarily based on the supplied introducee’s reputation; moreover, when the supplied reputation for the introducee is poor,
the offer is declined regardless of the introducer’s reputation.
However, in the case of a distrusted introducer presenting
a reputable introducee, the decision to decline or accept is
determined by the non-trivial interplay between the utility
of the prospective connection (as reflected in the threshold
RTHR ) and the anticipated aggressiveness (or lack thereof) of
the misbehaving introducer (as reflected by probability qI ).
Specifically, when misbehaving introducers are taken to be
patient as in Fig. 4(a), distrusted introducers are given the
benefit of the doubt and their highly-reputable offers are accepted. In contrast, when misbehaving introducers are taken to
be relatively aggressive as in Fig. 4(b), even highly-reputable
offers from distrusted introducers are declined. That is, the
accept vs. decline decision becomes sensitive to probability qI
only when the introducer becomes distrusted: in the "accept-by-default" cases, the penalty associated with occasionally
accepting connections to misbehaving nodes is worth the
utility of accepting the more common connections to behaving
nodes, while the opposite holds in the “decline-by-default”
cases, i.e., the lost utility associated with occasionally declining
connections to behaving nodes is worth the avoided penalty
of declining the more common connections to misbehaving
nodes. Altogether, our solution well captures the conditions
under which the accept vs. decline policy will maximize total
utility, including the rather subtle consideration of whether to
adopt an “accept-by-default” posture or a “decline-by-default”
posture against reputable offers from a distrusted introducer.
III. CONCLUSION
Two core decision processes of a recently proposed
introduction-based reputation protocol [1], aiming to retain
the attractive properties of trust systems but without the
assumption of a centralized reputation server, have been modeled within a utility-maximizing probabilistic framework. It
is shown that the decision to accept introductions to new
connections is entwined with the decision process by which
established connections are managed, where the coupling
between the reputations of both introducer and introducee
amounts to maintaining probabilistic consistency across both
nodes’ hidden variables conditioned on evidence from both
connections. This precise characterization alleviates some of
the tax associated with the simulation-based policy optimization approach employed in [1]. Ongoing work includes extension of the model presented here to consider multiple levels
of introductions or multiple introductions by one introducer,
in which case reputation management may be addressed by
inference techniques over Bayesian networks.
An analogous mathematical characterization of the numerous other decision processes among the different roles
within the protocol would be similarly helpful. These include
decisions by introducees on how feedback to the introducer is
generated, decisions by introducers on how such feedback is
interpreted as well as decisions by introducers on whether to
offer a requested introduction in the first place. Understanding
the interplay of all of these different processes with the policy
components developed so far is an open problem. Also of
interest is the impact of richer adversary models, including
strategic adversaries as well as the possibility of collusion
among multiple nodes.
ACKNOWLEDGEMENT
The authors are grateful to Dr. Gregory L. Frazier and Professor Michael P. Wellman for numerous helpful discussions.
REFERENCES
[1] G. Frazier, et al., "Incentivising responsible networking via introduction-based routing," in Proc. 4th Int. Conf. on Trust and Trustworthy Computing, Springer-Verlag, 2011, pp. 277–293.
[2] A. Josang, R. Ismail, and C. Boyd, “A survey of trust and reputation
systems for online service provision,” Decision Support Systems, vol. 43,
no. 2, pp. 618–644, 2007.
[3] Y. Yang, et al., "Defending online reputation systems against collaborative unfair raters through signal modeling and trust," in Proc. 2009 ACM Symposium on Applied Computing, 2009, pp. 1308–1315.
[4] K. Hoffman, D. Zage, and C. Nita-Rotaru, “A survey of attack and defense
techniques for reputation systems,” ACM Comput. Surv., vol. 42, no. 1,
pp. 1:1–1:31, Dec. 2009.
[5] Y. Sun and Y. Liu, “Security of online reputation systems: The evolution
of attacks and defenses,” IEEE Signal Processing Magazine, vol. 29,
no. 2, pp. 87–97, 2012.
[6] A. Wald, Sequential Analysis. New York, NY: Wiley and Sons, 1947.
[7] D. P. Bertsekas, Dynamic Programming and Optimal Control (Vols. 1 and
2). Belmont, MA: Athena Scientific, 1995.
[8] O. P. Kreidl and G. L. Frazier, “A stochastic model for reputation
management in introduction-based trust systems,” in 10th Int. Conf. on
Stochastic Networks, June 2012.
[9] R. Al-Bayaty and O. P. Kreidl, "On optimal decisions in an introduction-based reputation protocol," in Proc. 38th Int. Conf. on Acoustics, Speech and Signal Processing, May 2013.