Co-evolution of network
structure and content
Lada Adamic
School of Information & Center for the Study of Complex Systems
University of Michigan
Outline
Co-evolution of network structure and content
Can the structure of Twitter and virtual world interactions
reveal something about their content?
http://arxiv.org/abs/1107.5543
Can the structure of a commodity futures trading network
reveal something about information flowing into the market?
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1361184
What is the relationship between network
structure and information diffusion?
Is information flowing over the
network?
Or is information shaping the
network?
Can the shape of the network reveal
properties of information?
Big news! Giant microbes!
Can the shape of the network reveal
properties of information?
Little news. How’s the weather?
Related work on time evolving graphs
Densification over time (Leskovec et al. 2005)
Community structure over time (Leicht et al. 2007, Mucha et al.
2010)
Change in structure (ability to “compress” network) signals
events (Graphscope by Sun et al. 2007)
Disease propagation & timing (Moody 2002, Liljeros 2010)
Enron email (B. Aven, 2011)
What’s different here
We look at network dynamics at relatively short time
scales and construct time series
A range of network metrics, instead of just community
structure
Information novelty and diversity as opposed to tracking
single events / pieces of information
Can the network reveal…
If everyone is talking about the same thing, or if there is just
background chatter?
If what they are talking about is novel?
First context: virtual worlds
Networks: asset transfers (gestures, landmarks) and
transactions (e.g. rent, object purchases)
Content: assets being transferred
Study transfers in the context of 100
groups with highest numbers of
transfers
Second context: Twitter Network
microblogging: < 140 characters / tweet
Network links read from tweets
Reply or mention: by putting the @ in front of
the username
Retweet: repeat something someone else wrote on Twitter,
preceded by the letters RT and @ in front of their username
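The link-reading rules above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in the study: the regular expressions, the function name `tweet_edges`, and the edge labels are assumptions made for the example.

```python
import re

# Assumed patterns: a username is a run of letters, digits, or underscores after "@".
MENTION = re.compile(r"@(\w+)")
RETWEET = re.compile(r"\bRT\s+@(\w+)")

def tweet_edges(author, text):
    """Return (source, target, kind) edges implied by one tweet."""
    retweeted = set(RETWEET.findall(text))
    edges = []
    # dict.fromkeys removes duplicate usernames while keeping their order
    for user in dict.fromkeys(MENTION.findall(text)):
        kind = "retweet" if user in retweeted else "mention"
        edges.append((author, user, kind))
    return edges

print(tweet_edges("alice", "RT @bob: great talk by @carol"))
# [('alice', 'bob', 'retweet'), ('alice', 'carol', 'mention')]
```

Replies and mentions are not distinguished here, since both are written with the same @username form.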
Selecting Twitter communities to track
http://wefollow.com/twitter/researcher
For each “researcher” gather tweets of accounts they
follow
Highly dynamic networks
[Figure: percentage of repeated edges (y-axis, 0.00 to 0.25) versus # of segments elapsed (x-axis, 1 to 8), for SecondLife and Twitter]
Segmentation:
Twitter: every 800 tweets, median segment duration 1.5 days
SecondLife: every 50 asset transfers, median segment duration 8.4 days
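A rough sketch of the segmentation and repeated-edge measurement, under assumptions not stated on the slide: events are time-ordered (source, target) pairs, a trailing partial segment is dropped, and the repeated fraction is taken relative to the edges of the earlier segment. The function names are hypothetical.

```python
def segment(events, size):
    """Split a time-ordered list of events into consecutive windows of `size`.

    A trailing window with fewer than `size` events is dropped (an assumption).
    """
    return [events[i:i + size] for i in range(0, len(events) - size + 1, size)]

def repeated_edge_fraction(segments, lag):
    """Average fraction of edges in segment t that reappear in segment t + lag."""
    fractions = []
    for t in range(len(segments) - lag):
        now, later = set(segments[t]), set(segments[t + lag])
        if now:
            fractions.append(len(now & later) / len(now))
    return sum(fractions) / len(fractions) if fractions else 0.0

events = [("a", "b"), ("b", "c"), ("a", "b"), ("c", "d"), ("a", "b"), ("b", "c")]
segments = segment(events, 2)
print(repeated_edge_fraction(segments, 1))  # 0.5
print(repeated_edge_fraction(segments, 2))  # 1.0
```

With the slide's settings, `size` would be 800 for tweets and 50 for asset transfers.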
Conductance:
capturing potential for information flow
[Figure: three example networks connecting nodes A and B, with low, medium, and high conductance]
\[ C_{ij} = \sum_{\text{paths } i \to j} \ \prod_{\text{edges } (k,l) \text{ on path}} \frac{w_{kl}}{\deg(k)} \]
Temporal conductance (summed over all pairs):
High if pairs of nodes share edges, or many short,
indirect paths
Koren, North, Volinsky, KDD, 2006
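A minimal sketch of the path-sum conductance, under stated assumptions: the graph is undirected and stored as a dict of dicts of edge weights, deg(k) is the sum of the weights at k, only simple (cycle-free) paths are counted, and paths are capped at `max_len` edges to keep the enumeration small. The cap and the function names are illustrative choices, not taken from the slides.

```python
from itertools import combinations

def path_conductance(adj, i, j, max_len=4):
    """Sum over simple paths i -> j of the product of w_kl / deg(k) along the path."""
    deg = {k: sum(nbrs.values()) for k, nbrs in adj.items()}

    def walk(node, visited, weight, depth):
        if node == j:
            return weight
        if depth == max_len:
            return 0.0
        total = 0.0
        for nxt, w in adj[node].items():
            if nxt not in visited:
                total += walk(nxt, visited | {nxt}, weight * w / deg[node], depth + 1)
        return total

    return walk(i, {i}, 1.0, 0)

def total_conductance(adj, max_len=4):
    """Conductance summed over all unordered node pairs."""
    return sum(path_conductance(adj, i, j, max_len)
               for i, j in combinations(sorted(adj), 2))

triangle = {"a": {"b": 1, "c": 1}, "b": {"a": 1, "c": 1}, "c": {"a": 1, "b": 1}}
print(path_conductance(triangle, "a", "b"))  # 0.75 (direct edge 0.5 + two-step path 0.25)
print(total_conductance(triangle))           # 2.25
```

The triangle shows the slide's point: a pair scores higher when it has both a direct edge and a short indirect path.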
Network expectedness
Define expectedness:
Average conductance of all neighbor pairs at time t,
based on conductance of pair at time t-1
\[ X_t = \frac{1}{E_t} \sum_{\text{edges } (i,j)} C_{ij,\,t-1} \]
[Figure: examples of expected and unexpected edges]
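The definition can be sketched as follows, assuming that E_t is the number of edges at time t, that pair conductances at time t-1 have already been computed and stored in a dict keyed by unordered pair, and that pairs with no recorded conductance count as 0. The function name and data layout are hypothetical.

```python
def expectedness(edges_t, conductance_prev):
    """Average, over edges present at time t, of the conductance that the
    same node pair had at time t-1 (0.0 if the pair had none)."""
    if not edges_t:
        return 0.0
    total = sum(conductance_prev.get(frozenset(edge), 0.0) for edge in edges_t)
    return total / len(edges_t)

# Hypothetical conductance values from the previous time step.
conductance_prev = {frozenset(("a", "b")): 1.0, frozenset(("a", "c")): 0.5}

print(expectedness([("a", "b"), ("c", "a")], conductance_prev))  # 0.75
print(expectedness([("a", "b"), ("b", "d")], conductance_prev))  # 0.5
```

An edge between nodes that were already well connected raises the score; an edge to a previously unreachable node lowers it.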
Conductance and expectedness as a toy network evolves
[Figure: a toy network at t=0 and three possible configurations at t=1]
network configuration at t=0:
conductance = 4
possible configurations at t=1:
conductance = 4, expectedness = 1.5, edge jaccard = 1
conductance = 4.5, expectedness = 1.3333, edge jaccard = 0.6667
conductance = 6, expectedness = 0.5, edge jaccard = 0.25
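The slides report an edge Jaccard value for each configuration without defining it. Assuming it is the usual Jaccard index of the two edge sets (shared edges divided by all edges seen in either snapshot), a sketch looks like this; the function name and the toy edge lists are hypothetical and are not the networks drawn on the slide.

```python
def edge_jaccard(edges_a, edges_b):
    """Jaccard index of two undirected edge sets: |A & B| / |A | B|."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    return len(a & b) / len(union) if union else 1.0

before = [("a", "b"), ("b", "c")]
after = [("b", "a"), ("b", "c"), ("c", "d")]
print(edge_jaccard(before, before))  # 1.0
print(edge_jaccard(before, after))   # 2 shared edges out of 3 in total
```

Using `frozenset` makes (a, b) and (b, a) count as the same undirected edge.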
SecondLife: network structure and
content
standard network metrics are not indicative of information properties;
conductance and expectedness are
[Figure: network metrics versus overlap (t-1,t), overlap (t,t+1), Δ diversity (t-1,t), and Δ diversity (t,t+1)]
Conductance & diversity of
information
High conductance brings higher
content diversity
Repeat network patterns bring less
diversity and less novelty
but… similarity and novelty are
positively correlated (r = 0.19)
Social and transaction
network of top sellers in
SL
Twitter: textual diversity and novelty
Semantic metrics
Metric Type: Computation Methods
Contemporary Metrics (average cosine similarity of words in Tweets):
  between connected node pairs in the graph
  between indirectly-connected node pairs, i.e., non-neighbors with an
  undirected path of length > 1 between them
  between isolated pairs (in different components)
Novelty Metric (Language Model distance):
  between two sets of tweets associated with Twitter networks
  captured at different times
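Both kinds of metric can be illustrated with word counts. This is a sketch under assumptions: tokens are lowercased whitespace-separated words, and the "Language Model distance" is stood in for by the KL divergence between add-alpha smoothed unigram models. The slides do not specify the language model or the distance, so `lm_distance` is only one plausible choice.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def lm_distance(tweets_old, tweets_new, alpha=1.0):
    """KL divergence of the newer unigram model from the older one (assumed metric)."""
    old = Counter(w for t in tweets_old for w in t.lower().split())
    new = Counter(w for t in tweets_new for w in t.lower().split())
    vocab = old.keys() | new.keys()
    n_old = sum(old.values()) + alpha * len(vocab)
    n_new = sum(new.values()) + alpha * len(vocab)
    dist = 0.0
    for w in vocab:
        p_new = (new[w] + alpha) / n_new
        p_old = (old[w] + alpha) / n_old
        dist += p_new * math.log(p_new / p_old)
    return dist

print(cosine_similarity("giant microbes found", "giant microbes news"))
print(lm_distance(["how is the weather"], ["giant microbes found"]))
```

Averaging `cosine_similarity` over connected, indirectly-connected, or isolated node pairs would give the three contemporary metrics listed above.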
Twitter: network structure and
information diversity
[Figure: correlations between network structure (rows) and content similarity (columns)]
network structure     all-pairs    unconnected  indirectly-connected  connected
# nodes(T)            -0.584 !!!   -0.632       0.305 !!!             0.030 !!!
# edges(T)            -0.537 !!!   -0.601       0.348 !!!             0.058 !!!
reciprocity(T)        -0.160 !     -0.179 !     0.176 !!              0.128 !
clustering coef.(T)   -0.198 !!    -0.240       0.181 !!              0.030 !!!
centralization(T)     -0.121 !     -0.176       0.158 !!              0.062 !!
edge deg cor.(T)      0.027        -0.155 !!    0.113                 0.054 !!
av. degree(T)         -0.287 !!!   -0.353       0.323 !!!             0.093 !!!
sd. degree(T)         -0.212 !!    -0.277       0.251 !!!             0.048 !!
WCC size(T)           0.317 !!!    0.303        -0.126 !!             0.038 !!!
conductance(T)        -0.444 !!!   -0.506       0.369 !!!             0.121 !!!
expectedness(T)       -0.145 !!    -0.161 !     0.234 !!!             0.092 !!!
Inferring Network Semantic
Information
Question: Does network structural information improve the
prediction of the characteristics of the information
exchanged?
[Diagram: semantic variables and topological variables are inputs to a kernel regression prediction model, whose output is semantic variables]
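The slides name kernel regression but give no further detail, so the following is only a sketch of one common form, the Nadaraya-Watson estimator with a Gaussian kernel, together with the R² score used to compare variable sets. The bandwidth, the function names, and the toy numbers are assumptions.

```python
import math

def kernel_regression(train_x, train_y, query, bandwidth=1.0):
    """Nadaraya-Watson estimate: Gaussian-kernel-weighted mean of train_y."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:  # query is far from every training point
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Toy data: each x is a tuple of predictor variables for one segment.
train_x = [(0.0,), (1.0,), (2.0,)]
train_y = [0.0, 1.0, 2.0]
print(kernel_regression(train_x, train_y, (1.0,)))  # close to 1.0 by symmetry
```

Adding a variable means lengthening each tuple in `train_x`; comparing `r_squared` before and after shows whether the extra variable helps.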
Example: Inferring the average
similarity score between isolated pairs
[Figure: R² in predicting the average similarity score (ASS) between isolated nodes (y-axis), as variables are added along the x-axis: connected, indirectly-connected, # nodes, # edges, reciprocity, clustering coef., centralization, edge deg. cor., av. degree, sd. degree, WCC size, conductance, expectedness]
c1: X1 = {connected}
c2: X2 = {indirectly-connected}
c3: X3 = {# nodes}
c4: X4 = {# edges}
The input variables of curve ci start from Xi
and increase each time by adding the
variable labeled on the x-axis.
Other textual variables (e.g. similarity between
indirectly connected pairs) are not needed when
sufficient topological information is available
Reason: topological variables account for
much of the pattern in the text!
Network structure and information
novelty
Greater novelty in edges corresponds to
greater novelty in content shared
For nodes that are interacting (citing
or being cited):
Higher conductance and expectedness
correlate with less information novelty
[Figure: correlations between network structure (rows) and language model distance (columns)]
network structure        LMdist_allNodes(T-1,T)  LMdist_NodesWithNeighbors(T-1,T)
# nodes(T-1,T)           0.124 !                 -0.050 !!
# edges(T-1,T)           0.171 !                 -0.117 !!!
reciprocity(T-1,T)       0.042                   -0.004
clustering coef.(T-1,T)  0.149 !                 -0.197 !!!
centralization(T-1,T)    -0.018                  0.038
edge deg cor.(T-1,T)     -0.111 !!               0.101 !
av. degree(T-1,T)        0.066                   -0.044 !
sd. degree(T-1,T)        0.083                   -0.119 !!
WCC size(T-1,T)          0.085                   -0.101 !
edge jaccard(T-1,T)      -0.233 !!               -0.230 !!!
conductance(T-1,T)       0.202 !                 -0.225 !!!
expectedness(T-1)        0.171 !!                -0.143 !!
expectedness(T)          0.093 !                 -0.273 !!!
Information in trading networks
CFTC = Commodity Futures Trading Commission
stated mission: protect market users and the public from
fraud, manipulation, and abusive practices
futures contracts started out as contracts for agricultural
products, but expanded to more exotic contracts,
including index futures
Collaboration with Celso Brunetti, Jeff Harris, and Andrei
Kirilenko
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=
Data
6.3 million transactions in Aug. 2008 in the Sept. E-mini S&P
futures contract
price discovery for the index occurs mostly in this contract
(Hasbrouck (2003))
data includes: date & time, executing broker, opposite broker,
buy or sell, price, quantity
sample in transaction windows of 240 transactions
[Diagram: a network edge between executing broker and opposite broker, labeled quantity: 10, price: $171.25]
matching algorithm
[Figure: limit order book with sell orders of 10 contracts at $171.25, 5 contracts at $171.75, and 20 contracts at $172.00, and buy orders of 20 and 30 contracts at $171.25 and $171.50 and of 50 contracts at $171.00]
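How a matching algorithm turns orders into broker-to-broker transactions can be sketched as below. The slides do not describe the exchange's actual rules, so this assumes simple price-time priority and execution at the resting order's price; the broker labels and the function name are hypothetical.

```python
import heapq

def match_buy(asks, buy_qty, limit_price):
    """Match an incoming buy limit order against resting sell orders.

    `asks` is a heap of (price, arrival_seq, quantity, broker) tuples, so the
    lowest price (then the earliest arrival) is matched first.
    Returns (trades, unfilled_quantity); each trade is (broker, price, quantity).
    """
    trades = []
    while buy_qty > 0 and asks and asks[0][0] <= limit_price:
        price, seq, qty, broker = heapq.heappop(asks)
        filled = min(qty, buy_qty)
        trades.append((broker, price, filled))
        buy_qty -= filled
        if qty > filled:  # put the unfilled remainder back in the book
            heapq.heappush(asks, (price, seq, qty - filled, broker))
    return trades, buy_qty

# Sell side of the book from the slide; broker labels are made up.
asks = [(171.25, 1, 10, "S1"), (171.75, 2, 5, "S2"), (172.00, 3, 20, "S3")]
heapq.heapify(asks)
print(match_buy(asks, 30, 171.50))  # ([('S1', 171.25, 10)], 20)
```

Each trade would become one edge in the trading network, linking the two brokers whose orders were paired.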
not social, not intentional, not
persistent
Financial variables
Rate of return:
Last price to first price in logs (close-to-open)
Volatility:
Range – log difference between max and min price
Duration:
Total period duration - time in seconds between the start
and end of each sampling period
Proxy for arrival of new information
Volume:
Trading volume – number of contracts traded
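The four variables can be computed per sampling window as sketched here. The input layout (time-ordered tuples of timestamp in seconds, price, quantity) and the function name are assumptions made for the example.

```python
import math

def financial_variables(trades):
    """Compute the per-window variables from a time-ordered list of
    (timestamp_seconds, price, quantity) tuples."""
    prices = [price for _, price, _ in trades]
    return {
        # log of last price minus log of first price
        "return": math.log(prices[-1]) - math.log(prices[0]),
        # range: log difference between max and min price
        "volatility": math.log(max(prices)) - math.log(min(prices)),
        # seconds between the first and last trade of the window
        "duration": trades[-1][0] - trades[0][0],
        # number of contracts traded
        "volume": sum(qty for _, _, qty in trades),
    }

window = [(0, 100.0, 10), (5, 110.0, 5), (12, 100.0, 20)]
print(financial_variables(window))
```

In the study's setting each window would hold 240 transactions, so a short duration means those transactions arrived quickly.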
What can we learn from network
structure?
e.g. centralization?
[Figure: example networks with low in-centralization and high in-centralization, with nodes marked as low or high indegree and low or high outdegree]
overview of network variables
# nodes, # edges
clustering coefficient, LSCC, reciprocity
CEN = gini(in-degree) - gini(out-degree)
INOUT = r(indegree of node, outdegree of same
node)
AI (asymmetric information)
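CEN and INOUT can be sketched from an edge list as follows. The assumptions here are that edges point from seller to buyer, that degrees are unweighted counts, and that r is the Pearson correlation; the slides do not confirm any of these, and the function names are hypothetical. AI is not covered because the slides do not define it.

```python
import math

def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def cen_and_inout(edges):
    """Return (CEN, INOUT) for a list of directed (seller, buyer) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    indeg = {node: 0 for node in nodes}
    outdeg = {node: 0 for node in nodes}
    for seller, buyer in edges:
        outdeg[seller] += 1
        indeg[buyer] += 1
    ins = [indeg[node] for node in nodes]
    outs = [outdeg[node] for node in nodes]
    return gini(ins) - gini(outs), pearson_r(ins, outs)

# One dominant buyer purchasing from three small sellers.
cen, inout = cen_and_inout([("s1", "b"), ("s2", "b"), ("s3", "b")])
print(cen, inout)
```

In this toy case in-degree is far more concentrated than out-degree, so CEN is positive, and no node both buys and sells, so INOUT is negative.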
Correlations between network
and financial variables
High Centralization: market dominance - a dominant trader buys
from many small sellers – low duration, low volume
Correlations between network
and financial variables
Negative assortativity: large sellers sell to small buyers and vice
versa
– low duration, higher volume
Correlations between network
and financial variables
High av. degree & largest strongly connected component:
no news - many buyers and sellers – high duration, high volume
Correlations between network
and financial variables
Rate of return:
positive correlation with centralization
Volatility & duration:
correlated with standard deviation of degree, average
deg. and the total number of edges (E).
Volume:
Correlated with a few network variables, sign varies.
Conclusion
Network structure alone reveals the diversity and
novelty of the information content being transmitted
Results depend on the scope and relative position of the
activity in the network
Future work
Sensitivity to inclusion of non-interactive or across-community
interactions
Applying novelty & conductance metrics to financial time series
Continuous formulation of novelty and other network metrics
(because segmentation is problematic)
Roles of individual nodes
Thanks:
Edwin Teng
Liuling Gong
Avishay Livne
Information network academic research center
Questions?