Mathematical Foundations of Social Network Analysis

Steve Borgatti
Copyright © 2006 Steve Borgatti
Three Representations
• Graphs
• Matrices
• Relations
Graphs
• Networks represented mathematically as graphs
• A graph G(V,E) consists of …
– Set of nodes (or vertices) V representing actors
– Set of lines (or edges) E representing ties
• An edge is an unordered pair of nodes (u,v)
• Nodes u and v are adjacent if (u,v) ∈ E
• So E is a subset of the set of all pairs of nodes
• Typically drawn without arrow heads
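As a minimal sketch of the definition above, an undirected graph can be stored as a set of unordered pairs. The node names and the helper `adjacent` are illustrative assumptions, not part of the slides.

```python
# Nodes V and edges E of a small undirected graph (hypothetical example data).
V = {"Bob", "Biff", "Betsy"}
# frozenset makes each pair unordered: {u, v} is the same edge as {v, u}.
E = {frozenset(("Bob", "Biff")), frozenset(("Biff", "Betsy"))}

def adjacent(u, v, edges):
    """Return True if nodes u and v are joined by an edge."""
    return frozenset((u, v)) in edges

print(adjacent("Bob", "Biff", E))   # order of u and v does not matter
print(adjacent("Biff", "Bob", E))
print(adjacent("Bob", "Betsy", E))  # no direct tie in this example
```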
Digraphs
• Digraph D(V,E) consists of …
– Set of nodes V
– Set of directed arcs E
• An arc is an ordered pair of nodes (u,v)
• (u,v) ∈ E indicates u sends an arc to v
• (u,v) ∈ E does not imply that (v,u) ∈ E
• Ties drawn with arrow heads, which can be in both directions
[Figure: example digraph with nodes Bonnie, Bob, Biff, Betsy, Betty]
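A rough sketch of the digraph definition: arcs are ordered pairs, so (u,v) and (v,u) are stored separately. The arcs listed here and the helper names `sends_arc` and `reciprocated` are hypothetical; they are not read from the slide's figure.

```python
# Directed arcs as ordered pairs (tuples). Example data is made up.
V = {"Bob", "Biff", "Betsy"}
E = {("Bob", "Biff"), ("Biff", "Bob"), ("Biff", "Betsy")}

def sends_arc(u, v, arcs):
    """True if u sends an arc to v, i.e. (u, v) is in E."""
    return (u, v) in arcs

def reciprocated(u, v, arcs):
    """True if arcs run in both directions between u and v."""
    return (u, v) in arcs and (v, u) in arcs

print(sends_arc("Biff", "Betsy", E))     # arc Biff -> Betsy exists
print(sends_arc("Betsy", "Biff", E))     # the reverse arc does not
print(reciprocated("Bob", "Biff", E))    # drawn with arrow heads both ways
```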
Directed vs undirected graphs
• Undirected relations
– Attended meeting with
– Communicates daily with
• Directed relations
– Lent money to
• Logically vs empirically directed ties
– Empirically, even undirected relations can be non-symmetric due to measurement error
[Figure: example network with nodes Bonnie, Bob, Biff, Betty, Betsy]
Strength of Tie
• We can attach values to ties, representing quantitative attributes
– Strength of relationship
– Information capacity of tie
– Rates of flow or traffic across tie
– Distances between nodes
– Probabilities of passing on information
– Frequency of interaction
• Valued graphs or vigraphs
[Figure: valued graph with nodes Jane, Bob, Betsy and tie values 6, 3, 1, 8]
Adjacency Matrices
Friendship
      Jim  Jill  Jen  Joe
Jim    -    1    0    1
Jill   1    -    1    0
Jen    0    1    -    1
Joe    1    0    1    -

Proximity
      Jim  Jill  Jen  Joe
Jim    -    3    9    2
Jill   3    -    1   15
Jen    9    1    -    3
Joe    2   15    3    -
[Figure: the proximity values drawn as a valued graph on Jim, Jill, Jen, Joe]
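To illustrate how a list of ties becomes an adjacency matrix, here is a small sketch using the Friendship ties as read from the slide. Storing the diagonal as 0 (the slide shows "-") and the helper `is_symmetric` are assumptions made for the example.

```python
# Names and ties as read from the Friendship matrix (1 = tie present).
names = ["Jim", "Jill", "Jen", "Joe"]
ties = [("Jim", "Jill"), ("Jim", "Joe"), ("Jill", "Jen"), ("Jen", "Joe")]

index = {name: i for i, name in enumerate(names)}
n = len(names)

# Assumption: the diagonal, shown as "-" on the slide, is stored as 0.
A = [[0] * n for _ in range(n)]
for u, v in ties:
    A[index[u]][index[v]] = 1
    A[index[v]][index[u]] = 1  # undirected tie: fill both cells

def is_symmetric(M):
    """An undirected relation gives a matrix equal to its transpose."""
    size = len(M)
    return all(M[i][j] == M[j][i] for i in range(size) for j in range(size))

for name, row in zip(names, A):
    print(name, row)
print("symmetric:", is_symmetric(A))
```

A valued relation such as Proximity can be stored the same way by writing the tie value instead of 1.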
Walks, Trails, Paths
• Path: can’t repeat node
– 1-2-3-4-5-6-7-8
– Not 7-1-2-3-7-4
• Trail: can’t repeat line
– 1-2-3-1-7-8
– Not 7-1-2-7-1-4
• Walk: unrestricted
– 1-2-3-1-2-7-1-7-1
[Figure: example network with nodes numbered 1 to 12]
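The three definitions can be checked mechanically. The sketch below labels a node sequence with the most restrictive type it satisfies; it assumes ties are undirected and that every consecutive pair in the sequence really is a tie in the graph (the figure's tie list is not recoverable here). The function names `classify` and `parse` are illustrative.

```python
def classify(sequence):
    """Return 'path', 'trail', or 'walk' for a sequence of nodes.

    Assumes each consecutive pair of nodes is a tie in the graph.
    """
    nodes = list(sequence)
    # Each step is an unordered pair, so 1-2 and 2-1 count as the same line.
    lines = [frozenset(pair) for pair in zip(nodes, nodes[1:])]
    if len(set(nodes)) == len(nodes):
        return "path"    # no node repeated (so no line repeated either)
    if len(set(lines)) == len(lines):
        return "trail"   # some node repeated, but no line repeated
    return "walk"        # at least one line repeated

def parse(text):
    """Turn '1-2-3' into [1, 2, 3]."""
    return [int(x) for x in text.split("-")]

for example in ["1-2-3-4-5-6-7-8", "7-1-2-3-7-4", "1-2-3-1-7-8",
                "7-1-2-7-1-4", "1-2-3-1-2-7-1-7-1"]:
    print(example, "->", classify(parse(example)))
```

Every path is also a trail and every trail is also a walk, which is why the function reports only the strictest label.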
Things that move thru networks
• Used goods
• Money
• Packages
• Personnel
• Orders
• Innovations
• Practices
• Gossip / information
• E-mail
• Infections
• Attitudes
• Influence?
Borgatti, S.P. 2005. Centrality and network flow. Social Networks. 27(1): 55-71.
Used Goods Process
• Canonical example:
– passing along used paperback novel
• Single object in only one place at a time
• Doesn’t (usually) travel between same pair twice
• Could be received by the same person twice
• A--B--C--B--D--E--B--F--C ...
• Travels along graph-theoretic trails
Package Delivery Process
• Example:
– package delivered by postal service
• Single object at only one place at one time
• Map of network enables the intelligent object to select only the shortest paths to all destinations
– Travels along shortest paths (geodesics)
Mooch Process
• Examples
– Obnoxious homeless relative who visits for six months until kicked out, then moves on to the next relative
– Personnel flows between firms
• In just one place at a time
• Doesn’t repeat a node (bridges burned)
– Travels along paths
Monetary Exchange Process
• Canonical example:
– specific dollar bill moving through the
economy
• Single object in only one place at a time
• Can travel between same pair more than once
• A--B--C--B--C--D--E--B--C--B--C ...
Gossip Process
• Example:
– juicy story moving through informal network
• Multiple copies exist simultaneously
• Person tells only one person at a time*
• Doesn’t travel between same pair twice
• Can reach same person multiple times
* More generally, they tell a very limited number at a time.
E-Mail Process
• Example:
– forwarded jokes and virus warnings
– e-mail viruses themselves
• Multiple copies exist simultaneously
• All (or many) connected nodes told simultaneously
– Except, perhaps, the immediate source
Influence Process
• Example:
– attitude formation
• Multiple “copies” exist simultaneously
• Multiple simultaneous transmission, even between the same pairs of nodes
Viral Infection Process
• Example:
– virus which activates effective immunological
response or which kills host
• Multiple copies may exist simultaneously
• Cannot revisit a node
• A--B--C--E--D--F...
• Travels along graph-theoretic paths
Two Properties of Flow Processes
Traversal Dimension of Flow
• Sequence type: path, trail, walk
– path: can’t revisit node nor edge (tie)
– trail: can revisit node but not edges
– walk: can revisit edges & nodes
• Deterministic vs non-deterministic
– blind vs guided
– always chooses best route; aware of map
• Combine into 4-way “traversal type” property:
– geodesics, paths, trails, walks
Flow Method Property
• Duplication vs transfer (copy vs move)
– transfer/move: only one place at one time
– duplication/copy: multiple copies exist
• Serial vs parallel duplication
– serial: only one transmission at a time
– parallel: broadcast to all surrounding nodes
• Combine into “method” 3-way property:
– parallel dup., serial dup., transfer
Simplified Typology
             parallel duplication    serial duplication     transfer
geodesics    <no process>            mitotic reproduction   package delivery
paths        internet nameserver     viral infection        mooch
trails       e-mail broadcast        gossip                 used goods
walks        attitude influencing    emotional support      money exchange
[Other labels on the slide: goods, information, Markov]
*Note: Names not to be taken too seriously.
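One way to make the typology usable is as a lookup table keyed by traversal type and flow method. The sketch below is an illustration only: the cell placement follows the typology as reconstructed from this slide and the process descriptions on the earlier slides, and the names `TYPOLOGY`, `process_for`, and `classify_process` are hypothetical.

```python
# Columns of the typology, in the order used for each row below.
METHODS = ("parallel duplication", "serial duplication", "transfer")

# Rows: traversal type -> example process for each method.
TYPOLOGY = {
    "geodesics": ("<no process>", "mitotic reproduction", "package delivery"),
    "paths": ("internet nameserver", "viral infection", "mooch"),
    "trails": ("e-mail broadcast", "gossip", "used goods"),
    "walks": ("attitude influencing", "emotional support", "money exchange"),
}

def process_for(traversal, method):
    """Example process for a (traversal type, flow method) combination."""
    return TYPOLOGY[traversal][METHODS.index(method)]

def classify_process(name):
    """Return (traversal type, flow method) for a named process, or None."""
    for traversal, row in TYPOLOGY.items():
        if name in row:
            return traversal, METHODS[row.index(name)]
    return None

print(process_for("trails", "transfer"))
print(classify_process("gossip"))
```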
Implications for SNA
• Measures like centrality are constructed by counting things like paths or walks
• Which measure is sensible for a given application depends on how things flow
– Different measures for different processes
• Most common flow processes do not have measures designed for them
– Currently, standard measures are misapplied to all situations
Components
• Maximal sets of nodes in which every node can reach every other by some path (no matter how long)
• A connected graph has just one component
It is relations (types of tie) that define different networks, not components. A graph that has two components remains one (disconnected) graph.
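Components of an undirected graph can be found with a breadth-first search from each unvisited node. This is a minimal sketch on made-up data; the function name `components` and the example nodes are assumptions.

```python
from collections import deque

def components(nodes, edges):
    """Return the components of an undirected graph as a list of node sets."""
    neighbors = {v: set() for v in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen = set()
    result = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        result.append(component)
    return result

# Hypothetical graph: one graph, two components.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(components(nodes, edges))
```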
Components in Directed Graphs
• Strong component
– There is a directed path from each member of the component to every other
• Weak component
– There is an undirected path (a weak path) from every member of the component to every other
– Equivalent to finding components after symmetrizing via the maximum method
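The difference between strong and weak components can be sketched on an adjacency matrix. Here weak components are obtained by symmetrizing with the maximum of the two directions and then reusing the same mutual-reachability grouping. The example matrix and all function names are assumptions for illustration, and the grouping is a simple approach rather than an efficient algorithm.

```python
def symmetrize_max(A):
    """Symmetrize a matrix: cell (i, j) becomes max(A[i][j], A[j][i])."""
    n = len(A)
    return [[max(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]

def reachable(A, start):
    """Set of nodes reachable from start by directed paths (includes start)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v, tie in enumerate(A[u]):
            if tie and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def strong_components(A):
    """Group nodes that can reach each other in both directions."""
    n = len(A)
    reach = [reachable(A, i) for i in range(n)]
    assigned = set()
    result = []
    for i in range(n):
        if i in assigned:
            continue
        comp = {j for j in reach[i] if i in reach[j]}
        result.append(comp)
        assigned |= comp
    return result

def weak_components(A):
    """Components after ignoring direction (maximum-method symmetrizing)."""
    return strong_components(symmetrize_max(A))

# Hypothetical digraph: arcs 0->1, 1->0, 1->2; node 3 is isolated.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
print(strong_components(A))
print(weak_components(A))
```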
A network with 4 components
[Figure: network of “Who you go to so that you can say ‘I ran it by ____, and she says ...’”; legend: Recent acquisition, Older acquisitions, Original company]
Data drawn from Cross, Borgatti & Parker 2001.
Length & Distance
• Length of a path is the number of links it has
• Distance between two nodes is the length of the shortest path (aka geodesic) between them
[Figure: example network with nodes numbered 1 to 12]
[Figure labels: Austin Powers: The Spy Who Shagged Me, Let’s Make It Legal, Robert Wagner, Wild Things, What Price Glory, Barry Norton, A Few Good Men, Monsieur Verdoux]
Implications of Distance
• For something flowing across links, distance correlates …
– Inversely with probability of arrival
– Positively with time until arrival
– Positively with amount of distortion or change
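A small worked example of the first point, under a simplifying assumption that is not stated on the slides: each link passes the item on independently with the same probability p, and the item follows a single shortest path. The chance of arrival then shrinks geometrically with distance.

```python
def arrival_probability(p, distance):
    """Chance of crossing `distance` links, each passed with probability p.

    Assumes independent links and a single route (an illustrative model).
    """
    return p ** distance

for d in (1, 2, 3):
    print("distance", d, "->", arrival_probability(0.5, d))
```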
Geodesic Distance Matrix
    a  b  c  d  e  f  g
a   0  1  2  3  2  3  4
b   1  0  1  2  1  2  3
c   2  1  0  1  1  2  3
d   3  2  1  0  2  3  4
e   2  1  1  2  0  1  2
f   3  2  2  3  1  0  1
g   4  3  3  4  2  1  0
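A geodesic distance matrix can be computed with one breadth-first search per node. In the sketch below the tie list is an inference: it is simply the set of node pairs whose entry in the matrix is 1. The function names are illustrative.

```python
from collections import deque

nodes = list("abcdefg")
# Assumption: ties inferred from the distance-1 cells of the matrix.
edges = [("a", "b"), ("b", "c"), ("b", "e"), ("c", "d"),
         ("c", "e"), ("e", "f"), ("f", "g")]

def distances_from(source, neighbors):
    """Breadth-first search: geodesic distance from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def distance_matrix(nodes, edges):
    """Matrix of geodesic distances; None marks unreachable pairs."""
    neighbors = {v: set() for v in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    matrix = []
    for u in nodes:
        dist = distances_from(u, neighbors)
        matrix.append([dist.get(v) for v in nodes])
    return matrix

D = distance_matrix(nodes, edges)
for name, row in zip(nodes, D):
    print(name, row)
```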
Independent Paths
• A set of paths is node-independent if they share no nodes (except beginning and end)
– They are line-independent if they share no lines
[Figure: example network between nodes S and T]
• 2 node-independent paths from S to T
• 3 line-independent paths from S to T
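The two definitions can be written as simple checks on a set of paths. The sketch assumes all paths run between the same two end nodes and that ties are undirected; the three paths used are hypothetical and are not taken from the slide's figure.

```python
from itertools import combinations

def node_independent(paths):
    """True if no two paths share a node other than the end nodes."""
    for p, q in combinations(paths, 2):
        if set(p[1:-1]) & set(q[1:-1]):
            return False
    return True

def lines_of(path):
    """The set of (unordered) lines used by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}

def line_independent(paths):
    """True if no two paths share a line."""
    for p, q in combinations(paths, 2):
        if lines_of(p) & lines_of(q):
            return False
    return True

# Hypothetical paths from S to T.
p1 = ["S", "a", "T"]
p2 = ["S", "b", "T"]
p3 = ["S", "c", "a", "d", "T"]  # passes through node a, like p1

print(node_independent([p1, p2]))      # share only S and T
print(node_independent([p1, p2, p3]))  # p1 and p3 share node a
print(line_independent([p1, p2, p3]))  # but no line is used twice
```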
Cutpoint
• A node which, if deleted, would increase
the number of components
[Figure: example network with nodes Bonnie, Bob, Biff, Bill, Betty, Betsy]
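The cutpoint definition translates directly into a brute-force check: delete each node in turn and see whether the number of components goes up. This is a sketch on a made-up graph, not the network in the figure, and the function names are assumptions.

```python
def count_components(nodes, edges):
    """Number of components of an undirected graph."""
    neighbors = {v: set() for v in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count

def cutpoints(nodes, edges):
    """Nodes whose deletion increases the number of components."""
    base = count_components(nodes, edges)
    result = []
    for v in nodes:
        rest = [u for u in nodes if u != v]
        kept = [(a, b) for a, b in edges if v not in (a, b)]
        if count_components(rest, kept) > base:
            result.append(v)
    return result

# Hypothetical graph: triangle a-b-c with d hanging off c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(cutpoints(nodes, edges))
```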
Bridge
• A tie that, if removed, would increase the number of components
[Figure: example network with nodes labeled a through s]
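Bridges can be found the same brute-force way as the definition reads: remove each tie in turn and count components. The graph below is hypothetical (it is not the network in the figure), and the function names are illustrative.

```python
def count_components(nodes, edges):
    """Number of components of an undirected graph."""
    neighbors = {v: set() for v in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count

def bridges(nodes, edges):
    """Ties whose removal increases the number of components."""
    base = count_components(nodes, edges)
    result = []
    for i, tie in enumerate(edges):
        remaining = edges[:i] + edges[i + 1:]
        if count_components(nodes, remaining) > base:
            result.append(tie)
    return result

# Hypothetical graph: triangle a-b-c with d hanging off c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(bridges(nodes, edges))
```

Ties inside the triangle are not bridges because a detour always exists; only the tie to d is.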
Local Bridge of Degree K
• A tie that connects nodes that would
otherwise be at least k steps apart
[Figure: example network with nodes A and B]
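To find how far apart two nodes would be without their tie, remove the tie and run a breadth-first search between its end nodes. The sketch below returns that distance, so the tie is a local bridge of degree k for any k up to the returned value; it returns None when the nodes become disconnected (the tie is then a true bridge). The example graph and the function name are assumptions.

```python
from collections import deque

def distance_without_tie(edges, u, v):
    """Geodesic distance from u to v once the tie between them is removed."""
    neighbors = {}
    for a, b in edges:
        if {a, b} == {u, v}:
            continue  # drop the tie being tested
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in neighbors.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return None  # no other route: the tie is a bridge

# Hypothetical four-node cycle a-b-c-d-a.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(distance_without_tie(cycle, "a", "b"))
print(distance_without_tie([("a", "b")], "a", "b"))
```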
Granovetter Transitivity
[Figure: example with nodes A, B, C, D]
Granovetter’s SWT Theory
• Bridges are sources of novel information
• Only weak ties can be bridges
– Strong ties create g-transitivity
• Two nodes connected by a strong tie will have mutual acquaintances (ties to same 3rd parties)
– Ties that are part of transitive triples cannot be bridges or local bridges
– Therefore, only weak ties can be bridges
• Weak ties are sources of novel information
In other words …
• Strong ties are embedded in tight homophilous and homogeneous clusters
• You don’t often hear new stuff from your strong ties because
– Strong ties are embedded in tight homogeneous and homophilous clusters that recirculate the same old stories
– You know all the same people as your strong ties
• Weak ties are a source of novel information
• Strong ties are locally cohesive but weak ties are globally cohesive