Models of the web graph - Department of Mathematics

Graduate Seminar
October 2015
Modelling, Mining, and
Searching Networks
Anthony Bonato
Ryerson University
21st Century Graph Theory:
Complex Networks
• web graph, social networks, biological networks, internet
networks, …
Networks - Bonato
• a graph G = (V(G),E(G))
consists of a nonempty set
of vertices or nodes V, and
a set of edges E
[Figure: a small graph with its nodes and edges labelled]
• in directed graphs (digraphs) E need not
be symmetric
Degrees
• the degree of a node x, written
deg(x)
is the number of edges incident with x
First Theorem of Graph Theory:
$\sum_{x \in V(G)} \deg(x) = 2|E(G)|$
The web graph
• nodes: web pages
• edges: links
• over 1 trillion
nodes, with billions
of nodes added
each day
[Figure: a fragment of the web graph with the pages Ryerson, Nuit Blanche, City of Toronto, Four Seasons Hotel, Frommer’s, and Greenland Tourism]
Small World Property
• small world networks
introduced by Watts &
Strogatz in 1998
– low distances
between nodes
Power laws in the web graph
• power law degree distribution
$N_{i,n} \approx i^{-b} n$, for some $b > 2$,
where $N_{i,n}$ is the number of nodes of degree $i$ among $n$ nodes
(Broder et al, 01)
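A power law degree distribution appears as a straight line of slope -b on a log-log plot, since taking logarithms of the count of degree-i nodes, n i^(-b), gives log n - b log i. The sketch below recovers b from degree counts with an ordinary least-squares fit. It is a rough illustration only: the data are synthetic, b = 2.1 is chosen for the example, the function name is hypothetical, and a plain log-log fit is a crude estimator on real, noisy data.

```python
import math

def fit_power_law_exponent(counts):
    """Estimate b from degree counts by a least-squares line in log-log space.

    counts maps a degree i >= 1 to the (positive) number of nodes of degree i.
    """
    xs = [math.log(i) for i in counts]
    ys = [math.log(c) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is -b, so negate it.
    return -sxy / sxx

# Synthetic counts that follow the power law exactly (an assumption for illustration).
n, b = 1_000_000, 2.1
counts = {i: n * i ** (-b) for i in range(1, 101)}
b_hat = fit_power_law_exponent(counts)
print(round(b_hat, 6))
```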
Geometric models
• we introduced a
stochastic network model
which simulates power
law degree distributions
and other properties
– Spatially Preferred
Attachment (SPA)
Model
• nodes have a region of
influence whose volume
is a function of their
degree
SPA model (Aiello, Bonato, Cooper, Janssen, Prałat, 09)
• as nodes are born,
they are more
likely to enter a
region of influence
with larger volume
(degree)
• over time, a
power law
degree
distribution
results
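The mechanism described above can be sketched as a small simulation. This is a simplified reading of the SPA model, not the authors' implementation. It assumes a two-dimensional unit torus, square regions of influence (max-norm balls) with area min((A1 · in-degree + A2) / t, 1) at time t, and a link probability p. The function names and default parameter values are hypothetical.

```python
import random

def torus_max_dist(p, q):
    """Max-norm distance between two points on the unit torus."""
    return max(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(p, q))

def simulate_spa(n, p=0.8, a1=1.0, a2=1.0, seed=0):
    """Grow a directed graph on n nodes; returns (positions, in-degrees, edges)."""
    rng = random.Random(seed)
    pos, indeg, edges = [], [], []
    for t in range(1, n + 1):
        u = len(pos)                      # index of the node born at time t
        point = (rng.random(), rng.random())
        for v in range(u):
            # Region of influence of v: a square whose area grows with v's in-degree.
            area = min((a1 * indeg[v] + a2) / t, 1.0)
            half_side = area ** 0.5 / 2
            inside = torus_max_dist(point, pos[v]) <= half_side
            # If the new node lands in v's region, link to v with probability p.
            if inside and rng.random() < p:
                edges.append((u, v))
                indeg[v] += 1
        pos.append(point)
        indeg.append(0)
    return pos, indeg, edges

pos, indeg, edges = simulate_spa(500)
print(len(edges), max(indeg))
```

Nodes with larger in-degree have larger regions, so they are more likely to receive further links. Tabulating `indeg` for larger n is one way to inspect the resulting degree distribution.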
Biological networks: proteomics
nodes: proteins
edges:
biochemical interactions
Yeast:
2401 nodes
11000 edges
Protein networks
• proteins are essential
macromolecules of life
• understanding their
function and role in
disease is of importance
• protein-protein interaction
networks (PPI)
– nodes: proteins
– edges:
biochemical
interaction
Dominating sets in PPI
(Milenkovic, Memisevic, Bonato, Przulj, 2011)
PLOS ONE
• dominating sets in graphs
• we found that dominating sets in
PPI networks are vital for normal
cellular functioning and signalling
– dominating sets capture biologically
vital proteins and drug targets
– might eventually lead to new drug
therapies
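A dominating set is a set S of nodes such that every node is either in S or adjacent to a node of S. The sketch below finds one with a standard greedy heuristic on a toy graph. The greedy rule is a generic technique and is not claimed to be the method used in the cited paper. The node names and the adjacency data are made up for illustration.

```python
def greedy_dominating_set(adj):
    """Greedy heuristic: repeatedly pick the node that dominates the most
    not-yet-dominated nodes. adj maps each node to the set of its neighbours."""
    undominated = set(adj)
    chosen = set()
    while undominated:
        best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen

def is_dominating(adj, s):
    """Check that every node is in s or has a neighbour in s."""
    return all(v in s or adj[v] & s for v in adj)

# Hypothetical toy interaction network.
adj = {
    "p1": {"p2", "p3", "p4"},
    "p2": {"p1"},
    "p3": {"p1"},
    "p4": {"p1", "p5"},
    "p5": {"p4"},
}
ds = greedy_dominating_set(adj)
print(sorted(ds), is_dominating(adj, ds))
```

The greedy choice does not guarantee a minimum dominating set in general; it only gives a valid dominating set quickly.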
Social Networks
nodes: people
edges:
social interaction
On-line Social Networks (OSNs)
Facebook, Twitter, LinkedIn, Google+…
Bieber to Pope Francis
on
6 Degrees in Facebook?
• 1.15 billion users
• (Backstrom et al., 2012)
– 4 degrees of separation in
Facebook
– when considering another
person in the world, a friend of
your friend knows a friend of
their friend, on average
• similar results for Twitter
and other OSNs
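"Degrees of separation" refers to shortest-path distances between nodes. For a small graph the average distance can be computed exactly with breadth-first search, as in the sketch below. The graph is a made-up path on four nodes and the function names are hypothetical. Exact all-pairs BFS like this does not scale to a network with a billion users, so it is only meant to make the definition concrete.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_distance(adj):
    """Mean distance over ordered pairs of distinct, mutually reachable nodes."""
    total = pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

# Hypothetical example: the path 0 - 1 - 2 - 3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
avg = average_distance(adj)
print(avg)
```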
Dimension of an OSN
• dimension of OSN: minimum number of
attributes needed to classify nodes
• like game of “20 Questions”: each
question narrows range of possibilities
• what is a credible mathematical formula
for the dimension of an OSN?
GEO-P model
(Bonato et al, 2014): PLOS ONE
• reverse engineering approach
– given network data GEO-P model predicts dimension
of an OSN to be around log n, where n is the number
of users
• that is, given the graph structure, we can (theoretically)
recover the social space
6 Dimensions of Separation in
Facebook and LinkedIn
Cops and Robbers
[Figure: three cops (C) and one robber (R) on the nodes of a graph, shown in three successive positions]
cop number c(G) ≤ 3
Cops and Robbers
• minimum number of cops needed to
capture the robber is the cop number
c(G)
– well-defined as c(G) ≤ |V(G)|
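Computing c(G) in general is hard, but the case c(G) = 1 has a simple test. A classical characterization, which is not stated on the slides, says that a finite graph is cop-win exactly when it can be reduced to a single node by repeatedly deleting a "corner": a node u whose closed neighbourhood is contained in the closed neighbourhood of some other node v. The sketch below applies that test. The function name and the example graphs are hypothetical.

```python
def is_cop_win(adj):
    """Return True if one cop suffices, using the corner-removal (dismantling) test.

    adj maps each node to the set of its neighbours in an undirected simple graph.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    while len(adj) > 1:
        corner = None
        for u in adj:
            closed_u = adj[u] | {u}
            # u is a corner if some other node v dominates its closed neighbourhood.
            if any(closed_u <= adj[v] | {v} for v in adj if v != u):
                corner = u
                break
        if corner is None:
            return False  # no corner left, so the graph is not cop-win
        for w in adj.pop(corner):
            adj[w].discard(corner)
    return True

path = {0: {1}, 1: {0, 2}, 2: {1}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_cop_win(path), is_cop_win(cycle4))
```

On the path a single cop can walk toward the robber, while on the 4-cycle the robber can always stay opposite one cop, so a second cop is needed.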
Applications of Cops and Robbers
• robotics
– mobile computing
– missile-defense
– gaming
• counter-terrorism
– intercepting messages
or agents
How big can the cop number be?
• if the graph G of order n is disconnected, then
the cop number can be as large as n
• if G is connected, then no one knows how big
the cop number can be!
• Meyniel’s Conjecture: c(G) = O(n^{1/2}).
Good guys vs bad guys games in graphs
[Table: games arranged by the speed of the bad guys and the speed of the good guys (slow, medium, fast, helicopter)]
• games shown: traps, tandem-win; eternal security; robot vacuum; Cops and Robbers; edge searching; cleaning; distance k Cops and Robbers; Cops and Robbers on disjoint edge sets; The Angel and Devil; seepage; Helicopter Cops and Robbers; Marshals; Firefighter; Hex
Thesis topics
• new models of complex networks
• biological network models
• banking/financial networks
• fitting models to data
• Cops and Robbers games
– Meyniel’s conjecture, random graphs,
variations: good vs bad guy games in graphs
Brief biography
• over 90 papers, two original books, 7 edited proceedings books, with
61 collaborators (many of whom are my students)
• over 480K in lifetime research funding
– grants from NSERC, MITACS, Mprime, and Ryerson
– FOS accelerator (additional support available in Y1)
• supervised 12 masters students, 2 doctoral, and 13 post-docs
• over 30 invited addresses world-wide (India, China, Europe, North
America)
• won 2011 and 2009 Ryerson SRC awards for research excellence
• won an inaugural YSGS Outstanding Contribution to Graduate
Education Award in 2013
• Editor-in-Chief of the journal Internet Mathematics; editor of
Contributions to Discrete Mathematics
Drop-in office hours
• Wednesday, November 4, 10 am – 12 pm
• Thursday, November 5, 10 am – 12 pm
• Yeates School of Graduate Studies
• 11th floor of 1 Dundas St West, YDI – 1117
• Come to say hello, chat, discuss thesis
topics
AM8204 – Topics in Discrete
Mathematics
• Winter 2014
• 6 weeks each: complex networks, graph
searching
• project based
• Prerequisite: AM8002 (or permission from
me)
Graphs at Ryerson (G@R)