
Exercise 2
Due date: May 1 2013
Prof. Steffen Staab, Dr. Jérôme Kunegis
Exercise   1   2   3   Σ
Points     2   4   4   10
Preliminaries
In this exercise, we will work with a real social network, instead of using a synthetic network. We consider
the social network dataset of Gowalla, a discontinued location-based social network. This social network
contains about 196,000 users and 1,900,000 ties. This network takes about 25 megabytes in memory when
represented as a sparse matrix, but about 450 gigabytes when represented as a dense matrix. Thus, it is
necessary to only use sparse matrix representations throughout this exercise.
In this exercise, we will start generating plots from network data. Use the command print -depsc to
save your plot to an EPS file (named *.eps) and submit the EPS file along with your submission.
1 Gowalla (2 Points)
Download the Gowalla dataset in TSV (tab-separated values) format.¹ In the TSV tarball, the file
out.loc-gowalla_edges contains the adjacency matrix of the Gowalla social network in sparse text
format. Note that in that file, each edge (u, v) is included as both u v and v u, i.e., the matrix is
already symmetric.
• Load the dataset using the function load_sparse() developed in the previous exercise.²
• Use the function compute_statistics() from the previous exercise to compute the size and the
volume of the Gowalla social network. Compare the results with the values given on
konect.uni-koblenz.de/networks/loc-gowalla_edges.
[Figure 1: The degree distribution of the Gowalla social network. Log-log plot; X axis: Degree (d), Y axis: Frequency.]
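The two quantities asked for above can be illustrated on a toy graph. The following is only a sketch: it assumes that "size" means the number of vertices and "volume" the number of edges, and it builds the matrix directly with sparse() rather than with load_sparse() or compute_statistics(), whose exact interfaces are not given on this sheet. All variable names are hypothetical.

```octave
% Toy undirected graph with 5 vertices and 4 edges: 1-2, 1-3, 1-4, 4-5.
% As described for the dataset file, every edge is listed in both directions.
u = [1 2 1 3 1 4 4 5];
v = [2 1 3 1 4 1 5 4];
n = 5;

% Build the sparse adjacency matrix. This assumes no pair (u, v) is
% listed twice, because sparse() sums duplicate entries.
A = sparse(u, v, 1, n, n);

% Sanity checks: the matrix should be square and symmetric.
is_square = (size(A, 1) == size(A, 2));
is_symmetric = isequal(A, A');

% Assumed definitions: size = number of vertices, volume = number of edges.
% Each undirected edge contributes two nonzero entries, hence the division.
network_size = size(A, 1);
network_volume = nnz(A) / 2;

printf('size = %d, volume = %d\n', network_size, network_volume);
```

For the toy graph this reports a size of 5 and a volume of 4. If the comparison with the published values is off by a factor of two, the division by two is the first thing to check.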
2 Degree Distribution (4 Points)
The degree distribution of a network is defined as a vector C such
that C(d) equals the number of vertices with degree d. The vector
C should have a length equal to the maximal degree in the network.
• Write a function that computes the degree distribution of a given network. You may make use of
the function compute_degrees() developed in the previous exercise. The function must have the
following signature:
function [C] = degree_distribution(A)
¹ konect.uni-koblenz.de/networks/loc-gowalla_edges
² Note that the function load_sparse() should always return a square, symmetric matrix.
• Apply the function to the Gowalla social network.
The degree distribution can be visualized by plotting the value C(d) as a function of d.
• Use the GNU Octave function loglog() to plot the degree distribution, with d on the X axis and
C(d) on the Y axis. Your plot must include labels for the X axis and the Y axis. Your plot should
look like the plot in Figure 1.
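The definition of C can be checked by hand on a small graph. The sketch below is one possible way to count degrees, not the required solution: it assumes an unweighted, symmetric 0/1 adjacency matrix, computes degrees as row sums instead of calling compute_degrees(), and uses accumarray() for the counting. The variable names are illustrative.

```octave
% Toy undirected graph: edges 1-2, 1-3, 1-4, 4-5, stored symmetrically.
u = [1 2 1 3 1 4 4 5];
v = [2 1 3 1 4 1 5 4];
A = sparse(u, v, 1, 5, 5);

% Degree of each vertex = row sum (assumes unweighted 0/1 entries).
deg = full(sum(A, 2));          % here: [3; 1; 1; 2; 1]

% C(d) = number of vertices with degree d, for d = 1 .. maximal degree.
% Vertices of degree 0 have no slot in C, so they are skipped.
deg = deg(deg > 0);
C = accumarray(deg, 1, [max(deg), 1]);

disp(C');                       % C is [3; 1; 1] for this graph
```

With C available, a call such as loglog(1:numel(C), C, 'o') followed by xlabel('Degree (d)') and ylabel('Frequency') yields a plot of the kind described above. Degrees with C(d) = 0 have no finite position on a logarithmic axis, so they do not appear in the plot.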
3 Cumulative Degree Distribution (4 Points)
Let G = (V, E) be a graph. For a given degree d, the cumulative
degree distribution in a network is defined as the proportion of
nodes in V whose degree is greater than or equal to d. In other words,
the cumulative distribution of G is given by
$$P(x \geq d) = \frac{|\{v \in V \mid d(v) \geq d\}|}{|V|}.$$
The cumulative degree distribution can be stored as a vector P such
that P(d) contains the value P(x ≥ d). The length of the vector
P should equal the network’s largest degree.
[Figure 2: The cumulative degree distribution of the Gowalla social network. Log-log plot; X axis: Degree (d) [vertices], Y axis: P(x ≥ d).]
• Write a function that computes the cumulative degree distribution of a given network. The function
must have the following signature:
function [P] = cumulative_degree_distribution(A)
• Apply the function to the Gowalla social network.
The cumulative degree distribution can be visualized using a stairstep
plot.
• Use the GNU Octave function stairs() to plot the cumulative degree distribution. Your plot must
include labels for the X axis and the Y axis. Your plot should look like the plot in Figure 2. To
make sure your plot uses a log-log scale, use set(gca, 'XScale', 'log', 'YScale', 'log').
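As a hand-checkable illustration of the definition of P, the sketch below starts from the degrees of a five-vertex toy graph and obtains P from the degree counts by a cumulative sum taken from the high-degree end. This is one possible approach under the assumption that every vertex has degree at least 1. The variable names are hypothetical and the code is not wrapped in the required function signature.

```octave
% Degrees of a toy graph with edges 1-2, 1-3, 1-4, 4-5.
deg = [3; 1; 1; 2; 1];
n = numel(deg);                 % |V|, the total number of vertices

% Number of vertices with degree exactly d, for d = 1 .. largest degree.
C = accumarray(deg, 1, [max(deg), 1]);

% The number of vertices with degree >= d is a cumulative sum taken from
% the high-degree end; dividing by |V| gives the proportion P(x >= d).
P = flipud(cumsum(flipud(C))) / n;

disp(P');                       % P is [1; 0.4; 0.2] for this graph
```

Here all five vertices have degree at least 1, two have degree at least 2, and one has degree at least 3, which gives 5/5, 2/5 and 1/5. A stairstep plot can then be drawn with stairs(1:numel(P), P), followed by the set(gca, ...) call quoted above and the axis labels.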
Tips
• How does the degree distribution look when not plotted on logarithmic axes?
• What is a power law degree distribution? Does the Gowalla social network have a power law degree
distribution? Why?
• Can you plot the degree distribution for larger networks from konect.uni-koblenz.de/networks?
• What do the functions sort and cumsum do?
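For the last tip, a quick experiment on a short vector shows the behaviour of both functions. This is only an illustration with made-up values.

```octave
x = [3 1 2];

s = sort(x);                    % ascending order: [1 2 3]
s_desc = sort(x, 'descend');    % descending order: [3 2 1]
c = cumsum(x);                  % running totals: [3 4 6]

disp(s); disp(s_desc); disp(c);
```

The last element of cumsum(x) always equals sum(x), which is a convenient check when building cumulative counts.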