Advanced Computer Networks: Part 2

Selected Topics in Data
Networking
Explore Social Networks: Cohesion
Introduction to Cohesion

Social Network Analysis:

Investigating who is related and who is not.


Why are some people or organizations related,
whereas others are not?
People who match on social
characteristics will interact more often and
people who interact regularly will foster a
common attitude or identity.
Introduction to Cohesion

Social interaction

Basis for solidarity, shared norms, identity, and
collective behavior


People who interact intensively are likely to consider
themselves a social group.
We expect similar people to interact a lot, at least
more often than with dissimilar people.

This phenomenon is called homophily: love of the
same (the tendency of individuals to associate
and bond with similar others)

Does the homophily principle work?
Introduction to Cohesion

Study in the Turrialba region, which is a rural area in Costa Rica
(Latin America).

Visual impression of the kin visits network and the family–friendship
groupings, which are identified by the colors and numbers within the
vertices
Meaning: Cohesion

Cohesion means a social network contains
many ties.

Community cohesion refers to the aspect of
togetherness and bonding exhibited by members
of a community, the “glue” that holds a community
together.


More ties between people yield a tighter structure, which
is more cohesive.
The density of a network captures this idea.
Source: https://digestiblepolitics.wordpress.com/2013/01/04/the-importance-of-community-cohesion-in-society-in-2013/
Meaning: Cohesion

Review


Multiple lines between vertices and higher line
values indicate more cohesive ties.
Density = the number of edges divided by the number possible.


In an undirected network, the number possible is n(n-1)/2 if self-loops are excluded
and n(n+1)/2 if self-loops are allowed.
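As a minimal sketch of the density calculation above, the function below divides the number of lines by the number possible. The name `density`, its parameters, and the 5-vertex example are illustrative assumptions. The directed cases (n(n-1) possible arcs without loops, n*n with loops) are not stated on the slide and are included as the usual directed counterparts.

```python
def density(n_vertices, n_lines, directed=False, loops=False):
    """Number of lines present divided by the number of lines possible."""
    n = n_vertices
    if directed:
        # Assumption: directed counterparts of the undirected formulas.
        possible = n * n if loops else n * (n - 1)
    else:
        possible = n * (n + 1) // 2 if loops else n * (n - 1) // 2
    return n_lines / possible


# Hypothetical undirected network: 5 vertices, 4 lines, no self-loops.
print(density(5, 4))  # 4 / 10 = 0.4
```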
Cohesion: Indicated by Density


In the kin visiting relation network, density is 0.045,
which means that only 4.5 percent of all possible arcs
are present.
Density is inversely related to network size:


The larger the social network, the lower the density because
the number of possible lines increases rapidly with the number
of vertices, whereas the number of ties which each person can
maintain is limited.
Discussion: Why?

Network density is therefore not very useful for comparing the cohesion of networks of different sizes
Cohesion: Indicated by Degree

Degree of a vertex: the number of ties in which the
vertex is involved.


Vertices with high degree are more likely to be found in
dense sections of the network.
Review (Undirected Graph)
Cohesion: Comparing Density and Degree
Cohesion: Indicated by Degree

A higher degree of vertices yields a denser
network


because vertices entertain more ties.
Average degree of all vertices: Measuring the
structural cohesion of a network.

This is a better measure of overall cohesion
than density

It does not depend on network size, so average degree can be
compared between networks of different sizes.
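The contrast between density and average degree can be sketched as follows. The network sizes and the assumption that every person maintains four ties on average are hypothetical; the function names are illustrative.

```python
def average_degree(n_vertices, n_lines):
    # Each undirected line adds one to the degree of two vertices.
    return 2 * n_lines / n_vertices


def density(n_vertices, n_lines):
    # Undirected network without self-loops.
    return n_lines / (n_vertices * (n_vertices - 1) / 2)


for n in (10, 100, 1000):
    m = 2 * n  # hypothetical: average degree fixed at 4
    print(n, average_degree(n, m), round(density(n, m), 4))
```

With the average degree held constant, the printed density shrinks as the network grows, which is why density is hard to compare across networks of different sizes.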
Cohesion: Indicated by Degree

NOTE

Directed Graph: the sum of the indegree and the
outdegree of a vertex does not necessarily equal
the number of its neighbors, because a neighbor
linked by arcs in both directions is counted twice
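A small illustration of this note, using a hypothetical three-vertex directed network (the vertex names and the helper `degree_summary` are assumptions made for the example):

```python
# Hypothetical arcs: a and b choose each other, a also chooses c.
arcs = [("a", "b"), ("b", "a"), ("a", "c")]


def degree_summary(arcs, v):
    """Return (indegree, outdegree, number of distinct neighbors) of v."""
    outdeg = sum(1 for s, t in arcs if s == v)
    indeg = sum(1 for s, t in arcs if t == v)
    neighbors = {t for s, t in arcs if s == v} | {s for s, t in arcs if t == v}
    neighbors.discard(v)  # a self-loop does not make v its own neighbor
    return indeg, outdeg, len(neighbors)


print(degree_summary(arcs, "a"))  # (1, 2, 2)
```

Vertex a has indegree 1 and outdegree 2, a sum of 3, but only 2 neighbors, because b is linked to a in both directions.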
Cohesion: Indicated by Component


Vertices with a degree of one or higher are
connected to at least one neighbor, so they
are not isolated.
If the network is cut up into pieces:

Isolated sections of the network may be regarded
as cohesive subgroups


because the vertices within a section are connected,
whereas vertices in different sections are not.
The connected parts of a network are called
components
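Finding the components of an undirected network can be sketched with a breadth-first search. The six-vertex network below is hypothetical, and `components` is an illustrative helper, not a function from any particular package.

```python
from collections import deque


def components(vertices, lines):
    """Return the connected components of an undirected network."""
    adj = {v: set() for v in vertices}
    for u, v in lines:
        adj[u].add(v)
        adj[v].add(u)
    seen, result = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        result.append(sorted(comp))
    return result


vertices = ["a", "b", "c", "d", "e", "f"]
lines = [("a", "b"), ("b", "c"), ("d", "e")]  # f is isolated
print(components(vertices, lines))
# → [['a', 'b', 'c'], ['d', 'e'], ['f']]
```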
Cohesion: Indicated by Component



“Singletons,” who have no connections and are least central
The “giant component,” which is the largest group of nodes tightly
connected to the central nodes and to each other
The “middle region,” which represents isolated groups which interact
amongst themselves but not with the rest of the network, forming
isolated stars.
Source: http://boxesandarrows.com/socialnetworks-and-group-formation/
Cohesion: Indicated by Component

A network is weakly connected if each pair of
vertices is connected by a semipath.

In a (weakly) connected network, we can
“walk” from each vertex to all other vertices if
we neglect the direction of the arcs.
Cohesion: Indicated by Component

In directed networks, there is a second type
of connectedness: a network is strongly
connected if each pair of vertices is
connected by a path.

In a strongly connected network, we can travel
from each vertex to any other vertex obeying the
direction of the arcs.
Cohesion: Indicated by Component

Strong connectedness is more restricted than weak
connectedness:

Each strongly connected network is also weakly connected
but a weakly connected network is not necessarily strongly
connected.
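The difference between the two kinds of connectedness can be checked with a short sketch. The function names and the two three-vertex networks are assumptions chosen for illustration: a chain a→b→c is weakly but not strongly connected, while closing it into a cycle makes it strongly connected.

```python
def reachable(adj, start):
    """All vertices that can be reached from start by following adj."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_weakly_connected(vertices, arcs):
    # Neglect arc direction: every arc can be walked both ways.
    und = {v: set() for v in vertices}
    for s, t in arcs:
        und[s].add(t)
        und[t].add(s)
    return reachable(und, vertices[0]) == set(vertices)


def is_strongly_connected(vertices, arcs):
    fwd = {v: set() for v in vertices}
    rev = {v: set() for v in vertices}
    for s, t in arcs:
        fwd[s].add(t)
        rev[t].add(s)
    start = vertices[0]
    # start must reach every vertex, and every vertex must reach start.
    return reachable(fwd, start) == set(vertices) == reachable(rev, start)


vs = ["a", "b", "c"]
chain = [("a", "b"), ("b", "c")]
cycle = chain + [("c", "a")]
print(is_weakly_connected(vs, chain), is_strongly_connected(vs, chain))
print(is_weakly_connected(vs, cycle), is_strongly_connected(vs, cycle))
```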
Cohesion: Indicated by Component

Vertices v1, v3, v4, and v5 constitute a
(weak) component

because they are connected by semipaths and
there is no other vertex in the network which is
also connected to them by a semipath.
Cohesion: Indicated by Component

A (weak) component is a maximal (weakly)
connected subnetwork.

A strong component is a maximal
strongly connected subnetwork.
Cohesion: Indicated by Component

The example network contains three strong
components.

The largest strong component is composed of
vertices v3, v4, and v5, which are connected by
paths in both directions.
Cohesion: Indicated by Component

There are two strong components consisting
of one vertex each, namely vertices v1 and v2.

Vertex v2 is isolated. There are paths from vertex v1
but no paths to it, so v1 is not strongly connected
to any other vertex.

Vertex v1 is asymmetrically linked to the larger strong
component.
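The strong components described above can be reproduced with a small sketch. The figure is not part of this text, so the arcs below are a hypothetical arrangement that matches the description (v1 sends an arc into a cycle v3→v4→v5→v3, and v2 is isolated). The approach takes, for each vertex, the vertices it can reach that can also reach it back; this is a simple illustration rather than an efficient algorithm.

```python
def reachable(adj, start):
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def strong_components(vertices, arcs):
    fwd = {v: set() for v in vertices}
    rev = {v: set() for v in vertices}
    for s, t in arcs:
        fwd[s].add(t)
        rev[t].add(s)
    assigned, result = set(), []
    for v in vertices:
        if v in assigned:
            continue
        # Vertices reachable from v that can also reach v.
        comp = reachable(fwd, v) & reachable(rev, v)
        assigned |= comp
        result.append(sorted(comp))
    return result


vertices = ["v1", "v2", "v3", "v4", "v5"]
# Hypothetical arcs consistent with the slide text.
arcs = [("v1", "v3"), ("v3", "v4"), ("v4", "v5"), ("v5", "v3")]
print(strong_components(vertices, arcs))
# → [['v1'], ['v2'], ['v3', 'v4', 'v5']]
```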
Cohesion: Indicated by Component

In an undirected network, lines have no
direction

Each semiwalk is also a walk and each semipath
is also a path.

Components are isolated from one another: there are no
lines between vertices of different components.

This is similar to weak components in directed networks.
Cohesion: Indicated by Component

Components can be split up further into
denser parts by considering the number of
distinct, that is, noncrossing, paths or
semipaths that connect the vertices.

Within a weak component, one semipath between each pair of
vertices suffices but there must be at least two different
semipaths in a bi-component.

k-connected components: maximal subnetworks in which each
pair of vertices is connected by at least k distinct semipaths or
paths.
Cohesion: Indicated by Core

The distribution of degree reveals local concentrations of
ties around individual vertices


but it does not tell us whether vertices with a high degree are
clustered or scattered all over the network.
We can use degree to identify clusters of vertices that are
tightly connected

because each vertex has a particular minimum degree within the
cluster.
Cohesion: Indicated by Core

We pay attention not to the degree of one
vertex but to the degree of all vertices
within a cluster.

These clusters are called k-cores

k indicates the minimum degree of each vertex
within the core

Example: a 2-core contains all vertices that have a degree of
two or more within the core.
Cohesion: Indicated by Core

k-cores identify relatively dense
subnetworks

so they help to find cohesive subgroups.
Cohesion: Indicated by Core

Undirected network:

the degree of a vertex is equal to the number of its
neighbors,

k-core contains the vertices that have at least k
neighbors within the core
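One way to compute a k-core, sketched under the definition above, is to repeatedly remove vertices that have fewer than k neighbors among the vertices still remaining. The function `k_core` and the four-vertex network (a triangle with one pendant vertex) are illustrative assumptions.

```python
def k_core(vertices, lines, k):
    """Vertices that have at least k neighbors within the returned set."""
    adj = {v: set() for v in vertices}
    for u, v in lines:
        adj[u].add(v)
        adj[v].add(u)
    core = set(vertices)
    changed = True
    while changed:
        changed = False
        for v in list(core):
            # Count only neighbors that are still in the core.
            if len(adj[v] & core) < k:
                core.remove(v)
                changed = True
    return core


# Hypothetical network: triangle a-b-c, with d attached to a only.
vertices = ["a", "b", "c", "d"]
lines = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
print(sorted(k_core(vertices, lines, 1)))  # ['a', 'b', 'c', 'd']
print(sorted(k_core(vertices, lines, 2)))  # ['a', 'b', 'c']
print(sorted(k_core(vertices, lines, 3)))  # []
```

Vertex a has degree 3 in the whole network, yet it is not in a 3-core, because only two of its neighbors survive inside the core.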
Cohesion: Indicated by Core



All vertices belong to the 1-core, which is drawn in black
One vertex, v5, has only one neighbor, so it is not part of
the 2-core
Vertex v6 has a degree of 2, so it does not belong to the
3-core

k-cores are nested: a vertex in a 3-core is also part of a 2-core,
but not all members of a 2-core belong to a 3-core.
Cohesion: Indicated by Core

Different cohesive subgroups within a k-core are usually connected
by vertices that belong to lower cores


Vertex v6, which is part of the 2-core, connects the two segments of the
3-core.
Eliminating the vertices belonging to cores below the 3-core,

we obtain a network consisting of two components, which identify the cohesive
subgroups within the 3-core.
Cohesion: Indicated by Core

How do k-cores help to detect cohesive
subgroups?


Remove the lowest k-cores from the network
until the network breaks up into relatively dense
components.
Each component is considered to be a cohesive
subgroup

because each of its vertices has at least k neighbors within the
component.
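The procedure above can be sketched as: compute the k-core, then find the components among the vertices that remain. The network is hypothetical: two fully connected groups of four vertices joined only through a vertex x of degree two, which plays the role of a lower-core connector. All names are illustrative.

```python
from itertools import combinations


def build(lines):
    adj = {}
    for u, v in lines:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core(adj, k):
    """Vertices that keep at least k neighbors inside the remaining set."""
    core = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(core):
            if len(adj[v] & core) < k:
                core.remove(v)
                changed = True
    return core


def components(adj, keep):
    """Connected components of the subnetwork induced by `keep`."""
    seen, result = set(), []
    for start in sorted(keep):
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for w in adj[u] & keep:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        result.append(sorted(comp))
    return result


left = ["a1", "a2", "a3", "a4"]
right = ["b1", "b2", "b3", "b4"]
lines = list(combinations(left, 2)) + list(combinations(right, 2))
lines += [("x", "a1"), ("x", "b1")]  # x bridges the two groups
adj = build(lines)

print(len(components(adj, set(adj))))  # 1: the whole network is connected
print(components(adj, k_core(adj, 3)))
# → [['a1', 'a2', 'a3', 'a4'], ['b1', 'b2', 'b3', 'b4']]
```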
Selected Topics in Data
Networking
Explore Social Networks: Prestige and
Ranking
Introduction


Prestige is conceptualized as a particular pattern of
social ties.
In directed networks, people who receive many positive
choices are considered to be prestigious.
Source: http://pursuitist.com/lady-gaga-rules-twitter/

For example, everybody likes to play with the most popular girl or boy in a
group, but he or she does not play with all of them.
Introduction

Popularity and Indegree: Prestige

When ties are associated with some positive
aspects such as friendship or collaboration,



indegree is often interpreted as a form of popularity,
outdegree is interpreted as gregariousness.
A prestigious art museum receives more attention
from art critics than less prestigious ones.
Source: http://en.wikipedia.org/wiki/Centrality
Introduction

The simplest measure of structural prestige is called
popularity and it is measured by the number of choices a
vertex receives: its indegree

In undirected networks, we cannot measure prestige;
instead, we use degree as a simple measure of
centrality.
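Popularity as indegree can be computed by counting how often each vertex is chosen. The names and choices below are hypothetical.

```python
from collections import Counter

# Hypothetical choices: (chooser, chosen).
choices = [("ann", "eve"), ("bob", "eve"), ("cat", "eve"), ("eve", "bob")]

# Popularity = indegree = number of choices a vertex receives.
popularity = Counter(chosen for _, chosen in choices)

for person, score in popularity.most_common():
    print(person, score)
# eve 3
# bob 1
```

Vertices that receive no choices, such as ann and cat here, have a popularity of zero.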
Introduction

For a relation in which the prestigious actor sends rather than
receives the arcs (such as “lend money to”), indegree does reflect
prestige if we transpose the arcs in the network, that is, if we
reverse the direction of arcs.

It is interesting to note that several structural
properties of a network do not change when
the arcs are transposed
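Transposing arcs can be sketched in a few lines. The "lends money to" arcs below are hypothetical, and the helper names are illustrative. The sketch shows that indegree in the transposed network equals outdegree in the original, and that the number of arcs (and hence the density) is one property that does not change.

```python
from collections import Counter

# Hypothetical "lends money to" arcs: (lender, borrower).
arcs = [("rich", "a"), ("rich", "b"), ("rich", "c"), ("a", "b")]


def transpose(arcs):
    """Reverse the direction of every arc."""
    return [(t, s) for s, t in arcs]


def indegree(arcs):
    return Counter(t for _, t in arcs)


def outdegree(arcs):
    return Counter(s for s, _ in arcs)


transposed = transpose(arcs)
print(indegree(transposed)["rich"])  # 3
print(len(transposed) == len(arcs))  # True
```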
Correlation

Structural prestige scores: Correlation
coefficients range from -1 to 1

A positive coefficient indicates that a high score on one feature is
associated with a high score on the other (e.g., high structural
prestige occurs in families with high social status).

A negative coefficient points toward a negative or inverse
relation: a high score on one characteristic combines with a low
score on the other (e.g., high structural prestige is found
predominantly with low social status families).
Correlation

Rules of thumb for the absolute value of a coefficient
(the sign gives the direction, positive or negative):

Less than 0.05: no correlation.
From 0.05 to 0.25: weak association.
From 0.25 to 0.60: moderate association.
From 0.60 to 1.00: strong association.
A coefficient of 1 or −1: perfect association.
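As an illustration, the sketch below computes a Pearson correlation coefficient and labels it with the rules of thumb above. The scores are invented for the example, the function names are assumptions, and the treatment of values that fall exactly on a boundary (assigned to the stronger category) is a choice made here, since the ranges on the slide share their endpoints.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def strength(r):
    """Label a coefficient; boundary handling is an assumption."""
    a = abs(r)
    if a < 0.05:
        return "none"
    if a < 0.25:
        return "weak"
    if a < 0.60:
        return "moderate"
    if a < 1.0 - 1e-9:  # tolerance for floating-point rounding
        return "strong"
    return "perfect"


# Hypothetical data: structural prestige (indegree) and social status.
prestige = [0, 1, 2, 3, 5]
status = [1, 2, 2, 4, 5]
r = pearson(prestige, status)
print(round(r, 2), strength(r))
```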
Domains

Popularity is a very restricted measure of prestige because it takes
only direct choices into account.

A broader measure also counts indirect choices: all vertices that are
connected by a path to an actor. This is the input domain of the actor,
which has been called the influence domain because structurally
prestigious people are thought to influence people who regard them as
their leaders.

The larger the input domain of a person, the higher his or her
structural prestige.

The output domain is more likely to reflect prestige in the case of a
relation such as “lend money to”.
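The input domain of a vertex can be found by following arcs backwards from it. This is a minimal sketch with a hypothetical four-vertex network; `input_domain` is an illustrative helper name.

```python
def input_domain(arcs, v):
    """All other vertices from which a path leads to v."""
    rev = {}
    for s, t in arcs:
        rev.setdefault(t, set()).add(s)
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        for w in rev.get(u, ()):
            if w != v and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


# Hypothetical choices: a chooses b, b chooses c, d chooses c.
arcs = [("a", "b"), ("b", "c"), ("d", "c")]
print(sorted(input_domain(arcs, "c")))  # ['a', 'b', 'd']
```

Vertex c has an indegree of 2 but an input domain of 3, because a chooses c indirectly through b.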
Proximity Prestige

One option is to limit the input domain to direct neighbors or to neighbors
at maximum distance two, on the assumption that
nominations by close neighbors are more important than
nominations by distant neighbors.

An indirect choice contributes less to prestige if it is
mediated by a longer chain of intermediaries.
Proximity Prestige

Proximity Prestige: This index of prestige considers all
vertices within the input domain of a vertex but it
attaches more importance to a nomination if it is
expressed by a closer neighbor.

A nomination by a close neighbor contributes more to
the proximity prestige of an actor than a nomination by a
distant neighbor, but many “distant nominations” may
contribute as much as one “close nomination.”
Proximity Prestige

To allow direct choices to contribute more to the prestige
of a vertex than indirect choices, proximity prestige
weights each choice by its path distance to the vertex.

A higher distance yields a lower contribution to the
proximity prestige of a vertex, but each choice
contributes something.
Proximity Prestige





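Proximity prestige of a vertex = the proportion of all other vertices
that are in its input domain, divided by the average distance from
these vertices to the vertex.

A larger input domain (larger numerator) yields a higher proximity
prestige because more vertices are choosing an actor directly or
indirectly.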
A smaller average distance (smaller denominator) yields a higher
proximity prestige score because there are more nominations by
close neighbors.
Maximum proximity prestige is achieved if a vertex is directly chosen
by all other vertices.
The proportion of vertices in the input domain is 1 and the mean
distance from these vertices is 1, so proximity prestige is 1 divided
by 1.
Vertices without input domain get minimum proximity prestige by
definition, which is zero.
Proximity Prestige





All vertices at the extremes of the network (v2, v4, v5, v6, and v10)
have empty input domains, hence they have a proximity prestige score of
zero.
The input domain of vertex v9 contains vertex v10 only, so its input
domain size is 1 out of 9 (0.11).
Average distance within the input domain of vertex v9 is 1, so the
proximity prestige of vertex v9 is 0.11 divided by 1 = 0.11.
Vertex v1 has a maximal input domain (9 out of 9 = 1).
Average distance is 2.0, so proximity prestige amounts to 1.00
divided by 2.0, which is 0.5.

Avg. dist. = (4+3+2+1+1+1+2+2+2)/9 = 2
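The calculation above can be checked with a short sketch. The figure is not reproduced in this text, so the arcs below are a hypothetical arrangement chosen only so that the input domains and distances match the numbers stated above; the actual example network may differ. The function name is illustrative.

```python
from collections import deque


def proximity_prestige(vertices, arcs, v):
    """Proportion of other vertices in v's input domain / mean distance to v."""
    rev = {u: set() for u in vertices}
    for s, t in arcs:
        rev[t].add(s)
    # Breadth-first search against arc direction gives distances to v.
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in rev[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    del dist[v]
    if not dist:
        return 0.0  # empty input domain: minimum prestige by definition
    proportion = len(dist) / (len(vertices) - 1)
    mean_distance = sum(dist.values()) / len(dist)
    return proportion / mean_distance


vertices = [f"v{i}" for i in range(1, 11)]
# Hypothetical arcs (chooser, chosen) consistent with the stated distances.
arcs = [
    ("v2", "v1"), ("v3", "v1"), ("v7", "v1"),
    ("v4", "v3"), ("v5", "v3"), ("v6", "v7"), ("v8", "v7"),
    ("v9", "v8"), ("v10", "v9"),
]
print(proximity_prestige(vertices, arcs, "v1"))             # 0.5
print(round(proximity_prestige(vertices, arcs, "v9"), 2))   # 0.11
print(proximity_prestige(vertices, arcs, "v10"))            # 0.0
```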
References

Wouter de Nooy, Andrej Mrvar, and Vladimir
Batagelj, Exploratory Social Network Analysis
with Pajek, Cambridge