Social Networks 5 (1983) 97-107
North-Holland

INTERNAL COHESION OF LS SETS IN GRAPHS *

Stephen B. SEIDMAN
George Mason University **
Let $G$ be a finite connected graph. A set of vertices $H \subset V(G)$ is called a LS set if for every proper subset $K \subset H$, there are more edges linking $K$ to $H - K$ than there are linking $K$ to $V(G) - H$. Since "cliques" in social networks have usually been seen informally as sets of individuals more closely tied to each other than to outsiders, LS sets provide a natural realization of the "clique" concept. In this paper, it is shown that LS sets in social networks have cohesive properties that make them even more useful for empirical analyses. In particular, subgraphs induced by LS subsets remain connected even after several edges have been removed. Results bounding the number of edges that can be so removed are used to get an upper bound for the diameter of subgraphs induced by LS subsets.
1. Introduction
The identification of cohesive subsets of social networks has been an
important theme in social network analysis since Moreno (1934) first
introduced sociograms. Informally, they have usually been seen as
regions of a network for which the internal ties are more significant
than the external ties. Significance, in this sense, has been assessed in
many ways, depending both on the nature of the relationship generating the network and on the goal of the research. For example, we can
simply count the numbers of internal and external ties and compare
them, or we can compare the proportions of the potential internal and
external ties that have actually been achieved. On the other hand,
measures can be assigned to network links and these measures can be
used to compare the internal and external links. As an example, if the
links represent communication, such measures as number of communications sent or total communication
time have frequently been used
(Killworth and Bernard 1976).
* The research on which this paper is based was carried out under support from the National Science Foundation, Grant No. BNS 80-13507.
** Department of Mathematical Sciences, George Mason University, Fairfax, VA 22030, U.S.A.

0378-8733/83/$3.00 © 1983, Elsevier Science Publishers B.V. (North-Holland)
Once it has been determined how significance is to be assessed, it still remains to ask how the internal and external ties are to be compared. Statistical techniques (particularly factor analysis) have often been used to group the members of a population into relatively cohesive units (see Lankford (1974) for an overview of several statistical approaches).
Although sociologists have long used the term "clique" to refer to cohesive regions of networks, it was only in 1949 that Luce and Perry proposed that if the cohesive regions are taken to be maximal complete subgraphs (i.e., what graph theorists call cliques), then maximal internal cohesion will be attained, since all possible internal links will be present. This definition did not in general give rise to useful collections of cohesive subsets, primarily because most naturally occurring social networks contain few nontrivial Luce-Perry cliques. As a consequence, the definition given by Luce and Perry was generalized in several directions (Alba 1973; Luce 1950; Mokken 1979; Seidman and Foster 1978a). Although these generalizations have yielded cohesive subsets that have proved to be empirically useful (Foster 1980; Seidman and Foster 1978a), they all suffer from the defect that they focus attention almost exclusively on the internal ties, to the neglect of the ties linking the cohesive region to its complement.
In Seidman (1981b), it was proposed that the LS sets introduced by Luccio, Sami, and Lawler (Lawler 1973; Luccio and Sami 1969) would make excellent prototypes for cohesive subsets of social networks. Roughly, a set of nodes $S$ in a social network is an LS set if each of its proper subsets has more ties to its complement within $S$ than to the outside of $S$. This rather vague definition works equally well if the social network represents dyadic relationships between individuals, such as communication or friendship, or non-dyadic relationships among sets of individuals, such as committee membership or attendance at important social events (see Seidman (1981a) for a discussion of hypergraph models for non-dyadic networks). In addition, the edges in the network can be weighted, if necessary.
In this paper, it will be shown that although LS sets were defined by carefully comparing the internal and external ties of sets of nodes, LS sets in graphs also have strongly cohesive internal properties. In particular, the subgraphs induced by LS sets remain connected even after several ties have been removed. Upper bounds are obtained for the number of links that can safely be deleted. Since ties in empirical networks are likely to disappear over time, this "robustness" of LS sets makes them even more attractive candidates as models for cohesive subsets in dyadic social networks.

2. Definitions, notation and fundamental properties
All graphs will be assumed to be finite. If $G$ is a graph and $H \subset V(G)$, let $\alpha(H)$ be the number of edges of $G$ incident with both $H$ and $V(G) - H$ and let $G_H$ be the subgraph of $G$ induced by the vertices of $H$. $\lambda(G)$ will denote the edge-connectivity of the graph $G$, and $\delta(G)$ will denote the minimum degree of the vertices of $G$. If $x$ is a real number, $\lfloor x \rfloor$ and $\lceil x \rceil$ will denote the greatest integer $\le x$ and the least integer $\ge x$, respectively.
A subset $H$ of the vertex set of a graph $G$ is an LS set (Lawler 1973; Luccio and Sami 1969) if for any proper subset $K \subset H$, $\alpha(K) > \alpha(H)$. $V(G)$ is always a LS set, as is $\{x\}$ for each $x \in V(G)$. If $H$ is a LS set with $1 < |H| < |V(G)|$, $H$ is called a nontrivial LS set. Although this definition relates the number of internal and external ties in an interesting way, it is not clear how it can be made to fit into the cohesive-subset perspective outlined above. The necessary connection is provided by the following result (Seidman 1981b):
Theorem 2.1. Let $G$ be a graph. Then $H \subset V(G)$ is a LS set if and only if for any proper $K \subset H$, fewer edges join $K$ to $V(G) - H$ than join $K$ to $H - K$.
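The equivalence stated in Theorem 2.1 can be checked by brute force on small graphs. The following Python sketch is illustrative only: the edge-list representation, the function names and the two-triangle example graph are assumptions introduced here, not part of the paper, and "proper subset" is taken to mean a nonempty proper subset. It tests the definition $\alpha(K) > \alpha(H)$ and the edge-counting form of Theorem 2.1 independently of each other.

```python
from itertools import combinations

def alpha(edges, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum((u in S) != (v in S) for u, v in edges)

def proper_subsets(H):
    """All nonempty proper subsets of H."""
    H = sorted(H)
    for r in range(1, len(H)):
        for K in combinations(H, r):
            yield set(K)

def is_ls_set(edges, H):
    """Definition: alpha(K) > alpha(H) for every nonempty proper subset K of H."""
    a_H = alpha(edges, H)
    return all(alpha(edges, K) > a_H for K in proper_subsets(H))

def is_ls_set_thm21(edges, H):
    """Theorem 2.1 form: each K has fewer edges to V(G) - H than to H - K."""
    H = set(H)
    for K in proper_subsets(H):
        rest = H - K
        inside = sum((u in K and v in rest) or (v in K and u in rest)
                     for u, v in edges)
        outside = sum((u in K and v not in H) or (v in K and u not in H)
                      for u, v in edges)
        if outside >= inside:
            return False
    return True

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
print(is_ls_set(edges, {0, 1, 2}), is_ls_set_thm21(edges, {0, 1, 2}))  # True True
print(is_ls_set(edges, {1, 2, 3}), is_ls_set_thm21(edges, {1, 2, 3}))  # False False
```

Both tests enumerate every subset of $H$, so the sketch is only practical for sets of a dozen or so vertices.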
As a consequence of this result, we see that LS sets can be defined as subsets which have (in a precisely defined sense) more internal links than external links. Although cliques and their generalizations can overlap in complex patterns, the overlap among LS sets is far more tightly constrained. In particular, the following result of Luccio and Sami (Lawler 1973; Luccio and Sami 1969; Seidman 1981b) follows easily from Theorem 2.1.
Theorem 2.2. Let $H$ and $K$ be LS subsets of the vertex set of a graph $G$. Then if $H \cap K \ne \emptyset$, either $H \subset K$ or $K \subset H$.

Extensions of these results to hypergraphs (with weighted or unweighted edges) can be found in Seidman (1981b). It should also be noted that these results depend heavily on the fact that the property $\alpha(K) > \alpha(H)$ must hold for all proper subsets $K$ of $H$. If a $k$-LS set is defined to be a set $H \subset V(G)$ for which $\alpha(K) > \alpha(H)$ for any proper $K \subset H$ with $|K| \le k$, Theorems 2.1 and 2.2 do not hold for such sets.
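The nesting property of Theorem 2.2 can be observed directly by listing every LS set of a small graph. The sketch below is a brute-force illustration under stated assumptions: the edge-list representation, the helper names and the two-triangle example graph are introduced here and do not come from the paper, and only nonempty proper subsets are tested.

```python
from itertools import combinations

def alpha(edges, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum((u in S) != (v in S) for u, v in edges)

def is_ls_set(edges, H):
    """alpha(K) > alpha(H) for every nonempty proper subset K of H."""
    H = sorted(H)
    a_H = alpha(edges, H)
    return all(alpha(edges, K) > a_H
               for r in range(1, len(H))
               for K in combinations(H, r))

def all_ls_sets(vertices, edges):
    """Every nonempty vertex subset that passes the LS test (exponential time)."""
    vertices = sorted(vertices)
    return [frozenset(H)
            for r in range(1, len(vertices) + 1)
            for H in combinations(vertices, r)
            if is_ls_set(edges, H)]

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
ls_sets = all_ls_sets(range(6), edges)

# Theorem 2.2: any two LS sets are either disjoint or nested.
nested_or_disjoint = all(not (A & B) or A <= B or B <= A
                         for A, B in combinations(ls_sets, 2))
print(sorted(map(sorted, ls_sets)))
print(nested_or_disjoint)
```

For this graph the LS sets found are the six singletons, the two triangles and the whole vertex set, which form a nested family as the theorem requires.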
3. Internal cohesion of LS sets

If $H$ is a LS subset of $V(G)$, the structural properties of the induced subgraph $G_H$ must arise from the interplay of parameters of $H$ with the fact that $H$ is a LS subset. The most obviously relevant parameters of $H$ are $|H|$ and $\alpha(H)$, which are related (for LS subsets) by the following result.

Proposition 3.1. If $H$ is a nontrivial LS subset of $V(G)$, then $|H| > \alpha(H) + 1$.
Proof. Let $x_0$ be the member of $H$ that is adjacent to the fewest points of $V(G) - H$; $x_0$ can be adjacent to at most $\lfloor \alpha(H)/|H| \rfloor$ such points. It then follows that

$$\alpha(\{x_0\}) = \deg(x_0) \le \lfloor \alpha(H)/|H| \rfloor + |H| - 1.$$

Since $H$ is a LS set, $\alpha(\{x_0\}) > \alpha(H)$, so that $\lfloor \alpha(H)/|H| \rfloor + |H| - 1 > \alpha(H)$. As a consequence, $\alpha(H)/|H| + |H| - 1 > \alpha(H)$, and this inequality can be transformed to $\alpha(H)(1 - |H|)/|H| > 1 - |H|$. Since $H$ is nontrivial, $|H| > 1$, and we can conclude that $|H| > \alpha(H)$. But if $|H| = \alpha(H) + 1$, there must exist $x \in H$ adjacent to no point of $V(G) - H$. Then $\alpha(\{x\}) \le |H| - 1 = \alpha(H)$, which contradicts the assumption that $H$ is a LS set.
It is easy to construct examples of graphs containing LS subsets $H$ with $\alpha(H) = k$ and $|H| = k + 2$ for any $k \ge 1$, so that Proposition 3.1 gives a best possible bound. Although this result does not directly address the internal cohesion of LS sets, it is a useful tool in the formulation of other results and the construction of examples.
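One family of graphs attaining the bound of Proposition 3.1 can be sketched as follows. This construction is an assumption made here for illustration (the paper does not say which examples the author had in mind): take $H$ to be a complete graph on $k + 2$ vertices and join a single outside vertex to $k$ of them. The helper names and the edge-list representation are likewise hypothetical.

```python
from itertools import combinations

def alpha(edges, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum((u in S) != (v in S) for u, v in edges)

def is_ls_set(edges, H):
    """alpha(K) > alpha(H) for every nonempty proper subset K of H."""
    H = sorted(H)
    a_H = alpha(edges, H)
    return all(alpha(edges, K) > a_H
               for r in range(1, len(H))
               for K in combinations(H, r))

def tight_example(k):
    """Complete graph on H = {0, ..., k+1}; outside vertex k+2 joined to k members of H."""
    H = list(range(k + 2))
    outside = k + 2
    edges = list(combinations(H, 2)) + [(v, outside) for v in range(k)]
    return edges, set(H)

for k in range(1, 5):
    edges, H = tight_example(k)
    # Expect: H is an LS set with alpha(H) = k and |H| = k + 2.
    print(k, is_ls_set(edges, H), alpha(edges, H), len(H))
```

In this family a subset $K$ of size $j$ has $j(k + 2 - j) \ge k + 1$ edges to $H - K$, which already exceeds $\alpha(H) = k$, so the LS condition holds with $|H| = \alpha(H) + 2$.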
The next result shows that the subgraph induced by a LS subset will remain connected even after the removal of many of its edges.
Proposition 3.2. If $H$ is a nontrivial LS subset of $V(G)$, then $\lambda(G_H) > \lceil \alpha(H)/2 \rceil$.

Proof. Let $\lambda(G_H) = l$, and let $L$ be a set of $l$ edges whose removal disconnects $G_H$. If these edges are removed, $G_H$ splits into two components, with vertex sets $H_1$ and $H_2$, and all edges of $L$ join points of $H_1$ to points of $H_2$. Then $\alpha(H_1) = k_1 + l$ and $\alpha(H_2) = k_2 + l$, where $k_1 + k_2 = \alpha(H)$, $k_1 \ge 0$, $k_2 \ge 0$. But then we can suppose that $k_1 \le \lfloor \alpha(H)/2 \rfloor$, so that $\alpha(H_1) \le \lfloor \alpha(H)/2 \rfloor + l$. Since $H$ is a LS set and $H_1$ is a proper subset of $H$, $\alpha(H_1) > \alpha(H)$, so that we must have $\lfloor \alpha(H)/2 \rfloor + l > \alpha(H)$, which implies that $\lambda(G_H) = l > \lceil \alpha(H)/2 \rceil$.
It is easy to find examples of LS sets $H$ with $\lambda(G_H) = \lceil \alpha(H)/2 \rceil + 1$ for any value of $\alpha(H)$, so that the bound given here is best possible. Since for any nontrivial LS subset $H$ we must have $\alpha(H) \ge 1$, we can immediately draw the following corollary:

Corollary 3.3. For any nontrivial LS subset $H \subset V(G)$, $\lambda(G_H) \ge 2$.
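The edge-connectivity bound can be confirmed numerically on a small case. The sketch below is a brute-force illustration, assuming a simple graph given as an edge list; the function names and the two-triangle example are hypothetical, and the bound is read as $\lambda(G_H) \ge \lceil \alpha(H)/2 \rceil + 1$. It computes $\lambda(G_H)$ by trying every set of induced edges in increasing size until the induced subgraph falls apart.

```python
from itertools import combinations

def alpha(edges, S):
    """Number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum((u in S) != (v in S) for u, v in edges)

def connected(vertices, edges):
    """Depth-first search test for connectivity on the given vertex set."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x] - seen:
            seen.add(y)
            stack.append(y)
    return seen == vertices

def induced_edge_connectivity(edges, H):
    """Fewest edges of G_H whose removal disconnects G_H (exponential search)."""
    H = set(H)
    inner = [(u, v) for u, v in edges if u in H and v in H]
    for r in range(len(inner) + 1):
        for removed in combinations(inner, r):
            kept = [e for e in inner if e not in removed]
            if not connected(H, kept):
                return r
    return len(inner)  # reached only when H is a single vertex

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
H = {0, 1, 2}
a = alpha(edges, H)
lam = induced_edge_connectivity(edges, H)
print(a, lam, lam >= (a + 1) // 2 + 1)  # 1 2 True
```

Here the triangle has $\alpha(H) = 1$ and $\lambda(G_H) = 2$, so the bound holds with equality and one edge can be removed without disconnecting $G_H$.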
In general, then, LS sets are relatively insensitive to the removal of edges, because at least one edge can always be removed without disconnecting $G_H$. Such stability is important if cohesive subsets are to be used in empirical analyses. For example, if the links represent mutual communication, links may be removed (or made far less significant) because some pair of individuals may no longer live close enough for easy communication. A cohesive subset is likely to be longer-lasting (and as a consequence more salient and/or more effective) if it is still connected even after some links are removed. Analogous arguments have been made for generalized cliques ($k$-plexes) with regard to the removal of points (Seidman and Foster 1978a: Corollary 5; Seidman and Foster 1978b: 68). Unfortunately, subgraphs induced by LS subsets can have cutpoints and thus can be quite sensitive to the removal of points. On the other hand, edge-connectivity does not seem to be useful in the characterization of the internal structure of $k$-plexes. It seems clear that this distinction should play a role in determining whether $k$-plexes or LS sets are used in the analysis of an empirical social network.
We can now ask how the bound that we have obtained for $\lambda(G_H)$ can be used to get further structural information on $G_H$. Since for any graph $G$, $\delta(G) \ge \lambda(G)$ (Harary 1969: 43), we have immediately

Corollary 3.4. For any nontrivial LS subset $H \subset V(G)$, $\delta(G_H) > \lceil \alpha(H)/2 \rceil$.
If we again consider the empirical example of a network generated by mutual communication, it is important to know how many links may be needed to transmit a message between any two members of a cohesive subset. Thus cohesive subsets with small diameter may prove to be more useful in applications, and as a consequence the construction of cohesive subsets has often (directly or indirectly) involved diameter. The clique concept introduced by Luce and Perry (1949) yielded subsets of diameter 1, and the extensions due to Luce (1950), Alba (1973) and Mokken (1979) directly involved bounds on the diameter. Alternatively, the present author has shown in a paper with Brian Foster (Seidman and Foster 1978a) that $k$-plexes with $n$ points must have diameter $\le 2$ if $k < (n + 2)/2$. We will be able to use the bound on edge-connectivity given in Proposition 3.2 to obtain upper bounds on the diameter of the subgraphs induced by LS subsets of a graph.
If we use the bound on minimum degree given in Corollary 3.4, we can immediately get an upper bound on diameter. To do so, we first state a result that gives an upper bound on diameter as a function of the number of points and the minimum degree.
Proposition 3.5 (Kramer 1972). If $G$ is a connected graph with $n$ points and minimum degree $\delta \ge 1$, then $\mathrm{diam}(G) \le 3\lfloor n/(\delta + 1) \rfloor - 1$.
It should be noted that a slightly better result (actually a best possible result, as a function of $n$ and $\delta$) can be obtained if attention is paid to the remainder obtained when $n$ is divided by $\delta + 1$. Since it will not enable us to improve the diameter bound for subgraphs induced by LS sets, it will be omitted here.

Kramer's result, along with Corollary 3.4, immediately yields an upper bound on diameter.
Corollary 3.6. If $H$ is a nontrivial LS subset of $V(G)$, then

$$\mathrm{diam}(G_H) \le 3\left\lfloor \frac{|H|}{\lceil \alpha(H)/2 \rceil + 2} \right\rfloor - 1.$$
This bound suffers from the disadvantage that it only uses the edge-connectivity indirectly. It is thus not surprising that it is not a best possible bound. For example, if we consider LS sets $H$ with $|H| = 8$ and $\alpha(H) = 2$, Corollary 3.6 would imply that $\mathrm{diam}(G_H) \le 5$, but we will see below that a best possible bound of 4 can be obtained for the diameter in this case. In order to attain better bounds, we will try to use the edge-connectivity directly, just as Kramer's result uses the minimum degree directly. We will first prove a lemma which gives a lower bound on the number of vertices that a graph of given edge-connectivity and diameter must have.
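The value 5 quoted above can be reproduced with a few lines of integer arithmetic. This is a minimal sketch, assuming Corollary 3.6 is read as $\mathrm{diam}(G_H) \le 3\lfloor |H| / (\lceil \alpha(H)/2 \rceil + 2) \rfloor - 1$; the function name is hypothetical.

```python
def kramer_ls_bound(size_h, alpha_h):
    """Diameter bound from minimum degree (assumed reading of Corollary 3.6)."""
    # Corollary 3.4: the minimum degree of G_H is at least ceil(alpha/2) + 1.
    min_degree = (alpha_h + 1) // 2 + 1
    # Proposition 3.5 (Kramer): diam <= 3 * floor(n / (delta + 1)) - 1.
    return 3 * (size_h // (min_degree + 1)) - 1

print(kramer_ls_bound(8, 2))  # 5
```

With $|H| = 8$ and $\alpha(H) = 2$ the minimum degree is at least 2, so the bound is $3\lfloor 8/3 \rfloor - 1 = 5$, matching the figure in the text.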
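Lemma 3.7. If $G$ is a connected graph with $n$ points, diameter $d$ and edge-connectivity $\lambda$, then

$$n \ge \begin{cases} \dfrac{d + 1}{2} \lceil 2\lambda^{1/2} \rceil & \text{if } d \text{ is odd,} \\[2ex] \dfrac{d}{2} \lceil 2\lambda^{1/2} \rceil + 1 & \text{if } d \text{ is even.} \end{cases}$$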
Proof. There exists a path of length $d$ from $u$ to $v$, for some $u, v \in V(G)$. Let $S_i$ be the set of points in $G$ at distance $i$ from $u$. Suppose first that $d$ is odd. Let $T_i = S_{2i} \cup S_{2i+1}$, for $i = 0, 1, \ldots, (d - 1)/2$. By the edge version of Menger's theorem (Harary 1969: 50), there are $\lambda$ edge-disjoint paths from $u$ to $v$, and these paths must yield $\lambda$ distinct edges joining points of $S_{2i}$ to $S_{2i+1}$ for each $i$. Suppose now that (for some $i$) $|T_i| = p$. It is easy to see that if $p^2 < 4\lambda$, $\lambda$ distinct edges of this type cannot exist. Thus $p^2 \ge 4\lambda$, so that $|T_i| \ge \lceil 2\lambda^{1/2} \rceil$ for each $i$. It follows easily that $n \ge [(d - 1)/2] \lceil 2\lambda^{1/2} \rceil + \lceil 2\lambda^{1/2} \rceil$, since each of the sets $T_0, \ldots, T_{(d-3)/2}$ gives rise to two consecutive path edges, while $T_{(d-1)/2}$ may give rise to only one. We conclude that $n \ge [(d + 1)/2] \lceil 2\lambda^{1/2} \rceil$.

If $d$ is even, we construct the sets $T_i$ as above for $i = 0, 1, \ldots, (d - 2)/2$. Just as above, $|T_i| \ge \lceil 2\lambda^{1/2} \rceil$ for each $i$, and clearly $|S_d| \ge 1$. We conclude that $n \ge (d/2) \lceil 2\lambda^{1/2} \rceil + 1$.
From this lemma, we immediately obtain two upper bounds for the diameter of a graph with $n$ points and edge-connectivity $\lambda$:

$$d_1 = \frac{2n}{\lceil 2\lambda^{1/2} \rceil} - 1 \quad \text{and} \quad d_2 = \frac{2n}{\lceil 2\lambda^{1/2} \rceil} - \frac{2}{\lceil 2\lambda^{1/2} \rceil}.$$

Since $\lambda \ge 1$, $d_2 \ge d_1$ for all $n$ and $\lambda$. Let $\lfloor x \rfloor_{\mathrm{even}}$ denote the largest even integer $\le x$. We then immediately obtain the following result.
Proposition 3.8. If $G$ is a connected graph with $n$ points and edge-connectivity $\lambda$, then $\mathrm{diam}(G) \le \max(\lfloor d_1 \rfloor, \lfloor d_2 \rfloor_{\mathrm{even}})$, where $d_1$, $d_2$ are defined as above.
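Although this bound will not be best possible for all values of $n$ and $\lambda$, it does give a best possible bound for small values of $\lambda$. In particular,

if $\lambda = 1$ it yields $\mathrm{diam}(G) \le n - 1$;

if $\lambda = 2$ it yields
$$\mathrm{diam}(G) \le \begin{cases} \lfloor 2n/3 \rfloor - 1 & \text{if } n \equiv 0 \text{ or } 2 \pmod 3, \\ 2(n - 1)/3 & \text{if } n \equiv 1 \pmod 3; \end{cases}$$

if $\lambda = 3$ it yields
$$\mathrm{diam}(G) \le \begin{cases} 2\lfloor n/4 \rfloor & \text{if } n \equiv 1, 2, \text{ or } 3 \pmod 4, \\ n/2 - 1 & \text{if } n \equiv 0 \pmod 4; \end{cases}$$

and all of these bounds are best possible.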
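The bound of Proposition 3.8 is easy to tabulate. The following sketch assumes the reading $d_1 = 2n/c - 1$ and $d_2 = 2n/c - 2/c$ with $c = \lceil 2\lambda^{1/2} \rceil$, and takes the bound to be $\max(\lfloor d_1 \rfloor, \lfloor d_2 \rfloor_{\mathrm{even}})$; the function names are hypothetical. Integer arithmetic is used throughout so that no floating-point rounding enters.

```python
from math import isqrt

def ceil_two_sqrt(lam):
    """ceil(2 * sqrt(lam)), computed exactly as the least c with c*c >= 4*lam."""
    r = isqrt(4 * lam)
    return r if r * r == 4 * lam else r + 1

def diameter_bound(n, lam):
    """max(floor(d1), largest even integer <= d2) for n points, edge-connectivity lam."""
    c = ceil_two_sqrt(lam)
    d1 = (2 * n) // c - 1        # floor(2n/c - 1)
    d2 = (2 * n - 2) // c        # floor(2n/c - 2/c)
    d2_even = d2 - (d2 % 2)      # round down to an even integer
    return max(d1, d2_even)

# Tabulate the bound for the small edge-connectivities discussed in the text.
for lam in (1, 2, 3):
    print(lam, [diameter_bound(n, lam) for n in range(4, 13)])
```

For $\lambda = 1$ the function returns $n - 1$, which is attained by a path on $n$ points; for $\lambda = 2$ and $\lambda = 3$ the values depend on $n$ modulo 3 and modulo 4 respectively, because $c$ equals 3 and 4 in those cases.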
It is interesting to compare Proposition 3.8 with analogous results for which the minimum degree or connectivity of $G$ is specified, rather than the edge-connectivity. If the minimum degree is given, Proposition 3.5 gives a result that can easily be made to yield a best possible bound. If the connectivity $\kappa$ is given, a result of Watkins (1967) gives $\mathrm{diam}(G) \le \lfloor (n - 2)/\kappa \rfloor + 1$, and this bound is also best possible. It would be very interesting and useful to have a best possible bound for $\mathrm{diam}(G)$ as a function of $n$ and $\lambda$, and Proposition 3.8 is a partial step in that direction.
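Just as Corollary 3.6 was obtained from Proposition 3.5, we can use Proposition 3.8 to bound the diameter of induced subgraphs of LS subsets of $V(G)$. From Proposition 3.2, we know that $\lambda(G_H) \ge \lceil \alpha(H)/2 \rceil + 1$. If we then define

$$b(H) = \left\lceil 2\left[\lceil \alpha(H)/2 \rceil + 1\right]^{1/2} \right\rceil, \quad d_1(H) = \frac{2|H|}{b(H)} - 1 \quad \text{and} \quad d_2(H) = \frac{2|H|}{b(H)} - \frac{2}{b(H)},$$

we can state

Corollary 3.9. If $H$ is a LS subset of $V(G)$, then $\mathrm{diam}(G_H) \le \max\{\lfloor d_1(H) \rfloor, \lfloor d_2(H) \rfloor_{\mathrm{even}}\}$.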
If we combine this result with the comments following Proposition 3.8, it follows that we have obtained best possible diameter bounds for LS subsets $H$ with $\alpha(H) \le 4$.
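The earlier example with $|H| = 8$ and $\alpha(H) = 2$ can be reworked with Corollary 3.9. The sketch below assumes the reading $b(H) = \lceil 2[\lceil \alpha(H)/2 \rceil + 1]^{1/2} \rceil$, $d_1(H) = 2|H|/b(H) - 1$ and $d_2(H) = 2|H|/b(H) - 2/b(H)$; the function name is hypothetical.

```python
from math import isqrt

def ls_diameter_bound(size_h, alpha_h):
    """max(floor(d1(H)), largest even integer <= d2(H)), assumed reading of Corollary 3.9."""
    lam = (alpha_h + 1) // 2 + 1              # lower bound on edge-connectivity of G_H
    r = isqrt(4 * lam)
    b = r if r * r == 4 * lam else r + 1      # b(H) = ceil(2 * sqrt(lam)), computed exactly
    d1 = (2 * size_h) // b - 1                # floor(d1(H))
    d2 = (2 * size_h - 2) // b                # floor(d2(H))
    return max(d1, d2 - (d2 % 2))

print(ls_diameter_bound(8, 2))  # 4
```

For this case $b(H) = 3$, so $\lfloor d_1(H) \rfloor = 4$ and $\lfloor d_2(H) \rfloor_{\mathrm{even}} = 4$, giving the bound of 4 that the text contrasts with the value 5 obtained from Corollary 3.6.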
The preceding results have used edge-connectivity as a tool to investigate the internal structure of subgraphs induced by LS subsets of graphs. It is clear that LS subsets cannot be characterized in general by edge-connectivity alone, although it is easy to see that if $\lambda(G_H) > \alpha(H)$, then $H$ must be a LS subset of $V(G)$. It would be very interesting to identify graph-theoretic properties that are characteristic of subgraphs induced by LS subsets $H$ for which the induced edge-connectivity is in the range $\lceil \alpha(H)/2 \rceil + 1 \le \lambda(G_H) \le \alpha(H)$.
4. Conclusions

The results presented in the preceding section demonstrate that if LS sets are used to identify cohesive subsets of a population with respect to some (symmetric) dyadic relationship, the subgraphs induced by the LS sets will be rather strongly cohesive. In particular, they will be relatively insensitive to the removal of individual links, remaining connected even after several links have been removed. LS sets were first proposed as models of cohesive subsets in networks because their definition (and Theorem 2.1) implied that they had more internal than external ties, thus embodying an intuitive aspect of cohesiveness that has rarely been explicitly used. The "robustness" that has been demonstrated here for LS sets makes them even more attractive for empirical analyses. A final judgement as to the utility of LS sets for empirical analyses must await a test with real data, and computer software to make such a test possible is presently being developed.

One characteristic of LS sets that makes them particularly attractive is that they can be used both for dyadic and non-dyadic social networks. Unfortunately, the results on internal cohesion presented here do not extend naturally to LS sets in hypergraphs, unless the edge structure of the hypergraphs is rather rigidly constrained. Such constraints do not seem to lend themselves to the situations in which hypergraph models are naturally applied (see Seidman (1981a) for examples). For the present, then, applications relying on the internal cohesiveness of LS sets must be restricted to dyadic networks.
References

Alba, R.D.
1973 "A graph-theoretic definition of a sociometric clique". Journal of Mathematical Sociology 3: 113-126.
Foster, B.L.
1980 "Minority traders in Thai village social networks". Ethnic Groups 2: 221-240.
Harary, F.
1969 Graph Theory. Reading, Mass.: Addison-Wesley.
Killworth, P.D. and Bernard, H.R.
1976 "Informant accuracy in social network data". Human Organization 35: 269-296.
Kramer, F.
1972 "Relations between the diameter of a graph and the degrees of the nodes of a graph". Rev. Anal. Numer. Teoria Aproximatiei 1: 125-131.
Lankford, P.
1974 "Comparative analysis of clique identification methods". Sociometry 37: 287-305.
Lawler, E.L.
1973 "Cutsets and partitions of hypergraphs". Networks 3: 275-285.
Luccio, F. and Sami, M.
1969 "On the decomposition of networks into minimally interconnected networks". IEEE Transactions on Circuit Theory CT-16: 184-188.
Luce, R.D.
1950 "Connectivity and generalized cliques in sociometric group structure". Psychometrika 15: 169-190.
Luce, R.D. and Perry, A.
1949 "A method of matrix analysis of group structure". Psychometrika 14: 94-116.
Mokken, R.J.
1979 "Cliques, clubs and clans". Quality and Quantity 13: 161-173.
Moreno, J.L.
1934 Who Shall Survive? Washington, D.C.: Nervous and Mental Disease Publishing.
Seidman, S.B.
1981a "Structures induced by collections of subsets: a hypergraph approach". Mathematical Social Sciences 1: 381-396.
1981b "LS sets as cohesive subsets of graphs and hypergraphs". Paper presented at the SIAM conference on the applications of discrete mathematics, Troy, NY, 1981.
Seidman, S.B. and Foster, B.L.
1978a "A graph-theoretic generalization of the clique concept". Journal of Mathematical Sociology 6: 139-154.
1978b "A note on the potential for genuine cross-fertilization between anthropology and mathematics". Social Networks 1: 65-72.
Watkins, M.E.
1967 "A lower bound for the number of vertices of a graph". American Mathematical Monthly 74: 297.