Analyzing Shakespeare’s Plays From A Network Perspective

Vikas Thotakuri and Sanjukta Bhowmick
Department of Computer Science, University of Nebraska Omaha
INTRODUCTION
Social networks are generally modeled on only one
type of relation. The groups studied are open-ended,
and the time frame may not cover significant events
and their effects.
How would the analysis change if we modeled the
interactions and relationships of a closed group
over significant incidents?
Real-life data is difficult to obtain because of the
time commitment and privacy constraints.
[Figure: character rankings for Romeo and Juliet (Romeo, Juliet, Mercutio, Tybalt, Capulet, Friar, Nurse, Benvolio) and Macbeth (Macbeth, Lady Macbeth, Banquo, Macduff, Malcolm, Duncan, Ross, Lennox)]
Next best option: analyze fiction, which gives
an indication of social relations.

--------------SCENE I.--------------
[ORLANDO, ADAM] : 32
[ORLANDO, ADAM, OLIVER] : 74
[DENNIS, OLIVER] : 11
[CHARLES, OLIVER] : 78
--------End of SCENE---------------------
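
A minimal sketch of how a scene log in this format could be turned into weighted, undirected edges. The poster does not say how the line count of an entry with three or more characters is split, so the sketch assumes every pair in the entry receives the full count. The names SCENE_LOG, ENTRY, and interaction_edges are hypothetical.

```python
import re
from collections import defaultdict
from itertools import combinations

SCENE_LOG = """\
--------------SCENE I.--------------
[ORLANDO, ADAM] : 32
[ORLANDO, ADAM, OLIVER] : 74
[DENNIS, OLIVER] : 11
[CHARLES, OLIVER] : 78
--------End of SCENE---------------------
"""

# Matches entries such as "[ORLANDO, ADAM] : 32".
ENTRY = re.compile(r"^\[(?P<names>[^\]]+)\]\s*:\s*(?P<lines>\d+)$")


def interaction_edges(log_text):
    """Return {frozenset({a, b}): weight} for an undirected network.

    Assumption: each pair of characters listed in an entry receives
    that entry's full line count.
    """
    weights = defaultdict(int)
    for raw in log_text.splitlines():
        match = ENTRY.match(raw.strip())
        if match is None:
            continue  # scene delimiters and blank lines
        names = sorted({name.strip() for name in match["names"].split(",")})
        for a, b in combinations(names, 2):
            weights[frozenset((a, b))] += int(match["lines"])
    return dict(weights)


edges = interaction_edges(SCENE_LOG)
for pair, weight in sorted(edges.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), weight)
```

Under that assumption ORLANDO and ADAM accumulate weight from both entries in which they appear together.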
Interaction Networks. Connect two characters if they appear in the same scene. Edge weight is the number of lines spoken. Undirected.
Metrics:
Degree: Number of distinct characters who share a scene with the character
Betweenness Centrality: How strongly the character connects groups of characters that otherwise rarely interact
Eigenvector Centrality: Influence of the character, based on the number of lines spoken
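
The three metrics can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' implementation: degree counts distinct neighbors, betweenness uses Brandes' algorithm on unweighted hop counts, and eigenvector centrality uses a shifted power iteration over the line-count weights. The edge weights below assume each pair in a scene entry receives that entry's full line count; all function names are hypothetical.

```python
from collections import deque

# Undirected edges: character pair -> lines spoken while sharing the stage.
# Weights assume each pair in a scene entry gets the entry's full count.
EDGES = {
    ("ORLANDO", "ADAM"): 106,
    ("ORLANDO", "OLIVER"): 74,
    ("ADAM", "OLIVER"): 74,
    ("DENNIS", "OLIVER"): 11,
    ("CHARLES", "OLIVER"): 78,
}


def build_adjacency(edges):
    adj = {}
    for (a, b), weight in edges.items():
        adj.setdefault(a, {})[b] = weight
        adj.setdefault(b, {})[a] = weight
    return adj


def degree(adj):
    """Number of distinct characters sharing a scene with each character."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def betweenness(adj):
    """Brandes' algorithm on hop counts (edge weights ignored)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in score.items()}


def eigenvector(adj, iterations=200):
    """Power iteration on (A + I); the shift avoids oscillation."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(w * x[u] for u, w in adj[v].items()) for v in adj}
        norm = sum(value * value for value in nxt.values()) ** 0.5
        x = {v: value / norm for v, value in nxt.items()}
    return x


adj = build_adjacency(EDGES)
deg, btw, eig = degree(adj), betweenness(adj), eigenvector(adj)
for name in sorted(adj, key=deg.get, reverse=True):
    print(f"{name:8s} degree={deg[name]} "
          f"betweenness={btw[name]:.1f} eigenvector={eig[name]:.3f}")
```

In this small example OLIVER is the only character on shortest paths between others, which is the "connector" behaviour betweenness is meant to capture.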
[Figure: character rankings for The Merchant of Venice (Shylock, Antonio, Portia, Jessica, Bassanio, Lorenzo, Gratiano, Launcelot, Nerissa), Much Ado About Nothing (Beatrice, Hero, Margaret, Claudio, Leonato, Benedick, Don Pedro), and As You Like It (Orlando, Touchstone, Rosalind, Duke Senior, Jaques, Celia, Oliver, Silvius)]
Observations. Relative rank of characters is not consistent across the three metrics
Characters known to be important do not always have high ranks, and vice versa
Female characters have consistently low rank, except when they are disguised as men
KEY OBSERVATIONS
SHAKESPEAREʼS PLAYS
Well studied. Both ground truth and controversy
available
Different types of characters, not all equally
important
--------------SCENE I.--------------
ORLANDO : [ADAM, JAQUES]
ORLANDO : [ADAM]
OLIVER : [CHARLES]
OLIVER : [CHARLES]
OLIVER : [ROSALIND]
CHARLES : [ORLANDO]
OLIVER : [CHARLES]
--------End of SCENE---------------------
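
A minimal sketch of parsing a mention log like the one above into directed, weighted edges. It assumes the name before the colon is the speaker and the bracketed names are the characters mentioned in that speech, with each occurrence counting as one mention; the poster does not spell this out. MENTION_LOG, ENTRY, and mention_edges are hypothetical names.

```python
import re
from collections import Counter

MENTION_LOG = """\
--------------SCENE I.--------------
ORLANDO : [ADAM, JAQUES]
ORLANDO : [ADAM]
OLIVER : [CHARLES]
OLIVER : [CHARLES]
OLIVER : [ROSALIND]
CHARLES : [ORLANDO]
OLIVER : [CHARLES]
--------End of SCENE---------------------
"""

# Matches entries such as "ORLANDO : [ADAM, JAQUES]".
ENTRY = re.compile(r"^(?P<speaker>[A-Z ]+?)\s*:\s*\[(?P<mentioned>[^\]]*)\]$")


def mention_edges(log_text):
    """Return Counter {(speaker, mentioned): number of mentions}.

    Assumption: the left-hand name is the speaker and every bracketed
    name counts as one mention by that speaker.
    """
    counts = Counter()
    for raw in log_text.splitlines():
        match = ENTRY.match(raw.strip())
        if match is None:
            continue  # scene delimiters and blank lines
        speaker = match["speaker"].strip()
        for name in match["mentioned"].split(","):
            name = name.strip()
            if name and name != speaker:
                counts[(speaker, name)] += 1
    return counts


mentions = mention_edges(MENTION_LOG)
for (speaker, mentioned), count in sorted(mentions.items()):
    print(f"{speaker} -> {mentioned}: {count}")
```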
Mentioning Networks. Connect two characters if one is mentioned by the other. Edge weight is the number of mentions. Directed.
Metrics:
Weighted In-Degree: Number of times the character is mentioned
Weighted Out-Degree: Number of times the character mentions someone
PageRank: Importance based on the number of times the character is mentioned
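
The mentioning-network metrics can be sketched as follows. The PageRank variant here is an assumption: a weighted power iteration with damping factor 0.85, where characters who mention nobody spread their rank uniformly. The poster does not state which variant or parameters were used, and the function names are hypothetical.

```python
# Directed edges: (speaker, mentioned) -> number of mentions,
# matching the sample mention log.
MENTIONS = {
    ("ORLANDO", "ADAM"): 2,
    ("ORLANDO", "JAQUES"): 1,
    ("OLIVER", "CHARLES"): 3,
    ("OLIVER", "ROSALIND"): 1,
    ("CHARLES", "ORLANDO"): 1,
}


def weighted_degrees(edges):
    """Return (in_degree, out_degree) dictionaries of summed edge weights."""
    nodes = {name for pair in edges for name in pair}
    in_deg = {name: 0 for name in nodes}
    out_deg = {name: 0 for name in nodes}
    for (src, dst), weight in edges.items():
        out_deg[src] += weight
        in_deg[dst] += weight
    return in_deg, out_deg


def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration (damping value is an assumption)."""
    _, out_deg = weighted_degrees(edges)
    nodes = sorted(out_deg)
    n = len(nodes)
    rank = {name: 1.0 / n for name in nodes}
    for _ in range(iterations):
        # Rank held by characters who mention nobody is shared uniformly.
        dangling = sum(rank[name] for name in nodes if out_deg[name] == 0)
        nxt = {name: (1 - damping) / n + damping * dangling / n for name in nodes}
        for (src, dst), weight in edges.items():
            nxt[dst] += damping * rank[src] * weight / out_deg[src]
        rank = nxt
    return rank


in_deg, out_deg = weighted_degrees(MENTIONS)
pr = pagerank(MENTIONS)
for name in sorted(pr, key=pr.get, reverse=True):
    print(f"{name:9s} in={in_deg[name]} out={out_deg[name]} pagerank={pr[name]:.3f}")
```

In this sample OLIVER mentions others most often but is never mentioned, so he has the highest weighted out-degree and the lowest PageRank.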
Different types of scenarios. Many significant events
Represent both social patterns and narrative
patterns
Open-source formatted text available
http://shakespeare.mit.edu
Can we identify important characters in the
plays based on network properties?
[Figure: character rankings for Romeo and Juliet (Juliet, Romeo, Capulet, Nurse, Benvolio, Tybalt, Mercutio), Macbeth (Ross, Lennox, Macbeth, Banquo, Malcolm, Lady Macbeth, Macduff, Duncan), and The Merchant of Venice (Bassanio, Portia, Antonio, Lorenzo, Shylock, Gratiano, Jessica, Nerissa, Launcelot)]
Mentioning network shows the plot context
(important people are mentioned more often; socially
peripheral people are mentioned less often; the female
protagonist is mentioned often in romantic stories)
False negatives come up for hidden influences:
Lady Macbeth, Shylock, Celia
Interaction network shows the social context
(women are less important; messengers have high
betweenness centrality)
[Figure: character rankings for Much Ado About Nothing (Claudio, Benedick, Beatrice, Margaret, Hero, Leonato) and As You Like It (Jaques, Oliver, Silvius, Rosalind, Orlando, Celia, Touchstone)]
TAKEAWAYS
Multiple relations provide a more accurate picture of
the social network
Observations. Relative rank of characters is not consistent across the three metrics
Important characters generally have high rank, particularly female characters in romantic plots
Important characters have low rank if they are outside the social sphere (non-conformity)
More interaction is not necessarily a measure of
importance
The distribution of interaction and importance shows
the pattern of social relations
Synthesis. Rank characters on each metric (higher value = larger rank).
Compute the average of the three metrics for each type of network
Three categories: High (top 30%), Low (bottom 30%), Medium (the rest)
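
The synthesis step can be sketched as below. Several details are assumptions, since the poster gives only the outline: ties share the mean of their rank positions, the quantity averaged is the per-metric rank, and the 30% cut-off is rounded to the nearest whole character. The metric values are made-up illustrative numbers, and the character labels A to J are placeholders, not results from the plays.

```python
def ranks(values):
    """Rank 1 = smallest value; tied characters share the mean rank."""
    ordered = sorted(values, key=values.get)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and values[ordered[j + 1]] == values[ordered[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for name in ordered[i:j + 1]:
            result[name] = shared
        i = j + 1
    return result


def categorize(metrics, fraction=0.3):
    """metrics: {metric_name: {character: value}} for one network type."""
    per_metric = [ranks(values) for values in metrics.values()]
    names = list(per_metric[0])
    average = {n: sum(r[n] for r in per_metric) / len(per_metric) for n in names}
    ordered = sorted(names, key=average.get, reverse=True)
    cut = round(fraction * len(ordered))
    labels = {}
    for position, name in enumerate(ordered):
        if position < cut:
            labels[name] = "High"
        elif position >= len(ordered) - cut:
            labels[name] = "Low"
        else:
            labels[name] = "Medium"
    return labels


# Hypothetical metric values for ten placeholder characters.
METRICS = {
    "degree": dict(zip("ABCDEFGHIJ", [10, 9, 8, 7, 6, 5, 4, 3, 2, 1])),
    "betweenness": dict(zip("ABCDEFGHIJ", [9, 10, 8, 6, 7, 5, 4, 3, 1, 2])),
    "eigenvector": dict(zip("ABCDEFGHIJ", [10, 8, 9, 7, 6, 4, 5, 3, 2, 1])),
}

labels = categorize(METRICS)
for name in sorted(labels):
    print(name, labels[name])
```

Running the same function once on the interaction metrics and once on the mentioning metrics yields the two category labels per character that the synthesis table combines.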
Mentioning    Interaction    Type
High          High           Active Protagonist
High          Low            Passive Protagonist
Medium        High/Medium    Supporting
Medium        Low            Supporting/Redundant
Low           High/Medium    Connector/Enabler
Low           Low            Redundant

Enablers/drivers are more hidden than protagonists
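
The mapping from category pairs to character types can be written as a small lookup. This sketch assumes the table rows read as (High, High) Active Protagonist, (High, Low) Passive Protagonist, (Medium, High/Medium) Supporting, (Medium, Low) Supporting/Redundant, (Low, High/Medium) Connector/Enabler, and (Low, Low) Redundant. The combination (High, Medium) does not appear in the table, so the hypothetical character_type helper returns "Unclassified" for it.

```python
# Keys are (mentioning category, interaction category).
# Row assignment follows the reading of the table stated above.
TYPE_BY_CATEGORY = {
    ("High", "High"): "Active Protagonist",
    ("High", "Low"): "Passive Protagonist",
    ("Medium", "High"): "Supporting",
    ("Medium", "Medium"): "Supporting",
    ("Medium", "Low"): "Supporting/Redundant",
    ("Low", "High"): "Connector/Enabler",
    ("Low", "Medium"): "Connector/Enabler",
    ("Low", "Low"): "Redundant",
}


def character_type(mentioning, interaction):
    """Look up the type; combinations absent from the table are unclassified."""
    return TYPE_BY_CATEGORY.get((mentioning, interaction), "Unclassified")


print(character_type("High", "High"))
print(character_type("Low", "Medium"))
print(character_type("High", "Medium"))
```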
FUTURE DIRECTIONS
Discover parameters to uncover hidden influences
Quantify relations based on the ratio of nouns to
pronouns/epithets
[Figure: character plots for As You Like It and Much Ado About Nothing]
Observations. Protagonists are identified clearly. Most enablers are also identified
Important false negatives: Lady Macbeth and Shylock (rarely mentioned by name)
Multiple locations and longer time spans contribute to more diversity in types
[Figure: character plots for The Merchant of Venice, Macbeth, and Romeo and Juliet]
Acknowledgements
College of IS&T, University of Nebraska at Omaha
NSF-RET (Research Experience for Teachers)
Other datasets: movie scripts, to show change in
society over time
Real-world datasets: collaboration + citation.