Structural Analysis on Social Network Constructed from Characters

2442
JOURNAL OF COMPUTERS, VOL. 8, NO. 9, SEPTEMBER 2013
Structural Analysis on Social Network
Constructed from Characters in Literature Texts
Gyeong-Mi Park
BK21 Center for U-Port IT Research and Education, Pusan National University, Korea
[email protected]
Sung-Hwan Kim and Hwan-Gue Cho
Dept. of Computer Engineering Pusan National University, Korea
Email: { sunghwan, hgcho } @pusan.ac.kr
Abstract—Recently we witnessed that the social network
analysis focusing on social entities is applied in the social
science and web-science, behavioral sciences, as well as in
economics, marketing. In this paper we present one method
to construct and structure analysis the social network from
literary fictions by a simple lexical analysis. And we will
show that those literary social graphs, shows the power law
distribution of some features, which is the typical
characteristics of complex systems. We newly proposed the
concept of the kernel of literary social network by which we
can classify the abstract level of protagonists appeared in
fictions. And we also studied the connectivity of social
network based on statement distance distribution of
characters. Our study shows that the metric distance among
characters written in linear text is very similar to the
intrinsic and semantic relationship described by fiction
writers, which implies the proposed social network from
fictions could be another representation of literary fiction.
So we can apply other scientific and quantitative approach
by analyzing the concrete social graph model extracted from
textual data.
Index Terms—Social Network, Complex System, Literary
Fiction, Character Graph, Power law.
I.
INTRODUCTION
Extracting useful information from a large textual
repository is getting essential in data mining field. One
difficulty in this work is how to deal with the various
kinds of natural languages. Most work has based on
English based texts, but recently some other languages
have shown interesting result [9], [11].
Some previous work proposed the open problems on
the dynamic property of social network extracted from
text. In this view of new framework, we have tried to
investigate the temporal structure of social network, since
the literature fiction spans more than years or more than
one human generation such as “War and Peace” written
by Tosltoy.
Our basic idea is that we can regard the complicated text
Manuscript received October 30, 2012; revised November 22, 2012;
accepted November 25, 2012.
This work was supported by the National Research Foundation of
Korea Grant funded by the Korean Government (NRF-2010-371B00008). Corresponding should be addressed to Hwan-Gue Cho
([email protected]).
© 2013 ACADEMY PUBLISHER
doi:10.4304/jcp.8.9.2442-2447
(e.g. long literary fictions) as the typical complex system
such as human society and the huge ecosystem in Earth.
In novel lots of characters are interacting in the text space
which consists of a sequence of words. So if we compute
the distance" of two characters over the linear text, then
the new kinds of social relationship can be extracted. And
we can construct the social network from the distance
matrix of characters appeared in literatures. So we need
to characterize the structure of literary fictions in terms of
complex system.
By analyzing the social network extracted from text,
we can determine the importance of fiction characters by
a mechanical way of text analysis without any
complicated semantic analysis. This will help us to
understand the deep and common structure of literature
regardless of written languages. So mining the
topological structure of social graph from text will help
us to find other important features in a simple graph
analysis and can summarize the whole story or can
measure the complexity of literature fiction by measuring
the degree of interactions among protagonists or between
protagonists and other boundary persons. This implies the
proposed approach will greatly help to reveal the
semantic structure of fictions by constructing the concrete
social network and topological analysis of the graph.
In next section, we will survey the related work on
these topics. Section 3 will devoted to give some
definitions and preliminary concepts. Section 4 will show
the concrete algorithm for constructing the social network
from text. Also we should open how we prepared the
testing data (4 their-person novels) and how to preprocess
to make them fit in experiment. Experiment results are
shown in Section 6, where we can assure that these social
networks are quite similar to the general complex systems.
And finally the summarized conclusion and future open
problems will be exposed.
II.
RELATED WORK
A. Computational Analysis on Literature
It is general that the computer-based literary textual
analysis has typically performed at the word level [1].
This work has focused to discover authorial style and the
JOURNAL OF COMPUTERS, VOL. 8, NO. 9, SEPTEMBER 2013
lexical patterns of word use. This computational
linguistic work has contributed to validate or repute the
basic literature theories that the novel is a literary form
which tries to produce an accurate representation of the
social world. According to that theory, it is believed that
theories about the relation between novelistic form (the
workings of plot, characters, and dialogue, to take the
most basic categories) and changes to real-world social
space.
Recently researchers started to study property of cooccurrence words in natural language space [7], [8], [10],
[12], [13]. Since all words in a text are closed related
though they are arranged in the linear sequence. The cooccurrence pattern of some pair of words may reveal the
important features hidden in some texture [17]. One
interesting application of these co-occurrence analysis
gave the clue to identify spam or vicious replying
message [8], [10]. One application of word appearance
pattern is biomedical text mining whose goal is to find
the biologically and medically meaningful knowledge
only by analyzing the huge textual data(mainly academic
papers)[2], [16]. For example, hidden functions of genes
can be estimated or predicted by biomedical text mining
tools, which helps biologists to make a new drug fast and
efficiently.
B. Extracting Social Networks from Text
Typical social network studies based on textual data
focus on the structured data including email and Tweet
messages and phone SMS messaging texts for social
network construction [6], [8], [14]. The main goal of
these work are to find the most influential person. And
the propagating patterns of texts (messages) are the key
subjects of this work. And pure combinatorial studies has
been started to discover the frequency of some designated
words [1], [12]. One of typical work on this line is the
Zipf's law. In any natural language, The generic Zipf's
law states that given some corpus of natural language
utterances, the frequency of any word is inversely
proportional to its rank in the frequency table. That
implies the most frequent word appears approximately
twice as often as the second most frequent word, three
times as often as the third most frequent word.
Till now, little work has been done on extracting social
relations between individuals from text. One notable
work has published on constructing conversational
networks from literature [3], [10], [12], [16]. Elsan et al
proposed a method for extracting social networks from
nineteenth-century British novels [5]. Link two characters
are given if they are in conversation or not in the text. So
they successfully made the conversational network to
imply the closeness of protagonists. They tried to validate
some literature-related hypotheses. It means they replaced
the social network by conversational network to evaluate
literary theories. In their graph model, the named
characters are represented vertices represent characters
and edges are added to indicate dialogue interaction
among characters. They set the edge weight proportional
to the amount of interaction. The two main constraints of
conversational network are followings. 1. The characters
© 2013 ACADEMY PUBLISHER
2443
should be in the same place at the same time; 2. The
characters should speak turn by turn.
Their conclusion is twofold. First there is an inverse
correlation between the amount of dialogues in a novel
and the number of characters in that novel. Second a
significant difference of social interaction in the 19
century English novel is basically is geographical. One
difference of Elsan's approach and ours is that we do not
have interests in validating any literature theory. Another
interesting work should be noted. Hassan et al have
studied the positive and negative influential social
network by analyzing the sentiment words in literature
[1]. They have used a larger amount of online discussion
posts. Their experiments showed that their method easily
constructed networks from text with high accuracy,
which connects out analysis to social psychology theories
of signed network, namely the structural balance theory.
Our purpose is to give any engineering or
computational method to reveal the hidden structure of
literary fictions. And also we will show the social
network extracted from fictions is also one of complex
system where rich gets richer and poor gets poorer by
showing various statistics gathered in word analysis. The
previous work focused on the static structure of social
network such as degree distribution and connection form.
But this paper also studies how the structure of social
network varies along chapters by chapters.
III.
PRELIMINARY
Since every social network is characterized by entities
(nodes) and the distance function, it is important to define
the valid metric function. In our social network model
from literary text, the node(entity) is textual words
denoting the persons described in the corresponding text.
Then we need how to make them connected by a proper
metric(distance function). We will explain this in the
following.
Let T denote one literary fiction text. Each T consists
of a set of consecutive statements s . And each
statement s can be decomposed of a sequence of words
w , , where w , is the first word of statement s .
And finally all words w , consists of a sequence of letters
. |S | denotes the number of words(is the
,,
statement s . In a similar way let |T| denote the number
of statements in that text T. And we define statement rank
) = i and the word rank,
of word
, , staterank(
wordrank( , ) is define the as followings.
wordrank
,
| |
1,
1
By the preprocessing step, we already prepared the
name lists of all characters appeared in the fiction. In the
following we denote C the set of all characters in T.
Since C also appeared in text with different form, we
need to locate the wordrank().
Now we have to define the distance metric on
statements and words. dist X p, q denote the distance
2444
JOURNAL OF COMPUTERS, VOL. 8, NO. 9, SEPTEMBER 2013
between twoo entities p annd q over X meetric space, where
w
X could be the
t statement space or woord space. Leet me
show a few examples forr this distancce function. Inn the
following leet dist
w , , w . denoote the numbeer of
statement whhich lies in beetween two diifferent statem
ments
s and s . That is defined j i 1 if i
. That meaans if
two words loocate in a sam
me statement, then the stateement
distance betw
ween two words
w
is zero.. And we deefine.
|wordraank Y
dist
X, Y
worrdrank X |
As you exxpect, if two characters X and Y are cllosed
related in story,
s
then
,
would be short
compared to the pair of looosely related characters. So the
o two characcters are expected to depennd on
relatedness of
the average distance of
, alsoo the
,
o
,
w
where
is one
frequency of
threshold coonstant to dettermine if thee person is inn the
working narrrative space.
Let us givve one examplle to help the definitions abbove.
Here we havve a tiny text with
w 5 simple statements. Harry
H
appears two times at stattement 1, 2 which
w
are dennoted
, . And we denote , , for thhree Hedwig from
,
the first stateement. Then we see that
since
0,
,
16
3aand
,
there are 16 words
w
betweeen and
o
over
the wholee text
space. And it is easy to know that workdrad
w
4,
8.
wordrank
#
1
2
TABLE I.
STATEMENT
T DISTANCE SA
AMPLES
word sequence of eaach statement
Haarry walked acrosss Hedwig's cage..
Heedwig and Harry had
h been absent for
f two nights.
3
Duursley worried about Hedwig and Lily.
L
4
Duursley had pretendded for ten years..
5
Beccause Lily and Jaames Potter had not
n died.
IV.
CON
NSTRUCTING
G SOCIAL NE
ETWORK FR
ROM
T
TEXT
A. Construccting Social Neetwork from Literary
L
Text
Now we will
w explain hoow to extract the social netw
work
from literaryy fiction. Let T be a text too be analyzedd and
GT V, E be the
t corresponnding graph coonstructed froom T.
All words reepresenting chharacters are mapped
m
to verrtices
of G, V. Nexxt we decide how to makke each charaacters
(vertices) connnected by asssigning edge relation. We give
the edge connnection as (v
v , v ) if the weight
w
betweeen v
and v , weigght(v , v ), is leess than a threeshold constannt c ,
which is givven by user too meet the ow
wn purpose. Since
S
there are tw
wo distance measures
m
based or statemennt or
word entity, we applied dist
measure for weight
w
function in thhis paper, whiich was decided by comparrative
work againsst dist
. More formaal computatioon is
given. Sincee two charactters X and Y appears
a
more than
once adjacenntly, we find all adjacent pair of X andd Y in
text. The edgge weight of for two charracters X and Y is
given follow
wings.
© 2013 ACADEMY PUBLISHER
weight X, Y
,
2
,
,
, and 0
1, a controll
wheere k
consstant. In this experiment w
we set α 0..7. So if twoo
charracters are inn a statementt, then statem
ment distancee
shou
uld be zero(0)), so the weigght is 0.7. If the statementt
distaance betweenn two words((characters) iss 5, then thee
weig
ght should 0..7
0.16807 . If we set the controll
variaable α 0.9 then,
0
0.59049 . We can get thee
extreemal case if α 1, whichh disregards the effect off
distaant word in weight calculation
n. Next iff
weigght X, Y
then we ggive the con
nnecting edgee
betw
ween Y and Y . In implemenntation, we do
o not considerr
any pair of characcters which arre separated with
w more thann
10 statements.
To summarize, we can connstruct the so
ocial networkk
grap
ph by adjustinng two contrrol variables, , . If wee
increease
or α then we geet the more dense sociall
netw
work structuree and we willl get the sparrse network iff
we lower
l
c0. So generally
g
let uus
, , ,
representt
the social networrk extracted from the liteerature T withh
meters α, .
two control param
Fig
gure 1. Three Soocial Networks exxtracted from ficttions (a) Social
neetwork extracted from Crime and P
Punishment with the threshold
co
onstant ca = 23.56. (b) Another soocial network from
m \Crime and
Puniishment with morre a loose threshoold cb = 23.56.con
nstant compared
to case
c
(a). (c) We can
c get the nearlyy complete social network of the
ficction with a threshhold cb = 23.5
B. Kernel
K
Constrruction for Litterary Social Network
N
Since all fictioons consists oof a few protagonists andd
p
it is interesting to
o observe thee
otheer boundary persons,
topo
ological propeerty of each ccharacter according to thee
role importance in
i the story. S
So we propose the conceptt
n
kerneel which refleects the role importance
i
off
of network
literary text spacee. In the follow
wing we show
w one examplee
social network G__T (V,E,α,c_00 ).
Figu
ure 2. G_T (V,E
E,α,c_0 ), the origginal Social Netw
work Graph with
21 characters,
c
that iss Kernel_0 (G,t)
JOURNAL OF COMPUTERS, VOL. 8, NO. 9, SEPTEMBER 2013
We call the
t 1
whose graph
g
degree is 1. So theree are 9 1
whho can be considered some
s
, , , , , , , ,
boundary perrsons in fictioon since their interaction
i
is quite
low. Now we will exxplain how to constructt the
n
, constructively
c
y and
, of social networks
successively.. We set
, , ,
for base
condition. And
A we denotee (k+1)-th Kernel as
which was derived
d
from
, . The proceedure
is simple. We delete all veertices of
w
, . whose
e
to t. So
graph degree is less than or equal
, should be
b smaller thann
, .
When we geet
,
, , wee say
the kernel is stabilized. In the fol-lowing Figure, we show
s
,1 .
2445
In
n this paper we
w experimennt the connecttedness effectt
by controlling
c
thee influence region. We hav
ve check threee
more statement foorward in the previously deepicted sociall
ph, which means the influeence region is 3. Now wee
grap
will try to find the connecteedness by co
ontrolling thee
uence regionn 0,1, to 10. Ideally speaking,
s
alll
influ
charracters referredd in a fiction sshould be con
nnected unlesss
that fiction conssists of indeependent chap
pter such ass
nibus essay. Itt is very intereesting to find the minimum
m
omn
influ
uence region which guaranntees the conn
nectedness off
social graph extraacted from a teext.
n the followinng Table, we show the component sizee
In
with
h respect to the
t influence region d 0, 1, 2, 3, 4, 5 ,
wheere d impliees we only considered the co-occurrence
c
e
charracters in a single statemeent. Experimeent shows thee
miniimal region depends
d
on tthe languagess and writingg
stylee and genree. It would be interestin
ng topic thee
variaations on inffluence regionn for graph connectedness
c
s
and it implicationn.
TAB
BEL III THE NU
UMBER OF COM
MPONENTS BY STATEMENTS
S
DISTAN
NCE
Title
wing all degree 1 nodes(boundary
n
n
nodes)
Figure 3. Kernnel_1 (G,1) show
are removed from
f
Kernel_0 (G
G,t). The shaded nodes
n
{i,n,d} willl be
removed if we coonstruct Kernel_22 (G,1).
V.
EX
XPERIMENT
TS
A. Data Prep
eparation
Now we will
w explain how
h
to locate the word possition
of fiction chaaracters. Firstt of all we havve obtained thhe list
of characterss appeared inn example tessting fictions from
Wikipedia annd other interrnet sources. And it shoulld be
noted that thhere is an auuxiliary bookklet explaininng all
characters apppeared in Koorean novel "T
The Earth", where
w
more than 500 persons arre listed withh brief explannation
on the role and relationnship to otheer characters and
protagonists.
T
TABEL
II TEST DATA NOVEL
L TEXT
Title
Language Words
Three Kingdooms Korean
642,222
1,446,87
The Earth
Korean
9
War and Peacce
Harry Potter
Stattements Characteers
1221,779
1, 2885
1776,387
515
English
584,618
30,912
1339
English
1,279,37
5
1226,112
6555
B. Connecteedness of Social Graph accoording to
Influencee Region
Let we havve statemennts from a textt. As we explaained,
if a characteer
appears in
statem
ment, then we will
check other characters whhich appear inn next statem
ments.
ments of forrward statem
ments
The numbeer of statem
determines the
t graph connection. If we look aheead a
smaller stateement forwardd, the social connection
c
shhould
be sparse, evven may be disconnected.
d
L the numbber of
Let
statements too be looked forward
f
be thee influence reegion
for a characteer of .
© 2013 ACADEMY PUBLISHER
d=0
d=1
d=2
d=3
d=4
d=5
Waar and Peace
22
9
5
4
2
1
Haarry Potter
74
20
4
4
2
1
Th
he Earth
67
33
23
21
18
14
Th
hree
Kiingdoms
92
21
7
5
2
1
C. Connected
C
Coomponent Anaalysis on Edge Weight
We
W are interestted in the connnectedness off whole sociall
grap
ph with respecct to the edge w
weight. If we only considerr
the “strong" edgge, then wee would get the lots off
mponent sincee some strong
g local groupp
disconnected com
uld be conneccted without aany link edgees connectingg
wou
two different soccial groups. S
So by controllling the edgee
ght we can gett the locally reelated group. Also we havee
weig
to focus
fo
on the number
n
of coonnected com
mponents withh
respect to the edge weight conttrol.
According
A
too our experriment, the number off
com
mponents are inncreasing fastt initially with
h lower valuee
of cut
c weight , which enablles us to estim
mate the “thee
link characters” who
w connects two differentt local group..
We think this prooperty would bbe one importtant feature too
distiinguish the ploot style of anyy writers.
Figure 4. Thhe number of com
mponents by edgee weight
2446
JOURNAL OF COMPUTERS, VOL. 8, NO. 9, SEPTEMBER 2013
D. Variationn on Maximum
m Component Size and Its
implicatioon
connect
c
the different local ggroups automaatically.
Figure 5.
thee number of nodee maximum compponent node accorrding
edge weight
We definee the maxim
mum componeent, the conneected
component whose num
mber of noddes are maxximal
compared to other componnents. Generaally we believve the
mponent woulld contain the main protagoonists
maximal com
of the fictiion. We tried to identiify the maxximal
component of
o intermediatee social netwoork for a giveen cut
value. We found
fo
that thee maximal coomponent of some
s
fictions was not reduced with respect to the cut weeight.
mplies that thee core groupss of characterrs are
That fact im
highly relateed compared to the bounddary persons along
a
whole storyy. This variation of the size of maxximal
component would
w
show thhe interestingg characteristics of
novel plot.
VI.
B. Future
F
work and
a Open Probblems
Now
N
we are trying
t
to reveeal the dynam
mic of sociall
netw
work in transsition. For exxample thoug
gh the fictionn
"Thee Romance off Three Kingddoms" has mo
ore than 10000
charracters referred, the numberr of working characters
c
aree
boun
nded for a fiix size text w
window, whicch means thee
social network is so dynamic. IIf we measure the degree off
hat would bee
dynaamic developpment of ficttion story, th
interresting to classsify fictions. Or we hope to
o characterizee
the topological structure
s
of tyypical detectiive story andd
me story or fanntasy novel.
crim
C
CONCLUSIO
ON AND FUT
TURE WORK
A. Contribution Summaryy Experiment Summary
S
In this papper we considdered how to extract the social
s
network from
m literary texxt. We can conclude thaat the
simple lexiccal structure of literary such charaacters
appearance clearly
c
isomorphic to the semantic struucture
of the fictionn. Another intteresting fact we revealed is
i the
topological structure
s
of soocial network preserves
p
the most
important annd crucial common
c
feattures of com
mplex
system such as the generral social (hum
man) networkk and
p
and viruuses.
the huge biollogical networrks including proteins
• Structuraal analysis onn Social grapph extracted from
literary fiiction gives innteresting factts which help us to
characterrize the semaantic structuree of fictions.. For
example the variatioon on the maximal
m
size of
component would cleaarly shows thhe strength off core
o
whole stoory.
group of protagonists over
work shows quite
• The extrracted literaryy social netw
similar structure
s
in terms
t
of lexxical level too the
narrative structure of fictions
fi
intrinssically.
s
• Our expeeriment clearlyy showed that this literary social
network is
i one kind off complex systtem where moost of
properties are under poower law.
• Also thee number off connected components with
respect too the cut valuue for edge weight
w
impliess that
the sociaal relationshipp among mainn characters is so
higher thhan those off boundary persons.
p
Andd the
connectedd component will help uss to find the local
social grooup and the inntermediate linnking personss who
© 2013 ACADEMY PUBLISHER
REFEREN
NCES
[1] A. Hassan, A.
A Abu-Jbara and D. Radeev. “Extractingg
Signed Sociall Networks Frrom Text,” in
n Proc. of thee
TextGraphs Workshop
W
at ACL
L, 2012, pp.4-12.
[2] S.-Y. Bong annd K.-B. Hwanng. “Keyphrasee extraction inn
biomedical puublications usinng mesh and intraphrase wordd
co-occurrence information,” iin Proc. of the ACM
A
DTMBIO
O,
2011, pp.63-666.
“
myth aas a complex network,” Thee
[3] Y.-M. Choi. “Greek
Korean Physiccal Society, vol.. 49, no. 3 pp. 298-302,
2
2004.
L and P. C. Fernando, “Siimilarity-Basedd
[4] I. Dagan, L. Lee
Models of Word
W
Cooccurreence Probabilitties,” Machinee
Learning, vol. 34 no.1-3, pp. 43-69, 1999.
[5] D. K. Elsonn, D. Nicholaas, and K. R.
R McKeown,,
“Extracting Soocial Networks from Literarry Fiction,” inn
Proc. of the Association
A
for Computation
nal Linguistics,,
2012, pp. 141--147.
[6] C. L. Freemaan, The Deveelopment of Social
S
Networkk
Analysis: A Sttudy in the Socciology of Scieence, Empiricall
Press, 2004.
Y
andd T. Ito, “Filtering harmfull
[7] Y. Fujii, T. Yoshimura
sentences baseed on three-worrd co-occurrencce,” in Proc. off
Collaboration,, Electronic meessaging Anti-Abuse and Spam
m,
2011, pp. 64-772.
Kahng and D.Kiim, “Modellingg
[8] D.-H. Kim, G.. Rodgers, B.K
hierarchical annd modular com
mplex network
ks: division andd
independence,”” Physica A: SStatistical Mecchanics and itss
Applications, vol.
v 351, no.2-44, pp. 671-679, 2005.
[9] S.-R. Kim, “Coomplex networrk analysis in litterature: Togi,””
The Korean Physical
P
Societyy, vol. 50, no. 4,
4 pp. 267-271,,
2004.
nce of tags andd
[10] R. Krestel andd L. Chen, “Ussing co-occuren
resources to identify
i
spamm
mers,” in Procc. of Machinee
Learning andd Principles aand Practice of Knowledgee
Discovery, 20008, pp. 38-45.
[11] Y.-K. Lee, J.. Ku and H. Kim. “Analyssis of networkk
dynamics from
m the romance of the three kingdoms,”
k
Thee
Journal of Koorea Contents Association, vol.
v
9, No. 4,,
pp.364-371, 20009.
[12] X. Luo and A.
A Zincir-Heywoood. “Combining word basedd
and word co-ooccurrence baseed sequence an
nalysis for textt
categorization,,” in Proc. off the Machine Learning andd
Cybernetics, vol. 3, 2004, pp. 1580-1585.
[13] C. Monojit, C.. Diptesh and M
M. Animesh, “G
Global topologyy
of word co-occcurrence netw
works: beyond the
t two-regimee
power-law,” in
i Proc. of Innt. Conf. on Computationall
Linguistics, 20010, pp. 162-1700
[14] R. Lambiotte.., M. Auslooss and M. Thelwall. “Wordd
statistics in Blogs
B
and RSS feeds: Tow
wards empiricall
JOURNAL OF COMPUTERS, VOL. 8, NO. 9, SEPTEMBER 2013
universall evidence,” Jouurnal of Inform
metrics, vol. 1, no.
n 4,
pp.277-286, 2007.
[15] J. Rydbeerg-Cox. “Sociaal Networks annd the Languaage of
Greek Trragedy,” Journnal of the Chiccago Colloquiuum on
Digital Humanities
H
and Computer Scieence, vol. 1, no. 3, pp.
1-11, 20111.
[16] J. Stillerr, D. Nettle andd R. I. Dunbar, “The small world of
shakespeare's plays,” Huuman Nature, vol.
v 14, no. 4 ppp.397408, 20033.
[17] Y. Fujii, T. Yoshimuura and T. Ito.. “Filtering haarmful
sentencess based on three-word co-occuurrence,” in Proc. of
the 8th Annual
A
Collabooration Electronic messaging AntiAbuse annd Spam Conferrence, 2011, pp. 64-72.
Mi Park is at BK21
B
Center for
f UGyeong-M
Port IT
T Research annd Education. She
receivedd the B.S. degrree from the Korean
K
Nationall Open University, Korea, annd the
M.S. annd Ph.D. degreees from Pukyyoung
© 2013 ACADEMY PUBLISHER
2447
Natio
onal Universityy, Korea. Her reesearch interestts are computerr
visio
on and informattion retrieval.
Sung-Hwan Kim is a M.S. sttudent in Pusann
Naational Univerrsity. He receeived the B.S..
deegree from Pussan National University.
U
Hiss
research interestss are informatio
on retrieval andd
seequence processsing
Hwan-Gue Ch
H
ho is a Professor in Pusann
N
National
Univerrsity. He receeived the B.S..
deegree from Seooul National Un
niversity, Koreaa,
annd the M.S. annd Ph.D. degreees from Koreaa
A
Advanced
Insstitute of Science andd
Technology, Korrea. His researcch interests aree
coomputer algorrithms and bioinformatics..