Retrieval approach to extract opinions about people from resource scarce language News articles

Aditya Mogadala, Vasudeva Varma
Search and Information Extraction Lab, IIIT-H, Hyderabad, India
[email protected], [email protected]

ABSTRACT
We address the challenging task of mining opinions about organizations, people and places from news articles in different languages, where resources and tools for opinion mining are known to be scarce. In our study, we leverage a comparable news article collection to retrieve opinions about people (opinion targets) in a resource scarce language like Hindi. Opinions expressed about opinion targets (named entities), given by adjectives and verbs known as opinion words, are extracted from the English side of the comparable corpora and then transliterated and translated into the resource scarce language. The transformed opinion words are used to create a subjective language model (SLM) and structured opinion queries (OQs) under an inference network (IN) framework, whose retrieval confirms the opinion about opinion targets in documents. Experiments show that OQs and SLM with the IN framework are effective for opinion mining in minimal resource languages when compared to other retrieval approaches.

Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: Information Search and Retrieval

General Terms
Languages, Experimentation

Keywords
Opinion Mining, Subjectivity Analysis, Resource Scarce Languages

1. INTRODUCTION
News articles are written in different languages and contain different entities such as places, people and organizations. The factual information provided about these entities (mainly people) is sometimes mixed with the support or expressions of the writer, which indirectly induces opinion about people in the article. Opinions can be present at various granularities such as a word, a sentence or a whole text. Although each granularity is important, we focus our attention on word-level opinion detection. For example, Figure 1 highlights people (opinion targets) in green and opinions expressed about them in red at word level in English and Hindi news articles. Mining opinions about people in news articles [1] helps us distinguish factual information from the writer's view of them, and helps us understand the bias of news agencies or writers towards them. Opinions about people mentioned in the articles of different news agencies can be compared to assess the authenticity of information, and opinions about people can be tracked over different timelines. Among the different problems observed in news articles, one that caught our attention is that they contain unclear or easily misinterpreted opinions. Opinions are generally expressed in subjective text, and there is evidence from previous work [5] that 44% of sentences in a news collection are subjective.
Hence, separating subjective text from objective text at the sentence level will help in mining opinions. However, most techniques [18] that identify subjectivity at the sentence level are supervised and require a lot of labeled data, which is a problem for languages other than English with fewer linguistic resources. Also, identifying people (opinion targets) and opinions about them requires state-of-the-art word identification tools, but resource scarce languages lack high precision natural language processing tools such as named entity recognizers, part of speech (POS) taggers, dependency parsers and semantic role labeling tools. So, the major problems this paper tries to address are: (1) Can opinion extraction about people in Hindi news articles be achieved using minimal language specific resources and tools? (2) Can we leverage a resource rich language like English using a comparable corpora? (3) Can we do without word identification tools in resource scarce languages to identify the named entities, adjectives and verbs [14] used in subjective text? (4) Can we design an approach that is less dependent on language specific tools and is easily scalable to different languages? (5) Can we build a reusable dataset to facilitate follow-up research?

The remainder of the paper is organized as follows. Section 2 describes related work. Section 3 describes our approach and Section 4 explains query formulation. We present our experimental setup and results in Section 5 and Section 6 respectively, discuss the results in Section 7, and conclude with a discussion of future work in Section 8.

Figure 1: Example of sentences with Opinions and Opinion Targets in Comparable news Corpora

2. RELATED WORK
Opinion mining can be broadly categorized into classification and retrieval methods.

2.1 Classification Methods
One approach extracts opinion holders and relevant opinions using parsing and a Maximum Entropy ranker based on parse features [12]. The problem with this approach is that it is not easily portable to resource scarce languages due to low quality parsers. A second approach [13] collected opinion word labels manually and used them to find opinion-related frames (FrameNet), which are then semantic role labeled to identify fillers for the frames. We cannot apply this approach either, due to the non-availability of semantic role labeling tools in resource scarce languages.

2.2 Retrieval Methods
Traditional opinion retrieval systems in TREC [21, 7], tried on large blog collections, extract opinions about a query. These systems are designed to retrieve opinions about a single entity, i.e. the query. Using such systems on news articles might fail because of the presence of multiple entities with different opinions expressed about them. Other approaches [11] used opinion weights and the proximity of terms to the query directly, considering proximity-based opinion density functions to capture proximity information. But these approaches may not be effective in retrieving information from resource scarce language documents due to their dependency on language specific features. Some systems developed for Romanian [2], Chinese [19, 22] and Japanese [20, 4] concentrate only on the identification of subjective text. Other multilingual systems that participated in the NTCIR-6 [8] opinion extraction track were dataset specific, and their approaches cannot be easily ported to other languages.
This shows the need for new and better approaches to opinion mining in resource scarce languages like Hindi that overcome the above issues.

3. APPROACH
To achieve opinion extraction about opinion targets, a retrieval approach is chosen to overcome the problems caused by the limited availability of state-of-the-art NLP tools and resources. We leverage a resource rich language like English to extract opinions about people from a resource scarce language like Hindi. Initially, named entities, adjectives and verbs are extracted from the English collection using named entity recognition tools and POS taggers. They are then transliterated and translated into other languages using a conditional random field approach [15] and bilingual dictionaries respectively. Next, the cross-language ported words are used to identify subjective sentences, to create a subjective language model (SLM) and to form opinion queries in Hindi. An inference network (IN) framework, which supports proximity and belief between query words, is then used to create structured opinion queries (OQs) with the SLM, similar to the combination of a language model (LM) with an IN [3], for retrieval of documents that confirms the presence of opinions about opinion targets. The following sections describe subjective sentence extraction, the subjective language model (SLM) and the extension of the SLM with an IN for opinion retrieval in detail.

3.1 Subjective sentence extraction
Subjective sentences are selected in each document using two approaches motivated by [10], except that we also consider named entities in order to find opinion targets. Documents without subjective sentences are eliminated.

Method 1: If two or more strong subjective expressions, namely named entities, adjectives or verbs, occur in the same sentence, the sentence is labeled strongly subjective.

Method 2: If at least one named entity, adjective or verb exists in a sentence, it is labeled as a weakly subjective sentence.

The difference between the performance of Method 1 and Method 2 can be observed in the experiments when OQs are used for retrieval.
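As an illustration, here is a minimal Python sketch of the two labeling rules. It assumes the cross-language ported named entities, adjectives and verbs are already available as sets of Hindi surface forms; the whitespace tokenizer and the function names are our placeholders, not the tooling used in the paper.

```python
def count_subjective_clues(sentence, named_entities, adjectives, verbs):
    """Count ported NE/adjective/verb clues occurring in a sentence."""
    tokens = sentence.split()  # placeholder tokenizer
    clues = named_entities | adjectives | verbs
    return sum(1 for token in tokens if token in clues)

def label_sentence(sentence, named_entities, adjectives, verbs):
    """Method 1 keeps sentences with >= 2 clues (strongly subjective);
    Method 2 keeps sentences with >= 1 clue (weakly subjective)."""
    n = count_subjective_clues(sentence, named_entities, adjectives, verbs)
    if n >= 2:
        return "strong"    # kept by both Method 1 and Method 2
    if n == 1:
        return "weak"      # kept only by Method 2
    return "objective"     # discarded by both methods
```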
3.2 Subjective Language Model for Opinion Retrieval
A subjective language model (SLM) is created for resource scarce language documents, similar to a language model (LM) [6], by selecting subjective sentences in the documents using the two methods mentioned earlier. The SLM approach to opinion retrieval finds the probability of an opinion query oq being generated by a probabilistic model based on a document D, denoted p(oq|D). This is done by estimating the posterior probability of document D_i given the opinion query oq using Bayes' formula, as in Equation 1:

p(D_i | oq) ∝ p(oq | D_i) p(D_i)    (1)

where p(D_i) is the prior belief that D_i is relevant to any opinion query, and p(oq|D_i) is the query likelihood given the document D_i, which captures the information of the particular opinion query oq. The document model is assumed to be a multinomial distribution; this assumption helps in choosing suitable smoothing techniques, as discussed later. For each document D_i in the collection c, its subjective language model defines the probability p(ow_1, ow_2, ..., ow_n | D_i) of the sequence of n query terms ow_1, ..., ow_n containing opinions and opinion targets. Documents are ranked according to this probability; the probabilities of the opinion words ow_i in document D_i raise the subjectivity weight of the document.

Equation 2 gives the probability p(ow_i|c) of finding opinion word ow_i in the entire collection c, while Equation 3 gives the probability p(ow_i|D) for a single document D:

p(ow_i | c) = cfreq(ow_i, c) / Σ_{i=1}^{n} cfreq(ow_i, c)    (2)

p(ow_i | D) = tfreq(ow_i, D) / Σ_{i=1}^{m} tfreq(ow_i, D)    (3)

Here, cfreq(ow_i, c) is the collection frequency of ow_i in the collection c, tfreq(ow_i, D) is the term frequency of ow_i in document D, n is the total number of opinion words in the collection, and m is the total number of opinion words in a document. The unsmoothed model gives the maximum likelihood estimate from relative counts, but any word unseen in the document receives zero probability. Smoothing assigns a non-zero probability to unseen words and, in general, improves the accuracy of word probability estimation. In this paper, we use Dirichlet and Jelinek-Mercer smoothing, as described in [23], to assign non-zero probabilities to words unseen in the documents and collection.

Dirichlet Smoothing. Since the subjective language model is assumed to be multinomial, whose conjugate prior for Bayesian analysis is the Dirichlet distribution, we choose the Dirichlet parameter µ and the values added to the numerator of Equation 3 for smoothing. Equation 4 gives these values, the Dirichlet parameter multiplied by the probability of each opinion word in the collection. After smoothing, Equation 3 becomes Equation 5:

µ p(ow_1 | c), µ p(ow_2 | c), ..., µ p(ow_n | c)    (4)

p_µ(ow_i | D) = (tfreq(ow_i, D) + µ p(ow_i | c)) / (Σ_{i=1}^{m} tfreq(ow_i, D) + µ)    (5)

Jelinek-Mercer Smoothing. In the Jelinek-Mercer approach, we take a mixture of the document model p(ow_i|D) and the collection model p(ow_i|c), as used in standard retrieval models, with a parameter λ that needs to be estimated. Equation 6 gives the mixture model:

p_λ(ow_i | D) = (1 − λ) p(ow_i | D) + λ p(ow_i | c)    (6)

The subjective language model with Method 1 and Method 2 forms our extended baseline over the basic language model. We further extend this model with an inference network, which allows different types of structured query formulation, as explained below.
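As a concrete illustration of Equations 2 to 6, below is a minimal Python sketch of the smoothed query likelihood ranking. The toy documents, the romanized Hindi-like tokens and the parameter values (µ = 2000, λ = 0.4) are illustrative assumptions, not the paper's data or tuned settings.

```python
def collection_prob(word, docs):
    """p(ow_i | c): relative frequency of the word over the whole collection (Eq. 2)."""
    total_tokens = sum(len(d) for d in docs)
    return sum(d.count(word) for d in docs) / total_tokens

def dirichlet_prob(word, doc, docs, mu=2000):
    """p_mu(ow_i | D): Dirichlet-smoothed document probability (Eq. 5)."""
    return (doc.count(word) + mu * collection_prob(word, docs)) / (len(doc) + mu)

def jm_prob(word, doc, docs, lam=0.4):
    """p_lambda(ow_i | D): Jelinek-Mercer mixture of document and collection models (Eq. 6)."""
    return (1 - lam) * doc.count(word) / len(doc) + lam * collection_prob(word, docs)

def slm_score(opinion_query, doc, docs, prob=dirichlet_prob):
    """Query likelihood p(oq | D): product of smoothed opinion word probabilities."""
    score = 1.0
    for ow in opinion_query:
        score *= prob(ow, doc, docs)
    return score

# Toy ranking: each document is the token list of its subjective sentences only.
docs = [["neta", "imaandaar", "hai"], ["neta", "vivaadit", "hai", "aaj"]]
oq = ["neta", "imaandaar"]  # (opinion target, opinion word) pair
ranked = sorted(docs, key=lambda d: slm_score(oq, d, docs), reverse=True)
```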
3.3 Subjective Language Model with Inference Network for Opinion Retrieval
This retrieval model combines the SLM with an IN [17] to confirm the presence of opinions about opinion targets. It allows opinion queries containing possible opinions on opinion targets to use proximity and belief information between query terms, similar to Indri [16]. In the IN framework, documents are ranked according to the probability p(I|D, α, β), where the belief for the information need I is calculated using a document D and hyperparameters α and β as evidence. The information need I in our scenario is simply a belief node that combines all the belief nodes ow_i containing opinion evidence within the inference network into a single belief. Here, the belief nodes ow_i are opinion words in the document. To obtain evidence for the ow_i, representation concept nodes ro_i are used. The ro_i are binary random variables representing only the opinion word unigrams out of the total features extracted in the document representation, where the features are all word unigrams w_1, ..., w_n present in a document.

Figure 2: SLM Inference Network Representation

To find the individual belief nodes ow_i, we need to calculate p(ro_i|D) and then combine the ow_i to get the information need I; Figure 2 shows the representation. Documents are then ranked using p(I|D, α, β). To achieve this, we assume each subjective document D_s is drawn from a multiple-Bernoulli subjective model θ_s, rather than the multinomial distribution assumed in the previous section, because this model assumption makes the concept nodes ro_i independent. First, we compute the model posterior p(θ_s|D_s), given by Equation 7:

p(θ_s | D_s) = p(D_s | θ_s) p(θ_s) / ∫_{θ_s} p(D_s | θ_s) p(θ_s) dθ_s    (7)

where p(D_s|θ_s) is the likelihood of generating D_s from model θ_s and p(θ_s) is the model prior. The posterior p(θ_s|D_s) is distributed according to multiple-Beta(α_ro, β_ro), because the Beta distribution is the conjugate prior of the Bernoulli. In this IN framework only the opinion terms, denoted optf_ro out of the total terms in a document D_s, are considered for the independent ro_i. This is like finding x positive results in n trials. Thus the posterior becomes multiple-Beta(α_ro + optf_ro, β_ro + |D_s| − optf_ro), where |D_s| is the total opinion word count of the document D_s. Equation 8 gives the estimate of p(ro|D_s) representing the individual beliefs ow_i; it is simply an expectation over the posterior p(θ_s|D_s):

p(ro | D_s) = ∫ p(ro | θ_s) p(θ_s | D_s) dθ_s    (8)

Since the expectation of a Beta distribution in terms of its parameters is α/(α + β), and the posterior is distributed according to multiple-Beta(α_ro + optf_ro, β_ro + |D_s| − optf_ro), Equation 8 reduces to Equation 9:

p(ro | D_s) = (optf_{ro,D_s} + α_ro) / (|D_s| + α_ro + β_ro)    (9)

where, for document D_s, optf_{ro,D_s} is the opinion term frequency with the ro_i as features. Thus the subjective language model θ_s is estimated from the hyperparameters α_ro and β_ro combined with the observed document D_s. From these models, concept nodes q_i are used to form an opinion query, and the overall belief I from this opinion query is used to rank subjective documents. Since zero probabilities and data sparseness can still occur in the model, we employ smoothing.

Dirichlet Smoothing. To handle zero probabilities, α_ro and β_ro are chosen as in Equations 10 and 11, modifying Equation 9 into Equation 12. Here p(ro|c) gives the beliefs of the representation concept nodes over the entire document collection:

α_ro = µ p(ro | c)    (10)

β_ro = µ (1 − p(ro | c))    (11)

p(ro | D_s) = (optf_{ro,D_s} + µ p(ro | c)) / (|D_s| + µ)    (12)

where µ is a free parameter and optf_{ro,D_s} is the opinion term frequency for feature ro in document D_s.

Jelinek-Mercer Smoothing. This method linearly interpolates the maximum likelihood model with the collection model using a coefficient λ:

p(ro | D_s) = (1 − λ) optf_{ro,D_s} / |D_s| + λ p(ro | c)    (13)

Thus, this model combines the SLM and IN with different smoothing techniques for opinion retrieval. The queries formed for retrieval are described in the next section.
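The belief estimate of Equation 12 and a simple belief combination can be sketched in a few lines of Python. This is a minimal illustration, assuming documents are token lists; the geometric-mean combination stands in for Indri's #combine operator, and document length approximates |D_s|, so treat both as assumptions rather than the paper's exact implementation.

```python
import math

def belief(ro, doc_tokens, collection_docs, mu=1500):
    """Dirichlet-smoothed belief p(ro | D_s) of Eq. 12 for one concept node.
    len(doc_tokens) stands in for |D_s| here."""
    optf = doc_tokens.count(ro)  # opinion term frequency optf_{ro,D_s}
    coll_tokens = sum(len(d) for d in collection_docs)
    p_ro_c = sum(d.count(ro) for d in collection_docs) / coll_tokens  # p(ro | c)
    return (optf + mu * p_ro_c) / (len(doc_tokens) + mu)

def information_need(opinion_words, doc_tokens, collection_docs):
    """Combine the node beliefs into the single belief I via a geometric mean,
    in the spirit of Indri's #combine operator."""
    logs = [math.log(max(belief(ow, doc_tokens, collection_docs), 1e-12))
            for ow in opinion_words]
    return math.exp(sum(logs) / len(logs))
```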
4. QUERY FORMULATION FOR RETRIEVAL
We saw in the previous section that the belief nodes ow_i combine evidence from the concept nodes ro_i to estimate the belief that opinion words are expressed in a document. The actual arrangement of the ow_i and the way they combine evidence, however, is dictated by the user through the query formulation. So we form structured opinion queries (OQs) to identify opinion targets and opinions in resource scarce language news articles. These structured OQs contain named entities, adjectives and verbs as opinion words, and use proximity and beliefs between query words. Below we contrast the formulation of opinion queries that use proximity and belief between query words with those that do not.

4.1 Opinion queries without proximity and belief
Query terms in the OQ are treated as a bag of words: each word in the opinion query is assumed independent, without any conditional information. This query model may find documents relevant to the query terms, but may not extract opinions about opinion targets. To rank documents, the opinion query likelihood is calculated using the subjective language model θ_s, given by Equation 14:

p(oq_1, oq_2 | θ_s) = Π_{i=1}^{2} p(oq_i | θ_s)    (14)

where oq_1 and oq_2 represent the opinion target and the opinion expressed on it, respectively.

4.2 Opinion queries with proximity and belief
The Indri query language (http://www.lemurproject.org/lemur/IndriQueryLanguage.php) is used to form OQs. In this approach, only those query operators are selected that use proximity and belief information in their representations. Proximity representations are used to match opinion targets and the opinions expressed on them appearing within an ordered or unordered fixed-length window of words, and a belief operator is used to combine the beliefs of the opinion words of an OQ, represented as belief nodes of subjective documents. The size of the proximity window is chosen intuitively. Each query represented in this framework uses SLM with IN for opinion retrieval. The OQs used for retrieval are listed in Table 1 and explained in Table 2.

Table 1: Opinion Queries
Label  Opinion Query
UQ     NE OExp
SQ1    #10(NE OExp)
SQ2    #filreq(NE #combine(NE OExp))
SQ3    #filreq(NE #uw10(NE OExp))
SQ4    #filreq(NE #weight(2.0 #uw10(NE OExp)))
SQ5    #uw10(NE OExp)
SQ6    #uw5(NE OExp)

Table 2: Explanation of Opinion Queries containing Opinion Target (OT) and Opinion Expressed (OExp)
Label  Explanation
UQ     Unstructured query containing OT (named entity) and opinion expressed (OExp).
SQ1    Match OT and OExp as ordered text within a window of 10 words.
SQ2    Evaluate the combined beliefs of OT and OExp, given OT in the document.
SQ3    Match OT and OExp as unordered text within a window of 10 words, given OT in the document.
SQ4    Match OT and OExp as unordered text within a window of 10 words with an extra weight of 2.0, given OT in the document.
SQ5    Match OT and OExp as unordered text within a window of 10 words.
SQ6    Match OT and OExp as unordered text within a window of 5 words.
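To make the templates of Table 1 concrete, here is a minimal Python sketch that instantiates them for one (OT, OExp) pair. The function and the placeholder pair are illustrative; in the paper, every ported Hindi named entity is paired with every ported opinion word (cf. Equation 18 later).

```python
# Templates mirroring Table 1; {ne} and {oexp} are filled in per pair.
OQ_TEMPLATES = {
    "UQ":  "{ne} {oexp}",
    "SQ1": "#10({ne} {oexp})",
    "SQ2": "#filreq({ne} #combine({ne} {oexp}))",
    "SQ3": "#filreq({ne} #uw10({ne} {oexp}))",
    "SQ4": "#filreq({ne} #weight(2.0 #uw10({ne} {oexp})))",
    "SQ5": "#uw10({ne} {oexp})",
    "SQ6": "#uw5({ne} {oexp})",
}

def build_opinion_queries(named_entity, opinion_word):
    """Instantiate all structured and unstructured OQs for one (OT, OExp) pair."""
    return {label: tpl.format(ne=named_entity, oexp=opinion_word)
            for label, tpl in OQ_TEMPLATES.items()}

# Placeholder pair; real queries would use Hindi surface forms.
queries = build_opinion_queries("NE_hindi", "OExp_hindi")
```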
5. EXPERIMENTAL SETUP
Information about the collection and the evaluation metrics used for assessment is given below.

Collection. Our experiments are based on 5 different topics selected from the FIRE 2010 collection (http://www.isical.ac.in/~fire/2010/index.html), a comparable corpora consisting of news articles in English and Hindi covering areas such as sports, politics, business and entertainment. Relevance judgments are provided for the topics; the Hindi and English relevance judgments are used to select the relevant documents for a topic. Table 3 shows the topics used for the experiments.

Table 3: Topics used for Experiments
Topic       T1  T2  T3  T4  T5
FIRE topic  78  80  85  90  91

For each topic, the Hindi documents are used to create a gold standard dataset (to be made available on request) of subjective sentences and tuples of opinion targets and opinions, while the relevant documents in English are used for extraction of NEs, adjectives and verbs. The gold standard dataset was prepared by two annotators who identified the subjective sentences and each opinion word with its target, where present in a document. The inter-annotator agreement in identifying subjective sentences and (opinion, opinion target) tuples, averaged over the 5 topics, is given in Table 4.

Table 4: Inter-Annotator Agreement
Task                         Observed agreement  kappa
Subjective sentences         86.5%               0.689
(Opinions, Opinion targets)  73.7%               0.493

Preprocessing on the Collection. English data from the FIRE 2010 collection is used to extract NEs, adjectives and verbs. Stanford NER (http://nlp.stanford.edu/ner/index.shtml) and the Stanford POS Tagger (http://nlp.stanford.edu/software/tagger.shtml) are used to extract the NEs and the adjectives and verbs respectively. A manually prepared word-aligned bilingual corpus and statistics over the alignments are used to transliterate each English word to its top 3 Hindi candidates, using Hidden Markov Model (HMM) alignment and Conditional Random Fields (CRFs): GIZA++ (http://www.fjoch.com/GIZA++.html) is used for the HMM alignment, while CRF++ (http://crfpp.sourceforge.net/) is used to train the model. For translation of adjectives and verbs, the English-Hindi dictionary Shabdanjali (http://www.shabdkosh.com/archives/content/) is used. Before translation, the adjectives and verbs are expanded with WordNet (http://wordnet.princeton.edu/). Table 6 shows the translation coverage and transliteration error, and Table 5 shows the word distribution of the translated opinions [9], both averaged over the 5 topics.

Table 5: Word Distribution of translated Opinions
           Positive  Negative  Neutral  Total
Adjective  151       143       577      871

Table 6: English to Hindi Document Analysis
Translation Coverage (ADJ) (After Exp)  63.9%
Translation Coverage (VB) (After Exp)   65.3%
Transliteration Error                   13%

Evaluation. Relevant documents are retrieved for the OQs created for each topic, but the retrieved documents may not express an opinion about the opinion targets present in the OQs. Evaluation is therefore done using recall (R_oq), precision (P_oq) and F-measure (F_oq) for each OQ used for document retrieval, to confirm whether the OQ terms express an opinion about opinion targets in the retrieved documents. Equations 15 to 17 give the metrics. A similar evaluation is done for subjective sentences using precision (P_s), recall (R_s) and F-measure (F_s).

R_oq = Retrieved OQs in Document / Total OQs present in Document    (15)

P_oq = Relevant OQs in Document / Retrieved OQs in Document    (16)

F_oq = 2 P_oq R_oq / (P_oq + R_oq)    (17)

We also calculated mean average precision (MAP) scores to see whether the relevant documents are ranked first for the corresponding OQs. If the MAP scores are high for a model and its corresponding OQ, the model and query are efficient at retrieving the most relevant documents first.
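For concreteness, the following is a minimal Python sketch of the per-document metrics of Equations 15 to 17; the function and variable names are ours, not the paper's.

```python
def oq_metrics(retrieved_oqs, relevant_oqs, total_oqs):
    """Per-document P_oq, R_oq and F_oq of Equations 15-17, from OQ counts."""
    recall = retrieved_oqs / total_oqs if total_oqs else 0.0
    precision = relevant_oqs / retrieved_oqs if retrieved_oqs else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Example: 4 OQs retrieved for a document out of 6 present, 3 of them relevant.
p, r, f = oq_metrics(retrieved_oqs=4, relevant_oqs=3, total_oqs=6)
```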
6. EXPERIMENTS
In this section we evaluate subjective sentence extraction, identification of opinions about opinion targets, and ranking of relevant documents using the proposed approach on the gold standard dataset.

6.1 Detecting Subjective Sentences
Subjective sentences are identified using the two methods described in Section 3. To analyze the accuracy of the proposed methods, P_s, R_s and F_s are calculated; Table 7 shows the average scores over the 5 topics. It can be observed that Method 2 has 84.1% higher recall but 7.3% lower precision than Method 1. For opinion mining we feel precision matters more in order to obtain accurate and efficient results; the accuracy of the opinions retrieved using these two methods is analyzed in the next section.

Table 7: Subjective Sentence Accuracy (averaged over T1-T5)
       Method 1 (M1)  Method 2 (M2)
P_s    0.573          0.534
R_s    0.543          1.000
F_s    0.557          0.696

We also compared our approach with classification methods using 10-fold cross validation on the human annotated sentences, with unigrams as features. Table 8 shows the cross validation results of the learning methods. We observe that Method 2 achieves 2.3% higher F-measure than Naive Bayes, but 4.8% lower than SVM and 2.0% lower than decision trees. So Method 2 does not fare as well on F-measure, but achieves good precision compared to the learning methods. Since our approaches are unsupervised and achieve decent accuracy in identifying subjective sentences compared to supervised learning approaches, we feel they can show significant results in identifying subjective sentences in resource scarce languages.

Table 8: Subjective Sentence Classification (10-fold cross validation, averaged over T1-T5)
       Naive Bayes  SVM   Decision Tree
P_s    0.71         0.73  0.69
R_s    0.66         0.73  0.73
F_s    0.68         0.73  0.71

6.2 Detecting Opinions about Opinion Bearers
In our retrieval approach, query terms are used to confirm their presence in the document. The SLM is created from sentences obtained using Method 1 (M1) and Method 2 (M2), and is then extended with the IN to form OQs for retrieval. Our approach is compared with other standard retrieval approaches, namely LM based retrieval and LM with IN retrieval, using the Lemur toolkit (http://www.lemurproject.org/). Each topic can produce as many OQs from the English collection as given by Equations 18 and 19; only those queries which retrieved documents are considered for evaluation.

Total OQs = Total Hindi NEs × OW    (18)

OW = (Total Adj + Total VBs) × (Num of OQ forms)    (19)

P_oq, R_oq and F_oq are calculated over the 5 topics for baseline LM, LM with IN, SLM and SLM with IN based retrieval, using the Dirichlet and Jelinek-Mercer smoothing techniques, and are given in Table 9 and Table 10 respectively. In each column the best performing model and its corresponding query is highlighted.

Table 9: Results obtained using Dirichlet Smoothing
Model        Metric  UQ     SQ1    SQ2    SQ3    SQ4    SQ5    SQ6
Baseline LM  P_oq    0.116
             R_oq    1.000
             F_oq    0.207
SLM+M1       P_oq    0.125
             R_oq    1.000
             F_oq    0.222
SLM+M2       P_oq    0.156
             R_oq    1.000
             F_oq    0.270
LM+IN        P_oq    0.116  0.076  0.081  0.204  0.204  0.204  0.250
             R_oq    1.000  0.406  1.000  0.397  0.397  0.397  0.125
             F_oq    0.207  0.128  0.149  0.269  0.269  0.269  0.167
SLM+IN+M1    P_oq    0.125  0.272  0.093  0.333  0.333  0.333  0.250
             R_oq    1.000  0.343  1.000  0.375  0.375  0.375  0.125
             F_oq    0.222  0.304  0.171  0.352  0.352  0.352  0.167
SLM+IN+M2    P_oq    0.156  0.076  0.156  0.214  0.214  0.214  0.250
             R_oq    1.000  0.406  1.000  0.437  0.437  0.437  0.125
             F_oq    0.270  0.128  0.270  0.288  0.288  0.288  0.167

Table 10: Results obtained using Jelinek-Mercer Smoothing
Model        Metric  UQ     SQ1    SQ2    SQ3    SQ4    SQ5    SQ6
Baseline LM  P_oq    0.156
             R_oq    1.000
             F_oq    0.270
SLM+M1       P_oq    0.125
             R_oq    1.000
             F_oq    0.222
SLM+M2       P_oq    0.156
             R_oq    1.000
             F_oq    0.270
LM+IN        P_oq    0.156  0.076  0.093  0.214  0.214  0.214  0.250
             R_oq    1.000  0.406  1.000  0.397  0.397  0.397  0.125
             F_oq    0.270  0.128  0.171  0.278  0.278  0.278  0.167
SLM+IN+M1    P_oq    0.125  0.272  0.093  0.333  0.333  0.333  0.250
             R_oq    1.000  0.343  1.000  0.375  0.375  0.375  0.125
             F_oq    0.222  0.304  0.171  0.352  0.352  0.352  0.167
SLM+IN+M2    P_oq    0.156  0.076  0.167  0.214  0.214  0.214  0.250
             R_oq    1.000  0.406  1.000  0.437  0.437  0.437  0.125
             F_oq    0.270  0.128  0.286  0.288  0.288  0.288  0.167

6.3 Documents Ranking
We analyzed the efficiency of the OQs and models in retrieving the relevant documents first using mean average precision (MAP) scores. For this we calculated MAP@4 scores for all queries which retrieved at least 4 documents. Average MAP@4 scores over the 5 topics for baseline LM, LM with IN, SLM and SLM with IN based retrieval are given in Table 11 for Dirichlet smoothing and in Table 12 for Jelinek-Mercer smoothing. Figure 3 and Figure 4 show the MAP@4 scores for the 5 individual topics using the SLM+IN+M1 model and the different OQs under the two smoothing techniques.

Table 11: MAP@4 Values with Dirichlet Smoothing
Model        UQ     SQ1    SQ2    SQ3    SQ4    SQ5    SQ6
Baseline LM  0.277
SLM+M1       0.295
SLM+M2       0.285
LM+IN        0.277  0.294  0.275  0.310  0.310  0.320  0.322
SLM+IN+M1    0.295  0.314  0.295  0.325  0.320  0.392  0.275
SLM+IN+M2    0.285  0.312  0.283  0.314  0.310  0.361  0.275

Table 12: MAP@4 Values with Jelinek-Mercer Smoothing
Model        UQ     SQ1    SQ2    SQ3    SQ4    SQ5    SQ6
Baseline LM  0.284
SLM+M1       0.310
SLM+M2       0.300
LM+IN        0.284  0.298  0.282  0.302  0.310  0.320  0.310
SLM+IN+M1    0.310  0.348  0.324  0.315  0.320  0.351  0.294
SLM+IN+M2    0.300  0.334  0.310  0.315  0.315  0.344  0.294
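Below is a small Python sketch of a MAP@4 computation consistent with this section, under the assumption that standard average precision truncated at rank 4 is averaged over the eligible queries; the paper does not spell out the exact variant, so treat this as illustrative.

```python
def ap_at_k(ranked_docs, relevant_docs, k=4):
    """Average precision at cutoff k for one opinion query."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked_docs[:k], start=1):
        if doc in relevant_docs:
            hits += 1
            total += hits / rank
    return total / min(k, len(relevant_docs)) if relevant_docs else 0.0

def map_at_4(runs):
    """MAP@4 over queries that retrieved at least 4 documents;
    each run is a (ranked_docs, relevant_docs) pair."""
    eligible = [(r, rel) for r, rel in runs if len(r) >= 4]
    return (sum(ap_at_k(r, rel) for r, rel in eligible) / len(eligible)
            if eligible else 0.0)
```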
Figure 3: MAP@4 (Dirichlet) for SLM+IN+M1 Model

7. RESULT DISCUSSION
It can be observed from Table 9, which uses Dirichlet smoothing, that the best performing queries in terms of F-measure for retrieving opinions about opinion targets are SQ3, SQ4 and SQ5 of the SLM+IN+M1 model. These queries in the SLM+IN+M1 model outperform the LM+IN model by 26.6% in F-measure. A similar comparison between SQ3, SQ4 and SQ5 of the SLM+IN+M1 and SLM+IN+M2 models shows that the recall of SLM+IN+M1 is 16.5% lower, but it performs 22.2% better in F-measure and 55.6% better in precision than SLM+IN+M2. This contrasts with the subjective sentence extraction results: although the SLM+IN+M2 model had larger sentence coverage, the extracted sentences had weak subjective clues which only improved its recall, while the SLM+IN+M1 model had strongly subjective sentences which improved its precision and F-measure. We can also observe from Table 9 that the unstructured queries (UQ) achieve 100% recall in all models, but their precision varies between models and is lower than that of the structured queries within and across models. This can be attributed to the retrieval of documents that merely contain the query words but do not express opinions about the opinion targets, which clearly shows the need for structured queries to achieve better performance.

A similar analysis holds for Table 10, which uses Jelinek-Mercer smoothing. SQ3, SQ4 and SQ5 of the SLM+IN+M1 model outperform the other models and queries in terms of F-measure; they perform 30.8% better than the LM+IN model in F-measure even though their recall is 5.8% lower. This is because LM+IN has no constraints on sentence selection, while SLM+IN+M1 uses only subjective sentences. The same difference as under Dirichlet smoothing is observed between SLM+IN+M1 and SLM+IN+M2. The two smoothing methods show little difference in the best performing queries, with only minor differences in queries like SQ2. From Figure 3 and Figure 4 we can observe that opinion query SQ5 in the SLM+IN+M1 model performs better than the other queries in retrieving relevant documents first, for both smoothing techniques.
This correlates with the F-measure results, where we saw SQ5 outperform the other queries across models. In conclusion, SQ5 of SLM+IN+M1 can be used to mine opinions with decent accuracy when language resources are scarce.

Figure 4: MAP@4 (Jelinek-Mercer) for SLM+IN+M1 Model

8. CONCLUSION AND FUTURE WORK
In this paper, we treated opinion mining in resource scarce languages as a retrieval problem by leveraging comparable corpora. Adjectives, verbs and NEs, treated as opinions on opinion targets, are extracted from a resource rich language like English to mine resource scarce languages. Structured opinion queries formed from opinions and opinion targets are run against LM, LM with IN, SLM and SLM with IN retrieval models with different smoothing techniques to confirm their presence in documents. We found that SLM with IN performs better for opinion mining. In future work, more complex queries that improve F-measure will be explored. Also, language independent techniques need to be explored, as the current technique depends on dictionaries for translation and transliteration.

9. REFERENCES
[1] Balahur, A., and Steinberger, R. Rethinking sentiment analysis in news: from theory to practice and back. In Workshop on Opinion Mining and Sentiment Analysis (2009).
[2] Banea, C., Mihalcea, R., Wiebe, J., and Hassan, S. Multilingual subjectivity analysis using machine translation. In EMNLP (2008).
[3] Metzler, D., and Croft, W. B. Combining the language model and inference network approaches to retrieval.
[4] Takamura, H., Inui, T., and Okumura, M. Latent variable models for semantic orientations of phrases. In 11th Meeting of the European Chapter of the Association for Computational Linguistics (2006).
[5] Wiebe, J., Wilson, T., and Bell, M. Identifying collocations for recognizing opinions. In ACL Workshop on Collocation: Computational Extraction, Analysis, and Exploitation (2001).
[6] Lafferty, J., and Zhai, C. Document language models, query models, and risk minimization for information retrieval. In SIGIR (2001).
[7] Jia, L., Yu, C. T., and Zhang, W. UIC at TREC 2008 blog track. In TREC (2008).
[8] Kando, N., T. M., and Sakai, T. Introduction to the NTCIR-6 special issue. ACM TALIP 7(2) (2008).
[9] Arora, P., Bakliwal, A., and Varma, V. Hindi subjective lexicon generation using WordNet graph traversal. In Proceedings of CICLing (2012).
[10] Riloff, E., and Wiebe, J. Learning extraction patterns for subjective expressions. In EMNLP (2003), pp. 105-112.
[11] Gerani, S., Carman, M. J., and Crestani, F. Proximity-based opinion retrieval. In SIGIR (2010).
[12] Kim, S.-M., and Hovy, E. Automatic detection of opinion bearing words and sentences. In IJCNLP (2005).
[13] Kim, S.-M., and Hovy, E. Extracting opinions expressed in online news media text with opinion holders and topics. In Proceedings of the Workshop on Sentiment and Subjectivity in Text at COLING-ACL (2006).
[14] Kim, S.-M., and Hovy, E. Identifying and analyzing judgment opinions. In Proceedings of HLT/NAACL (2006).
[15] Ganesh, S., Harsha, S., Pingali, P., and Varma, V. Statistical transliteration for cross language information retrieval using HMM alignment model and CRF. In Workshop on CLIA: Addressing the Information Need of Multilingual Societies (2008).
[16] Strohman, T., Metzler, D., Turtle, H., and Croft, W. B. Indri: A language model-based search engine for complex queries. In International Conference on Intelligence Analysis (2004).
[17] Turtle, H., and Croft, W. B. Evaluation of an inference network based retrieval model.
[18] Ng, V., Dasgupta, S., and Arifin, S. M. N.
Examining the role of linguistic knowledge sources in the automatic identification and classification of reviews. In COLING/ACL (2006), pp. 611-618.
[19] Hu, Y., Duan, J., Chen, X., Pei, B., and Lu, R. A new method for sentiment classification in text retrieval. In IJCNLP (2005), pp. 1-9.
[20] Suzuki, Y., Takamura, H., and Okumura, M. Application of semi-supervised learning to evaluative expression classification. In 7th International Conference on Intelligent Text Processing and Computational Linguistics (2006).
[21] Yang, K. WIDIT in TREC 2008 blog track: leveraging multiple sources of opinion evidence. In TREC (2008).
[22] Zagibalov, T., and Carroll, J. Automatic seed word selection for unsupervised sentiment classification of Chinese text. In International Conference on Computational Linguistics (2008).
[23] Zhai, C., and Lafferty, J. A study of smoothing methods for language models applied to information retrieval. pp. 179-214.