Constructions represented in parallel with structural and
lexical items for identifying presence of attitude robustly
across domains in text
Jussi Karlgren, Gunnar Eriksson, Magnus Sahlgren, Oscar Täckström
ABSTRACT
This paper describes experiments that use non-terminological information to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect not only to the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also to structural features of the text, as represented by the presence of form words (in other approaches often removed by stop lists) and by the presence of constructional features (typically disregarded by most other analyses). In
our analysis, following a construction grammar framework,
structural features are treated as occurrences, similarly to
the treatment of vocabulary features. The constructional
features in play are chosen to potentially signify opinion but
are not specific to negative or positive expressions.
The framework is used to classify clauses, headlines, and
sentences from three different shared collections of attitudinal data. We find that constructional features transfer well
and show potential for generalisation across different text
collections.
1. ATTITUDE ANALYSIS IS MOSTLY BASED ON LEXICAL STATISTICS
Attitude analysis, a subtask of information refinement from texts, has gained interest in recent years, both for its application potential and for the promise of shedding new light on hitherto unformalised aspects of human language usage: the expression of attitude, opinion, or sentiment is a quintessentially human activity, and it is not explicitly conventionalised to the degree that many other aspects of language usage are.
Most attempts to identify attitudinal expression in text have been based on lexical factors. Resources such as SentiWordNet or the General Inquirer lexicon are utilised, or similar resources developed, by most research groups engaged in attitude analysis tasks [3, 12]. But attitude is not a solely lexical matter. Expressions with identical or near-identical terms can be more or less attitudinal by virtue of their form; combinations of fairly attitudinally loaded terms may lack
attitudinal power; certain terms considered neutral in typical language use can have strong attitudinal loading in certain discourses or at certain times.

SIGIR 2009, Boston, USA. Copyright 2009 ACM.
Our approach takes as its starting point the observation that lexical resources are always noisy, out of date, and most often suffer from being simultaneously too specific and too general. Not only are lexical resources inherently somewhat unreliable or costly to maintain, but they do not cover all the possibilities of expression afforded by human linguistic behaviour: we believe that attitudinal expression in text is not solely a lexical issue. We have previously tested resource-thrifty approaches for the annotation of textual materials, arguing that general-purpose linguistic analysis together with appropriate background materials for training a general language model provides a more general, more portable, and more robust methodology for extracting information from text.[?]1
2. CONSTRUCTIONS AS CHARACTERISTIC FEATURES OF UTTERANCES
Most categorisation features used for any type of text or text snippet categorisation are based on term occurrence. Utterances are seen as sequences or bags of words, "w1 w2 ... wi ... wj ... wn", and the observations of the w:s are subjected to frequency or occurrence analyses to yield features such as frequency features (is some term wi unusually frequent or infrequent?), cooccurrence features (is some combination of terms wi, wj unusually frequent or infrequent?), or equivalence classes (can some term wi be substituted for or generalised to a class or concept marker?).
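These three term-based feature types can be made concrete in a few lines of Python. The tiny corpus and the equivalence-class mapping below are invented for illustration only; they stand in for a real collection and, e.g., a thesaurus.

```python
from collections import Counter
from itertools import combinations

utterances = [
    "the film was a triumph",
    "the film was a disaster",
    "a triumph of direction",
]

# Bag-of-words view: each utterance is just a multiset of terms.
bags = [u.split() for u in utterances]

# Frequency features: how often does each term w_i occur overall?
term_freq = Counter(w for bag in bags for w in bag)

# Cooccurrence features: how often does a pair (w_i, w_j) share an utterance?
pair_freq = Counter(
    pair for bag in bags for pair in combinations(sorted(set(bag)), 2)
)

# Equivalence classes: substitute terms with a class or concept marker
# (a hand-made toy mapping standing in for a real lexical resource).
classes = {"triumph": "EVAL", "disaster": "EVAL"}
generalised = [[classes.get(w, w) for w in bag] for bag in bags]
```

Thresholding these counts ("unusually frequent or infrequent") then yields the binary or weighted features used by most term-based classifiers.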
Our hypotheses are that investigating utterances for the presence of content-bearing words may be useful for identifying attitudinal expressions, but that structural features carry over more easily from one topical area to another and from one discourse to another. We view utterances as more than the words that appear in them: the pattern of an utterance is an observable item in itself.
It has previously been suggested that attitude in text is carried by dependencies among words rather than by keywords, cue phrases, or high-frequency words [1]. We agree, but in contrast with previous work, we explicitly incorporate constructions in our knowledge representation, not as relations between terms but as features in their own right [8].
This paper describes an experiment to investigate the attitudinal power of linguistic constructions in utterances. It compares the effect of constructional features by using a test set together with a reasonably chosen background text collection, and then using the same method on a test set with different topical content.

1 Reference and details of experiment omitted for review.
For the experiments reported in this paper, no attitudinal lexical resources were used; only general-purpose linguistic analysis was employed to establish the constructions used in the further processing.
3. FEATURE SETS: TERMS AND CONSTRUCTIONS
The texts used in this experiment are viewed as sequences of
sentences: the sentence is taken as the basis of analysis, as a
proxy for the utterance we view as the basis for attitudinal
expression. All texts in this experiment are preprocessed by
a linguistic analysis toolkit2 , resulting in a lexical categorisation of each word and a full dependency parse for each
sentence.
From that analysis, three types of features are extracted
to represent sentences: content words (I), function words
(F ) and construction markers (K).
3.1 Content and function words
All words that are assigned a content part-of-speech category3 by the lexical analysis are considered members of the content word (I) class, and the base forms of such words are used as I features when they occur in a sentence.
All other words in a sentence, belonging to the remaining part-of-speech classes,4 are judged function words, and their base forms are used as F features in the sentence representation.
3.2 Construction markers
Besides the word occurrence based feature classes, we introduce a further feature class intended to capture aspects of the constructions employed in the sentence. Some of these constructional features (K) concern clause semantics and sentence or clause structure, such as the transitivity of the clauses in the sentence, the occurrence of objective that-clauses or relative clauses, the occurrence of predicate constructions, and the occurrence of manner, spatial, and temporal adverbials. Other construction markers concern morphological features such as the tense forms of verbs present in the sentence or the degree of comparison of occurring adjectives.
As in the case of the word-based features, these features are extracted from the linguistic analysis. Most of them are derived directly from the available information about the morphological or dependency status of a certain word in the sentence, while some require aggregating information from several words or different analysis levels. The palette of K features studied was chosen manually.

In this experiment all constructional K features are treated as sentence features, exactly as the lexical I and F features are treated, i.e., no coupling between the features and the words carrying them is performed.
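The extraction of the three feature sets can be sketched as follows. The parse structure below is a hand-made stand-in for the output of a dependency parser (the experiments use the Connexor FDG parser); the part-of-speech tags and construction names are illustrative, not the toolkit's actual inventory.

```python
# Content part-of-speech categories, per footnote 3 (tag names assumed).
CONTENT_POS = {"N", "A", "V", "ADV", "ABBR", "NUM", "INTERJ", "NEG"}

def extract_features(parsed_sentence):
    """Split a parsed sentence into I, F and K feature lists."""
    I, F = [], []
    for word in parsed_sentence["words"]:
        if word["pos"] in CONTENT_POS:
            I.append(word["base"])   # content word, base form
        else:
            F.append(word["base"])   # function word, base form
    # Construction markers come from clause-level analysis; here they are
    # read off a precomputed list, as if aggregated from the parse.
    K = list(parsed_sentence["constructions"])
    return I, F, K

sentence = {
    "words": [
        {"base": "it", "pos": "PRON"},
        {"base": "be", "pos": "V"},
        {"base": "this", "pos": "PRON"},
        {"base": "i", "pos": "PRON"},
        {"base": "think", "pos": "V"},
    ],
    "constructions": ["Predicative", "Present tense", "That subclause"],
}
I, F, K = extract_features(sentence)
```

Note that the K list carries no link back to the words that triggered each marker, mirroring the uncoupled treatment described above.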
2 The Connexor Functional Dependency (FDG) parser for English [15].
3 In this experiment nouns, adjectives, verbs (including verbal uses of participles), adverbs, abbreviations, numerals, interjections, and negation are considered content words.
4 Prepositions, determiners, conjunctions, pronouns, ...
4. TEST DATA
We base our experiment on data used in the English section of the opinion analysis task of the NTCIR information retrieval evaluation challenge organised by NII, Tokyo. The data have been used by several research groups in a shared task over the last two workshops (NTCIR 6 and NTCIR 7), and we make use of the assessments for this experiment. In comparison, our classifier appears to tie with the best reported result from the shared opinion identification task. (The NTCIR task also involved opinion classification, identifying the polarity of the expressed opinion. We have not attempted that task here: it arguably has a stronger lexical base than the task of identifying whether any attitude is expressed at all.)
For generalisation we added to the NTCIR test sentence
set the multi-perspective question answering (MPQA) test
sentence set with assessed attitudinal sentences [13] and the
2007 Semantic Evaluation Affective Task (SEMEVAL) test
set of news headlines [14], both of which have assessments
by human judges. We use a lenient scoring scheme, scoring a sentence as attitudinal if two out of three NTCIR judges have marked it attitudinal, and, for the SEMEVAL data, if the intensity score is over 50 or under -50. All attitudinal sentences or headlines, irrespective of source, are assigned the class att, and all other sentences the class noatt. Statistics for the collection are given in Table 1. Some sentences from the MPQA and NTCIR test sets, about ten in total, yielded no analyses and were removed from the test set.
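The lenient labelling scheme can be sketched as two small functions; the record formats are our own assumptions, not those of the actual NTCIR and SEMEVAL distributions.

```python
def label_ntcir(judge_marks):
    """att if at least two of the three NTCIR judges marked the sentence.

    judge_marks: sequence of three 0/1 flags, one per judge (format assumed).
    """
    return "att" if sum(judge_marks) >= 2 else "noatt"

def label_semeval(intensity):
    """att if the SEMEVAL intensity score is over 50 or under -50."""
    return "att" if intensity > 50 or intensity < -50 else "noatt"
```

Whatever the source, the result is the same binary att/noatt label used throughout the experiments.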
5. BACKGROUND WORD SPACE MODEL
Our experiment is based on a background language representation built by analysing a reasonably sized general text collection. We then use that model to establish similarities and differences between the sentences under analysis. Our aim is to investigate how the utterance or sentence under consideration relates to normal language usage, either by deviating from the norm in some salient way or by conforming to an identified model of usage.
In this experiment we use one year of newsprint from two Asian English-language news sources, the Korean Times and the Mainichi Daily, with collection sizes as shown in Table 2. The collections are distributed as part of the NTCIR information retrieval evaluation challenge and have been used by several participants for training language models for NTCIR tasks, among them an opinion and attitude analysis task [6, 11]. As a control collection, we use one year of the Glasgow Herald, distributed as part of the CLEF information retrieval evaluation challenge [2].
For the background text material, we segment the text into sentences and process each sentence to extract the features given above: I, F, and K. This gives us a high-dimensional feature space. We use this to build a cooccurrence-based first-order word space [10, 9], with all three types of features treated alike, using random indexing [7] for dimension reduction. In this word space, or feature space, each feature is accorded a position in a vector space based on which other features it cooccurs with in the training sentences.

Initially, each sentence is given a thousand-dimensional representation vector with two randomly chosen non-null elements {1, -1}. Each feature is also given an initially empty context vector of the same dimensionality. This context vector is trained by scanning through each sentence in turn:
  "It is this, I think, that commentators mean when they say glibly that the 'world changed' after Sept 11."
  I: be think commentator mean when say glibly world change sept 11
  F: it this i that they that the after
  K: Adverbial of time, Adverbial of manner, That subclause, Predicative, Intransitive clause, Transitive clause, Transitive mix, Present tense, Past tense, Tense shift

  "President Hafez Al-Assad has said that peace was a pressing need for the region and the world at large and Syria, considering peace a strategic option would take steps towards peace."
  I: president hafez al-assad have say peace be pressing need region world at large syria consider peace strategic option would take step peace
  F: that a for the and the and a towards
  K: Adverbial of manner, That subclause, Predicative, Intransitive, Transitive clause, Transitive mix, Present tense, Past tense, Tense shift, Verb chain

  "Mr Cohen, beginning an eight-day European tour including a Nato defence ministers' meeting in Brussels today and tomorrow, said he expected further international action soon, though not necessarily military intervention."
  I: mr cohen begin eight-day european tour include nato defence minister meeting brussels today tomorrow say expect international action soon though not necessarily military intervention
  F: an a in and he further
  K: Adverbial of time, Adverbial of place, That subclause, Transitive clause, Past tense

Figure 1: Example attitude analyses of sentences. These sentences are taken from the NTCIR opinion analysis task data set. The first two sentences are assessed by task judges to be opinion carriers, the last non-opinion. The content word feature "say" is a strong marker for opinion but would yield the wrong categorisation in this case; our linear classifier correctly identified the first two sentences as attitudinal and the last as non-attitudinal.
                     NTCIR 6   NTCIR 7   SEMEVAL     MPQA
  Attitudinal          1 392     1 075        76    6 021
  Non-attitudinal      4 416     3 201       174    4 982
  Total                5 808     4 276       250   11 003

Table 1: Test sentence statistics
for each feature present, the representation vector for that sentence is added to the feature's context vector. Thus, each feature will in its context vector carry information about every sentence it has occurred in, and the context vectors of features that have cooccurred will grow to be similar. We use this word space, or feature space, to generalise from observed features in a sentence to other features and to establish similarities between sentences based on their feature values, even when there is little or no feature overlap. Features which occurred only once in the data were removed.

                   Sentences   Characters
  Korean Times       326 486         61M
  Mainichi Daily     123 744         25M
  Glasgow Herald   2 158 196        452M

Table 2: Background text materials

[Figure 2: F1-score (ATT) and relative weights of the feature types I, F, and K in the identification step, plotted against the number of retained features. I features are removed in favor of F and K features.]
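The random indexing scheme just described can be sketched as follows. The toy sentences and feature names are invented; real use would stream the full background collection, and the removal of once-occurring features is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 1000  # dimensionality used in the experiments

def index_vector():
    """Sparse random sentence vector: two non-null elements from {1, -1}."""
    v = np.zeros(DIM)
    i, j = rng.choice(DIM, size=2, replace=False)
    v[i], v[j] = 1.0, -1.0
    return v

# Each sentence is a list of I, F and K features, all treated alike.
sentences = [
    ["film", "be", "Predicative", "Present tense"],
    ["film", "disaster", "be", "Predicative"],
]

# Training: add each sentence's index vector into the context vector of
# every feature occurring in that sentence, so features sharing sentences
# end up with similar context vectors.
context = {}
for feats in sentences:
    sent_vec = index_vector()
    for f in feats:
        context.setdefault(f, np.zeros(DIM))
        context[f] += sent_vec
```

Because the index vectors are sparse and nearly orthogonal, the fixed dimensionality stands in for an explicit feature-by-sentence cooccurrence matrix at a fraction of its size.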
6. INVESTIGATING RELATIVE FEATURE STRENGTH

  Feature                              Weight
  Tense shift                           1.874
  Past tense                            1.375
  Verb particle                         1.356
  Transitivity mix                      1.338
  Adverbial of quantity                 1.206
  Predicative                           1.082
  Quantifier                            1.080
  Non-transitive clause                 0.943
  Negation                              0.938
  Adverbial of time                     0.902
  Transitive clause                     0.881
  Prepositional modifier                0.862
  Non-adverbial prepositional phrase    0.774
  Comparative adjective                 0.738
  Verb chain                            0.712
  Present tense                         0.468
  Adverbial of condition                0.285

Table 3: Provisional analysis of relative feature weight; only K features shown. The weight given does not indicate whether the feature is loaded towards attitudinal or towards non-attitudinal sentences.
In order to improve generalization ability and to gain some insight into which features show most utility for attitude identification, we first performed some exploratory analyses on the NTCIR 6 and 7 test sets, using NTCIR 6 as labeled training material and testing on NTCIR 7 data, without using the background materials.
We found that the relative scoring of the strongest features in the discrimination model ranked certain of our manually chosen features very highly compared to I and F features. A list of the highest-scoring K features is given in Table 3. Tense and transitivity measures, e.g., scored highly: "Tense shift", the strongest single K feature, is found in sentences where the verbs of the main clause and the subordinate clause have different tense forms. This often occurs in sentences of utterance or cognition: "Noam Chomsky said [past] that what makes human language unique is [present] recursive centre embedding"; "M.A.K. Halliday believed [past] that grammar, viewed functionally, is [present] natural". The tense shift feature obviates the need to acquire and maintain lists of utterance, pronouncement, and cognition verbs, categories which have an obvious relation to attitudinal expression.
We also tested the relative strength of the features using a feature selection technique, SVM-RFE (Support Vector Machine Recursive Feature Elimination) [5]. SVM-RFE exploits the duality between the feature space and the instance space in linear discriminant models. The feature selection is conducted as a backward elimination procedure, at each iteration removing the feature with the least influence on the decision boundary. The advantage of this feature selection algorithm over traditional algorithms based on, e.g., mutual information is that inter-dependencies between features are taken into account. This is important, since it might well be the case that a certain construction or function word is only informative when combined with lexical information. Due to its greedy nature, SVM-RFE only finds a locally optimal feature set. Still, it can give valuable information on general characteristics of the problem at hand.
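The backward elimination loop at the heart of SVM-RFE can be sketched as follows. To keep the snippet self-contained, an ordinary least-squares linear model stands in for the linear SVM: at each iteration the surviving feature with the smallest absolute weight is removed. The data are synthetic.

```python
import numpy as np

def rfe(X, y, n_keep):
    """Backward feature elimination in the spirit of SVM-RFE [5].

    A least-squares linear fit stands in for the SVM's linear
    discriminant; the feature with the smallest |weight| is taken to
    have the least influence on the decision boundary and is removed.
    """
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        w, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        weakest = int(np.argmin(np.abs(w)))
        del active[weakest]
    return active

# Toy data: the target depends on features 0 and 2 only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] - 2 * X[:, 2] + 0.1 * rng.normal(size=200)
kept = rfe(X, y, n_keep=2)
```

Because each removal re-fits the model on the surviving features, a feature that only matters jointly with others can survive rounds in which a univariate criterion would already have discarded it.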
The results of the feature selection give us an indication of the utility of the different feature types. As illustrated by Figure 2, constructions (K) and function words (F) can be used instead of content words (I): the weight of the K feature set rises towards the end, when only a few features are retained. This is in keeping with our hypothesis that using K features yields a classifier with better generalization ability.
7. CLASSIFICATION EXPERIMENT
Each test sentence was represented by the centroid of its feature set in the respective background feature space. On the centroids, we trained a support vector machine with a linear kernel, as implemented in the open-source liblinear library [4], to build a classifier for the att-noatt distinction.
After some initial parameter-setting tests, data were scaled to a range of approximately -1...1 and standard settings were used; since the class sizes were unbalanced, the penalty weights for the classes were set in inverse proportion to their sizes.
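The two preprocessing steps can be sketched as follows; the function names are ours, and in practice liblinear's own per-class weight options would carry the class penalties.

```python
import numpy as np

def scale_features(X):
    """Scale each feature column to approximately the range -1..1."""
    max_abs = np.abs(X).max(axis=0)
    max_abs[max_abs == 0] = 1.0  # leave all-zero columns untouched
    return X / max_abs

def class_weights(labels):
    """Penalty weights inversely proportional to class sizes."""
    counts = {c: labels.count(c) for c in set(labels)}
    total = len(labels)
    return {c: total / n for c, n in counts.items()}

labels = ["att"] * 2 + ["noatt"] * 6
w = class_weights(labels)   # the minority class gets the larger penalty
X = scale_features(np.array([[2.0, -4.0], [1.0, 8.0]]))
```

Weighting the minority class more heavily keeps the classifier from buying accuracy by defaulting to the majority noatt label.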
We then ran five-fold cross-validation on each set to establish classification performance for the feature sets. This test was performed for each of the I, F, and K sets and combinations thereof, yielding seven feature combinations for each set of test sentences and each background collection.
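The experimental grid can be sketched as follows; the simple interleaved fold split is a minimal illustration, and the actual folds may well have been randomised.

```python
from itertools import combinations

# The seven non-empty combinations of the three feature sets.
SETS = ["I", "F", "K"]
combos = ["".join(c) for r in (1, 2, 3) for c in combinations(SETS, r)]

def five_fold(indices):
    """Split instance indices into five folds for cross-validation."""
    return [indices[i::5] for i in range(5)]

folds = five_fold(list(range(10)))
```

Each of the seven combinations is then trained and tested once per fold, per test set, and per background collection.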
The NTCIR classification results are shown in Table 4.
We find that the combination of all three feature sets gives consistently high results; that there appears to be a fair overlap between the various feature sets; and that the results are stable across the two background collections. We also find that the K feature set consistently outperforms F and I, albeit by a slim margin: since the test sentences and the background texts are from similar sources (Asian, English-language newsprint) and have similar characteristics in many respects, the K features, tied to the particular genre at hand, model this variation quite faithfully.
The same processing was then performed on the two other sets of test sentences, MPQA and SEMEVAL. The classification performance is given in Table 5. News headlines (the SEMEVAL set) quite obviously give less purchase for classification. The tables given here do not provide full precision and recall for each set,5 but the low performance reported for the SEMEVAL set with the F feature set is related to very low recall. This is not surprising, since the language of news headlines is typically quite terse, with structural cue words omitted. The MPQA set, by contrast, provides a much more stable result, and somewhat contrary to our hypotheses, the I features carry over reasonably well to this set.
Moving to the Glasgow Herald, the third background collection, unrelated to any of the four test sentence sets, we find that the NTCIR test sentences behave much as the MPQA and SEMEVAL test sentences do with the NTCIR background collections: the primacy of the I feature set is no longer a given.
8. TAKE-HOME MESSAGE
With increasingly sophisticated semantic relations being mined from data, processing must take a more sophisticated view of the linguistic signal than as a simple container of topical words. Many approaches begin by assuming the structure of the linguistic data to be primarily relations between topical elements. In constructional approaches, the constructions, the combinational patterns themselves, are accorded presence in the signal; these experiments are intended to examine their potential to capture attitudinal expression.

We find that representing constructions, even hand-chosen constructions such as the ones given in this experiment, and especially given unrelated general-language background data, can provide a reliability which well matches or even surpasses that of word occurrence, the arguably primary carrier of information in the linguistic signal.
Using constructions in parallel with word occurrence features not only has theoretical motivation from the construction grammar framework, but also provides a convenient and familiar processing model and a straightforward extension for term-based models. From a philological standpoint, a bottom-up approach to data analysis, examining the power of constructions as ontological items, would appear to be better motivated than basing information processing on descriptive language models originally intended for the description of human behaviour, for comparative studies of world languages, or for the scholarly instruction of foreign languages.

5 Originally we thought the F1 score would suffice, since for almost all measurements the relative ranks of results are the same for F1, precision, and recall, and F1 thus illustrates the relative performance of the various feature sets more lucidly. The full information should probably be given in the final version of the paper, even at the cost of cluttering up the tables.
                         NTCIR 6                        NTCIR 7
                   Korean Times  Mainichi Daily   Korean Times  Mainichi Daily
  I                    43.2          40.0             44.6          42.2
  F                    42.0          42.2             44.2          44.6
  K                    43.2          42.8             45.7          45.7
  IF                   46.0          44.2             48.2          46.9
  IK                   46.0          44.0             47.8          46.6
  FK                   44.5          44.2             46.5          46.0
  IFK                  48.4          46.7             50.0          48.7
  Precision range      34-41         34-40            35-42         35-40
  Recall range         52-60         45-58            55-65         52-65

Table 4: Classification performance (F1) for the NTCIR data set
                          MPQA                          SEMEVAL
                   Korean Times  Mainichi Daily   Korean Times  Mainichi Daily
  I                    65.4          61.0             32.7          32.1
  F                    62.6          62.3             15.4          16.7
  K                    63.1          62.4             30.0          34.3
  IF                   67.2          64.6             40.0          37.2
  IK                   67.8          65.1             32.7          31.2
  FK                   64.0          63.5             38.4          36.9
  IFK                  68.6          66.5             39.8          38.0
  Precision range      66-74         66-73            20-32         24-29
  Recall range         56-64         55-61            12-54         12-55

Table 5: Classification performance (F1) for MPQA and SEMEVAL test sentences
                     NTCIR 6   NTCIR 7   MPQA    SEMEVAL
  I                    69.8      45.2     63.8     42.4
  F                    68.8      47.6     65.4     40.4
  K                    70.9      43.6     63.7     33.8
  IF                   69.9      47.4     67.3     41.4
  IK                   69.9      48.6     67.0     38.7
  FK                   67.9      47.9     68.0     37.5
  IFK                  67.7      48.6     69.2     41.8
  Precision range      37-42     37-41    71-75    26-34
  Recall range         49-55     53-60    57-63    46-57

Table 6: Classification performance (F1) for all test sets with the Glasgow Herald background data
9. REFERENCES
[1] X. Bai, R. Padman, and E. Airoldi. On learning
parsimonious models for extracting consumer
opinions. In Proceedings of HICSS-05,the 38th Annual
Hawaii International Conference on System Sciences,
page 75b, Washington, DC, USA, 2005. IEEE
Computer Society.
[2] M. Braschler and C. Peters. Cross-language evaluation
forum: Objectives, results, achievements. Information
Retrieval, pages 7–31, 2004.
[3] A. Esuli and F. Sebastiani. SentiWordNet: A publicly
available lexical resource for opinion mining. In
Proceedings of the Fifth International Conference on
Language Resources and Evaluation (LREC 2006),
2006.
[4] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang,
and C.-J. Lin. LIBLINEAR: A library for large linear
classification. Journal of Machine Learning Research,
9:1871–1874, 2008.
[5] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik. Gene
selection for cancer classification using support vector
machines. Machine Learning, 46(1-3):389–422, 2002.
[6] N. Kando. Overview of the Seventh NTCIR Workshop. In
Proceedings of the 7th NTCIR Workshop Meeting on
Evaluation of Information Access Technologies:
Information Retrieval, Question Answering, and
Cross-Lingual Information Access. NII, Tokyo, 2008.
[7] P. Kanerva, J. Kristofersson, and A. Holst. Random
indexing of text samples for latent semantic analysis.
In Proceedings of the 22nd Annual Conference of the
Cognitive Science Society, page 1036. Erlbaum, 2000.
[8] J.-O. Östman and M. Fried, editors. Construction
Grammars: Cognitive grounding and theoretical
extensions. John Benjamins, Amsterdam, 2005.
[9] M. Sahlgren. The Word-Space Model. PhD thesis,
Stockholm University, Department of Linguistics,
2006.
[10] H. Schütze. Word space. In S. Hanson, J. Cowan, and
C. Giles, editors, Advances in Neural Information
Processing Systems 5. Morgan Kaufmann Publishers,
1993.
[11] Y. Seki, D. K. Evans, L.-W. Ku, L. Sun, H.-H. Chen,
and N. Kando. Overview of multilingual opinion
analysis task at NTCIR-7. In Proceedings of the 7th
NTCIR Workshop Meeting on Evaluation of
Information Access Technologies: Information
Retrieval, Question Answering, and Cross-Lingual
Information Access. NII, Tokyo, 2008.
[12] P. Stone. Thematic text analysis: new agendas for
analyzing text content. In C. Roberts, editor, Text
Analysis for the Social Sciences, chapter 2. Lawrence
Erlbaum Associates, 1997.
[13] V. Stoyanov, C. Cardie, D. Litman, and J. Wiebe.
Evaluating an opinion annotation scheme using a new
multi-perspective question and answer corpus. AAAI
Spring Symposium, Stanford University, California,
Mar. 2004. AAAI Technical Report Series SS-04-07.
Menlo Park: AAAI Press. ISBN 978-1-57735-219-x.
[14] C. Strapparava and R. Mihalcea. Semeval-2007 task
14: Affective text. In Proceedings of the Fourth
International Workshop on Semantic Evaluations
(SemEval-2007), pages 70–74, Prague, Czech
Republic, June 2007. Association for Computational
Linguistics.
[15] P. Tapanainen and T. Järvinen. A non-projective
dependency parser. In Proceedings of the 5th
Conference on Applied Natural Language Processing,
pages 64–71, 1997.