Our goal is to automatically annotate relevant speculative sentences

Annotation guidelines
1) Goal
Our goal is to annotate in biological scientific papers all biological speculative sentences (i.e. sentences
containing at least one speculative fragment dealing with a biological problem or question). In this task, we
consider only sentences with some clear instances of speculative language (the sentence must contain at least one
linguistic element expressing speculation). We also want to categorize them into “old speculation” and “new
speculation”.
2) Definition of biological speculations in papers
According to our analysis, two types of statements in a biological paper can be schematically contrasted
according to their degree of uncertainty:
1. Demonstrated statements, which are established facts accepted by the scientific community or by the
authors of the paper. These can be, for example, biological results, data or observations;
2. Speculations (non-demonstrated statements), which are proposals about a biological problem or question
that are explicitly presented as not certain in the paper. These can be, for example, hypotheses,
interpretations or possible explanations of a fact.
Between these two types of statements, there are other types of statements (deductions, conclusions,
argumentation, discussions…), which can be characterized as intermediary, because they present things as more
or less certain or they do not make a proposal. For these reasons, all these intermediary statements are NOT
considered speculative.
3) Annotation of speculative sentences: description of the task
All speculations in the following examples are underlined.
Firstly, note that a sentence has to be annotated as a speculative sentence even if the speculation is only a
fragment of the sentence and not the entire sentence. Indeed, a sentence can either express only a speculation
(first sentence), or combine the description of a result, which is a demonstrated statement, and its
interpretation, which is a speculation (second sentence):
1. “The purpose of this study was to test whether two zeolites produced synthetically (products of zeolitic
nature, PZN) could influence either the yield of a diatom culture or the chemical changes in the
cultures.”
2. “In contrast, when SIT protein levels peaked during valve synthesis, mRNA levels decreased,
suggesting either increased translational efficiency or an inhibition of protein degradation.”
Secondly, consider that we want to annotate ONLY relevant speculative sentences, i.e. speculative sentences
carrying information about the content of the speculation. This information can be:
a) Detailed speculation:
1. “These data suggest that, at least in transient transfection assays, cohesins contribute to
CTCF-dependent insulator function.”
2. “Interestingly, p97/VCP knockdown led to a significant increase in HSF1–HSP90 association, strongly
suggesting that p97/VCP controls the basal level of complex formation.”
3. “Accordingly, we hypothesized that ubiquitin binding by HDAC6 could act as a sensor of misfolded
protein accumulation and elicit HSF1 activation and heat-shock gene response through a mechanism
involving HSP90.”
4. “This hypothesis states that the interaction of silaffins with LCPA inside the SDV leads to the build up
of a nanopatterned organic matrix, the structure of which is controlled by the silaffins.”
b) Or just the name or a part of the speculation:
1. “The hypothesis proposed here for degradation of chrysolaminaran has implications for the generation
of glucosyl phosphate intermediates from an energy storage glucan.”
2. “Based on this, we propose the following model to explain the kinetic properties of diatom Si(OH)4
uptake (Fig. 8).”
3. “Elasticity maps and histograms were generated by analysing the force curves according to the Hertz
theory for elastic media (Hertz, 1881), using a conical tip geometry:”
4. “This would seem to provide further evidence for the theory of Gross and Meyer [119] for a divergence
in algal metabolism based on InDH, but the other P. tricornutum sequence (34720) was found within
the clade formed by the red algal sequences.”
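The requirement that a sentence contain at least one linguistic element expressing speculation can be approximated
by a lexical pre-filter. The following is a minimal sketch, not part of the guidelines: the cue list is an assumption
collected from the example sentences in this document, and the function names are hypothetical. A cue is only a
necessary condition; a sentence that passes the filter still has to be checked against the exclusions in section 5.

```python
import re
from typing import List

# Hypothetical cue stems, collected from the example sentences in these
# guidelines. This list is an assumption, not an official lexicon.
SPECULATION_CUES = [
    r"suggest\w*", r"hypothes\w*", r"propos\w*", r"speculat\w*",
    r"assum\w*", r"possib\w*", r"theor(?:y|ies)", r"models?",
    r"may", r"might", r"could", r"whether",
]
CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(SPECULATION_CUES) + r")\b", re.IGNORECASE
)


def find_speculation_cues(sentence: str) -> List[str]:
    """Return the lower-cased cue words found in the sentence, in order."""
    return [match.group(0).lower() for match in CUE_PATTERN.finditer(sentence)]


def is_candidate(sentence: str) -> bool:
    """True when the sentence contains at least one cue (necessary, not sufficient)."""
    return bool(find_speculation_cues(sentence))
```

Such a filter over-generates by design: a sentence like “Further experiments are required to test this hypothesis
more rigorously.” contains a cue but is excluded later as non-informative, and a capitalized “May” (the month)
would also match.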
4) Categorization of speculative sentences
All extracted speculative sentences have to be categorized into “old speculation” or “new speculation”.
“Old speculation” is a speculative sentence cited in the paper but presented as having been proposed
previously in another paper. Examples:
1. “Zinc has been proposed to be an essential component of Si transport (Rueter and Morel, 1981) by
binding to a functional site on SITs and facilitating the formation of a ternary complex with silicic acid
(Sherbakova et al., 2005).”
2. “In algae, it has been suggested that two types of glycolate-oxidizing enzymes exist: a glycolate
oxidase in Chrysophyceae, Eustigmophyceae, Raphidophyceae, Xanthophyceae and
Rhodophyceae, and a glycolate dehydrogenase in Chlorophyceae, Prasinophyceae, Cryptophyceae
and Bacillariophyceae [68].”
3. “Previous models of DLA have led to the formation of highly branched fractal-like aggregates.”
4. “These recent results with Si and monocots bring not only further support to the theory that Si plays
an active role in protecting plants against pathogens, but indicate that this role is not specific to
dicots but rather generalized to the plant kingdom.”
“New speculation” is a speculative sentence presented for the first time in the paper or not explicitly
presented as old speculation. Examples:
1. “First, with respect to macronutrient uptake and ecosystem dynamics, we hypothesize that in addition
to magnitude, the stoichiometry of macronutrient and Fe supply to HNLC surface waters is equally
critical in determining whether blooms are transient (weeks) or sustained (months).”
2. “The low levels of TpSIT3 mRNA compared to those of TpSIT1 and TpSIT2 mRNA seen in both
synchronized cultures (Fig. 4) and a culture gradually starved for silicon (data not shown) lead us to
speculate that TpSIT3 may serve as a silicon sensor in T. pseudonana.”
3. “It is assumed in this study that silicon layers in epidermal cell walls can confer enhanced host
resistance to blast.”
4. “A second possibility is that iron is directly incorporated into the silica cell wall in a regulated manner.”
5. “Strong trophic coupling and inefficient organic export may be general characteristics of community
perturbation responses in the warm waters of the Pacific Ocean.”
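The categorization rule is asymmetric: a speculative sentence is “new” unless it is explicitly presented as having
been proposed before. A minimal sketch of that default, assuming a small set of attribution markers, could look as
follows. The marker patterns and the function name are hypothetical illustrations taken from the wording of the
examples above; they are not an exhaustive or validated list, and real attribution often needs human judgement.

```python
import re

# Hypothetical markers of attribution to earlier work (an assumption based
# on the wording of the "old speculation" examples in these guidelines).
OLD_MARKERS = [
    r"\b(?:has|have) been (?:proposed|suggested|hypothesi[sz]ed)\b",
    r"\bprevious(?:ly)?\b",
    r"\bthe theory (?:of|that)\b",
]
OLD_PATTERN = re.compile("|".join(OLD_MARKERS), re.IGNORECASE)


def categorize(speculative_sentence: str) -> str:
    """Label an already-identified speculative sentence as old or new.

    "new speculation" is the default, mirroring the rule that a speculation
    not explicitly presented as old is treated as new.
    """
    if OLD_PATTERN.search(speculative_sentence):
        return "old speculation"
    return "new speculation"
```

The function assumes its input has already been accepted as a relevant speculative sentence; it does not decide
whether a sentence is speculative in the first place.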
5) Non-speculative sentences
In order to better clarify what a speculative sentence is, we present here some of the most frequent types of
sentences which are NOT speculative.
a) Results or conclusions: the opposite of speculations
The following sentences express results, data, observations or conclusions, and are NOT considered
speculative sentences. Some of them contain a specific expression of a result (“we observed that”, “we found”,
“demonstrated that”), whereas the others just express a result or a conclusion without indicating whether it
has been completely demonstrated.
1. “Several enzymes have been implicated in the generation of ROS during defense responses in a
number of plant-pathogen interactions.”
2. “NADPH oxidase, known as the respiratory burst oxidase, catalyzes the production of O2− by the
one-electron reduction of molecular oxygen using NADPH as an electron donor.”
3. “We observed that the CaPO2 gene was strongly induced during the incompatible interaction of
pepper plants with the Xcv avirulent strain Bv5-4a (Fig. 2B).”
4. “Silencing of the CaPO2 gene compromised not only oxidative bursts and the HR in local infected
leaves but also reduced microbursts and micro-HR in uninoculated secondary leaves 24 h after
infection with the Xcv avirulent strain Bv5-4a (Fig. 6A).”
5. “In addition, we found abnormally enlarged nuclei in some cells of AMPK-GLC embryos.”
6. “Mitotic chromosome staining with anti-phospho-histone H3 (PH3) antibody demonstrated that
AMPK-GLC embryos frequently contained defective mitotic cells with lagging or polyploid chromosomes.”
7. “Notably, the larval brains of MRLC loss-of-function mutants (spaghetti-squash1) showed extensive
polyploidy (40% of mitotic neuroblasts)14, and their imaginal discs showed severe disorganization in
epithelial structure (Supplementary Fig. 13), similar to those of lkb1- and AMPK-null mutants.”
8. “Two major auditory afferents of the LA (cortico-amygdala and thalamo-amygdala) form two distinct
types of synapses on the same target principal neurons (14).”
9. “Western blot analysis with a pan-neuroligin antibody revealed that the neuroligins were expressed in
the amygdala of rat brain in both perinatal and postnatal stages and that their expression is slightly
up-regulated during postnatal development (Fig. 1 A).”
10. “These phenotypic similarities further support our conclusion that MRLC is an important functional
mediator of LKB1 and AMPK.” This last sentence could arguably be read as a speculation rather than a
result, but the word “conclusion” is very strong and signals an established fact.
b) Inference
The following sentences express an inference from observable evidence or from laws or general
knowledge. Inference is NOT considered speculation.
1. “Thus, Ezh2 must be the RNA-binding subunit of PRC2 (fig. S1).”
2. “Therefore, RepA, together with PRC2, is required for the initiation and spread of XCI.”
c) Argumentation
Argumentative sentences are NOT considered as speculation.
1. “We found evidence that gene variants related to iron metabolism increase the impacts of low-level
lead exposure on the prolonged QT interval.”
d) Open questions
The following sentences are NOT considered speculation, because they only ask an open question about a
biological problem without proposing a mechanism.
1. “How endocytosis of DI leads to the activation of N remains to be elucidated”
2. “Biochemical analysis of the ubiquitination events regulated by D-mib will be needed to further define
the mechanism by which D-mib regulates the endocytosis of Ser in vivo.”
e) Non-informative sentences
A sentence that discusses a speculation without being informative about the content of the speculation is
NOT considered a speculation, because we are not interested in whether a speculation is purely
speculative, supported by some facts, or even demonstrated.
1. “Further experiments are required to test this hypothesis more rigorously.”
2. “The observation, in places, of thin, amber-coloured films (the cyst walls) surrounding this type of
silt grain (best seen in tapered wedges at the edge of thin sections) supports this assumption.”
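The order of decisions described in these guidelines can be summarized as a small record type and one function.
This is only an illustrative sketch: the class names, field names and the `annotate` function are hypothetical, and
every field is a judgement supplied by a human annotator (or an upstream classifier), not something the code derives
by itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Exclusion(Enum):
    """Sentence types from section 5 that are NOT annotated."""
    RESULT_OR_CONCLUSION = "result or conclusion"
    INFERENCE = "inference"
    ARGUMENTATION = "argumentation"
    OPEN_QUESTION = "open question"
    NON_INFORMATIVE = "non-informative"


class Category(Enum):
    OLD = "old speculation"
    NEW = "new speculation"


@dataclass
class Judgement:
    """Judgements about one sentence (all field names are hypothetical)."""
    has_speculative_cue: bool
    exclusion: Optional[Exclusion] = None
    attributed_to_earlier_work: bool = False


def annotate(judgement: Judgement) -> Optional[Category]:
    """Return the category, or None when the sentence is not annotated."""
    # A sentence without any speculative cue, or one that falls into an
    # excluded type, is not annotated at all.
    if not judgement.has_speculative_cue or judgement.exclusion is not None:
        return None
    # "New" is the default unless the speculation is explicitly attributed
    # to earlier work.
    if judgement.attributed_to_earlier_work:
        return Category.OLD
    return Category.NEW
```

For instance, `annotate(Judgement(True, Exclusion.NON_INFORMATIVE))` gives `None`, while
`annotate(Judgement(True))` gives `Category.NEW`.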