behavioral and ERP indicators of truth evaluation processes

doi:10.1093/scan/nss042
SCAN (2013) 8, 647–653
Validating the truth of propositions: behavioral and
ERP indicators of truth evaluation processes
Daniel Wiswede,1,2 Nicolas Koranyi,1 Florian Müller,1 Oliver Langner,1 and Klaus Rothermund1
1Friedrich-Schiller Universität Jena, Department of General Psychology, Am Steiger 3, Haus 1, D-07743 Jena, and 2Department of Neurology,
University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany
We investigated processes of truth validation during reading. Participants responded to true and false probes after reading simple true or false
sentences. Compatible sentence/probe combinations (true/true, false/false) facilitated responding compared with incompatible combinations (true/
false, false/true), indicating truth validation. Evidence for truth validation was obtained after inducing an evaluative mindset but not after inducing a
non-evaluative mindset, using additional intermixed tasks requiring true/false decisions or sentence comparisons, respectively. Event-related potentials
revealed an increased late negativity (500–1000 ms after onset of the last word of sentences) for false compared with true sentences. Paralleling
behavioral results, this electroencephalographic marker was obtained only in the evaluative mindset condition. Further, mere semantic mismatches between
subject and object of sentences led to an elevated N400 for both mindset conditions. Taken together, our findings suggest that truth validation is a
conditionally automatic process that is dependent on the current task demands and resulting mindset, whereas the processing of word meaning and
semantic relations between words proceeds in an unconditionally automatic fashion.
Keywords: truth validation; sentence verification; reading comprehension; N400; late negativity; irrelevant stimulus–response compatibility
INTRODUCTION
For experienced readers, reading is a more or less effortless process.
Research on reading comprehension has shown that the meanings of
words, as well as the semantic and syntactic relations between the
words of a sentence are processed automatically when we are reading
text (e.g. Balota et al., 1990). For example, semantic (‘A robin is a
truck.’) or syntactic mismatches (‘A robin is a quickly.’) are detected
automatically and produce elevated event-related potentials (ERPs)
(N200, N400, P600) while reading the last word of a mismatching
sentence (Kutas and Hillyard, 1980, 1984; Fischler et al., 1983, 1984;
Friederici et al., 1993; Hahne and Friederici, 1999, 2002).
The fact that processes of language comprehension and meaning
extraction proceed in an automatic fashion is well established in the
literature; it is less clear, however, whether mere reading also produces
automatic truth validation going beyond what is explicitly stated in a
sentence. The decisive question of this study is whether sentence-related factual world knowledge that is stored in long-term memory
also becomes automatically activated upon reading and understanding
the sentence, and whether this knowledge is used to evaluate the truth
status of the respective sentence.
Previous ERP studies have shown that semantic mismatches between
the subject and the object of a sentence produced elevated ERPs that
occurred irrespective of the truth value of the proposition (Fischler
et al., 1983): a strong N400 of equal magnitude was observed both
for the false sentence ‘A robin is a truck.’ and for the true sentence
‘A robin is not a truck.’, with no systematic differences in ERPs
between false and true sentences. This seems to suggest that the
semantic consistencies between all words within a sentence are checked
as part of an automatic comprehension process, but that the detection
Received 18 March 2012; Accepted 24 March 2012
Advance Access publication 29 March 2012
This research was conducted at the University of Jena. We want to express our thanks to Nils Meier for support
in programming the experiment. This study was supported by regular funding of the Department of General
Psychology (Chair: Prof. Dr. Klaus Rothermund) by the FSU Jena.
Correspondence should be addressed to Dr Daniel Wiswede, Department of Neurology, University of Lübeck,
Ratzeburger Allee 160, 23538 Lübeck, Germany, or to Prof. Dr. Klaus Rothermund, Department of General
Psychology, FSU Jena, Am Steiger 3, Haus 1, D-07743 Jena, Germany.
E-mail: [email protected]; [email protected].
of semantic matches or mismatches does not automatically lead to
validation of whether sentences are true or false.
Recently, however, this conclusion has been called into question by
a study investigating whether reading simple sentences that were
either obviously true (e.g. ‘Perfume contains scents.’) or obviously
false (e.g. ‘Soft soap is edible.’) led to an automatic activation of
their corresponding truth values (Richter et al., 2009).1 In support of
such an automatic truth validation, Richter et al. (2009) have shown
that reading true (false) sentences activated ‘correct’ (‘wrong’) responses that either facilitated or interfered with orthographic decisions
(pressing a ‘correct’ or ‘wrong’ button) regarding the last word of the
respective sentence. Apparently, both the truth values of sentences and
the corresponding responses were activated during reading, even
though a truth validation was not required by the orthographic task.
Critically, however, it is possible that performing correct/wrong orthography decisions produced an evaluative mindset that may be necessary to trigger the encoding of propositional truth values, which
would render the reported finding as support for a conditionally
(i.e. task dependent) automatic truth validation process.
Furthermore, because the material used by Richter et al. (2009) did
not contain negated sentences, truth/falsity and matches/mismatches
between sentence subject and object/predicate were confounded. It
thus remains unclear whether participants really evaluated the truth
value of the sentences or whether ‘true’ and ‘false’ response tendencies
were simply triggered by matches/mismatches between the word components of the sentences (Fischler et al., 1983).
The present study investigated processes of evaluating the truth of
simple sentences, using behavioral (irrelevant stimulus–response compatibility), as well as electrophysiological (ERP) indicators of truth
evaluation processes. Specifically, participants had to read short sentences that were either true or false. Orthogonal to their truth value,
half of the sentences contained a negation, whereas the other half did
1
Richter et al. (2009) used sentences that ascribed features to objects (‘An OBJECT is/has FEATURE’), which differs
from the sentences that were used in many previous studies that employed sentences describing category-exemplar
relations (‘An EXEMPLAR is a CATEGORY’). Still, the materials used by Richter and colleagues were comparable with
previous studies in that even wrong sentences were always combinations of objects and features that violated
participants’ world knowledge but did not contain meaningless combinations of objects and features (e.g. ‘Ideas
sleep in cars.’).
© The Author (2012). Published by Oxford University Press. For Permissions, please email: [email protected]
not. This manipulation allowed us to disentangle the effects of truth
value and semantic match/mismatch (e.g. true/semantic match:
‘Saturn is a planet’; false/semantic match: ‘Saturn is not a planet’;
true/semantic mismatch: ‘Saturn is not a continent’; false/semantic
mismatch: ‘Saturn is a continent’).2
In the main task, we presented either the word ‘true’ or the word
‘false’ as probe stimulus directly after reading a sentence. Participants
had to respond only to the probes, thereby rendering semantic consistency and truth value of the preceding sentences completely irrelevant for the task. This allowed us to investigate compatibility effects
between task-irrelevant truth values of sentences, task-irrelevant semantic consistency of sentences and the true/false responses required
by the probes. Further, this specific paradigm allows the assessment of
task-irrelevant stimulus–response compatibilities without the need
for an explicit evaluative categorization task. Thus, deviating from
the study by Richter et al. (2009), truth validation processes could
be measured in the absence of any evaluative task or mindset.
Simultaneously, ERPs were recorded during the reading of the last
word of each sentence to assess electrophysiological markers of processing the truth values of propositions.
To investigate whether processing of truth values occurs in an unconditionally automatic fashion or whether it depends on adopting an
evaluative mindset (conditional, goal-dependent automaticity; Moors
et al., 2010), different mindsets were induced by an additional task that
either required evaluating the truth value of the sentences or not.
Specifically, one group of participants had to evaluate some of the
sentences with regard to whether they were indeed true or false
(truth evaluation group), whereas another group of participants had
to indicate for some of the sentences whether they were identical with a
probe sentence or not (no evaluation control group).
METHOD
Participants
In total, 35 students from various departments of the Friedrich-Schiller-University of Jena, Germany, participated in the study. Two
were excluded due to recording errors and three due to excessive and
uncorrectable electroencephalographic (EEG) artifacts. Half
of the remaining participants were randomly allocated to the ‘truth evaluation group’ (five
male) and the other half to the ‘no evaluation control group’ (three
male). The mean age of participants was 23.3 years (range 19–37 years,
no age differences between groups). All participants had normal or
corrected to normal vision and reported no history of neurological
diseases or medication. They provided written informed consent according to the Declaration of Helsinki prior to the investigation and
were reimbursed with 12 € (17 US $) after the completion of the
experiment.
Sentence stimuli
Sets of experimental stimuli were created by starting with a pair of true
sentences without negation. Swapping the object of the sentences created two additional statements that were false (Table 1, left column).
Finally, the resulting four sentences were negated, creating two true
and two false statements with negation (Table 1, right column). Thus,
each pair of statements resulted in a total of eight sentences varying on
the factors Truth (true or false) and Negation (absent or present).
Additionally, this design also captures whether sentences contain a
match or mismatch between subject and object, reflected by the
2
Like in previous studies, we used only sentences that contained no semantic anomalies, i.e. even the wrong
sentences did not contain logically impossible combinations of sentence subjects and attributes/categories.
Importantly, sentence objects/predicates were always of the same semantic category for matching/mismatching
and true/false sentences.
Table 1 Sentence generation based on truth value and negation
Truth    Negation absent            Negation present
True     Africa is a continent.     Africa is not a planet.
         Saturn is a planet.        Saturn is not a continent.
False    Africa is a planet.        Africa is not a continent.
         Saturn is a continent.     Saturn is not a planet.
interaction of Truth and Negation (match: true + absent and false
+ present, mismatch: true + present and false + absent). Note that
by using this method of stimulus creation the occurrence of each sentence subject and object is perfectly balanced across all four conditions,
eliminating any confounds between experimental conditions and word
material. Sixteen pairs of true sentences were used for creating the
material, yielding 16 × 8 = 128 sentences in total (see Appendix A for a
full list of all sentence stimuli). During the course of the experiment,
each sentence was presented four times, resulting in 512 trials.
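The stimulus-creation scheme described above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper name `build_sentence_set` and the English wording are assumptions (the original material was German).

```python
from itertools import product

def build_sentence_set(pair):
    """Derive 8 sentences from two true (subject, object) statements.

    Hypothetical helper for illustration; English wording is an assumption.
    """
    (s1, o1), (s2, o2) = pair
    true_object = {s1: o1, s2: o2}
    sentences = []
    # Cross both subjects with both objects, each with and without negation.
    for subject, obj, negated in product((s1, s2), (o1, o2), (False, True)):
        match = true_object[subject] == obj   # subject and object fit semantically
        truth = match != negated              # negation flips the truth value
        text = f"{subject} is {'not ' if negated else ''}a {obj}."
        sentences.append(
            {"text": text, "true": truth, "negated": negated, "match": match}
        )
    return sentences

stimuli = build_sentence_set((("Africa", "continent"), ("Saturn", "planet")))
for s in stimuli:
    print(s)
```

Each subject and each object appears equally often in all four Truth by Negation cells, which is the balancing property the text relies on. With 16 such pairs the scheme produces 128 sentences.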
Trial structure
Each trial was divided into a reading phase and a response phase
(Figure 1). In the reading phase, participants were instructed to silently
read the sentences, which were presented word-by-word on a computer monitor. To account for differences in reading time caused by
varying word lengths, individual presentation time for each word of a
sentence was calculated from its number of letters: 150 ms + 25 ms for
every letter (Nieuwland and Kuperberg, 2008). The German language
allows sentence negation without adding another word simply by prefixing the letter ‘k’ to a noun’s indefinite article. For example, the
German sentence ‘Saturn ist ein Planet’ (English. ‘Saturn is a planet’)
can be negated by replacing ‘ein’ with ‘kein’ (‘Saturn ist kein Planet’;
English. ‘Saturn is not a planet’). To keep processing time for negated
and non-negated sentences constant, the word ‘ein’ was presented for
the same duration as the word ‘kein’ (250 ms). Since ERPs were averaged to last word onset, the last word of the assertion was presented for
a fixed time of 300 ms, followed by a blank screen of 1200 ms. All
words were presented in black font (Courier New, 24 pt) on a white
background. The 512 sentences were presented in an individually randomized order. Participants were allowed to take a short break after
each block of 75 trials.
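The timing rule of the reading phase can be made concrete with a short sketch. The durations are taken from the text; the function name and the decision to ignore the final full stop when counting letters are assumptions made for illustration.

```python
BASE_MS = 150        # fixed component per word (from the text)
PER_LETTER_MS = 25   # added for every letter (from the text)
LAST_WORD_MS = 300   # fixed duration of the sentence-final word
BLANK_MS = 1200      # blank screen following the last word

def word_durations(sentence):
    """Return (word, duration in ms) pairs for word-by-word presentation.

    Hypothetical helper; punctuation is assumed not to count as a letter.
    """
    words = sentence.rstrip(".").split()
    durations = []
    for i, word in enumerate(words):
        if i == len(words) - 1:
            ms = LAST_WORD_MS                            # ERPs are locked to this word
        elif word.lower() == "ein":
            ms = BASE_MS + PER_LETTER_MS * len("kein")   # matched to 'kein' (250 ms)
        else:
            ms = BASE_MS + PER_LETTER_MS * len(word)
        durations.append((word, ms))
    return durations

print(word_durations("Saturn ist kein Planet."))
print(word_durations("Saturn ist ein Planet."))
```

Under these assumptions the negated and the non-negated version of a sentence take exactly the same time to present.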
Task
Following the reading phase, a probe stimulus signaled which of the
two tasks participants had to perform (Figure 1B). In half of the trials,
participants of both mindset groups saw either the probe word ‘falsch’
(English. ‘false’) or the probe word ‘richtig’ (English. ‘true’), requiring
them to perform a probe word identification task by pressing the
corresponding button (‘false’ = left, ‘true’ = right). Probe words were
chosen randomly and thus independently of the truth value or negation status of the preceding sentences, effectively rendering the truth
value of the sentences irrelevant for the task. Nevertheless, sentence
truth values matched the required responses in half of the probe word
identification trials (compatible condition: true sentence–‘true’, false
sentence–‘false’) and were opposite to the required responses for the
other half of the trials (incompatible condition: true sentence–‘false’,
false sentence–‘true’).
For the other half of the trials, the presented probes and the
to-be-performed task depended on the actual mindset group participants had been assigned to. In the ‘truth evaluation group’, participants saw the phrase ‘richtig oder falsch?’ (English. ‘true or false?’) as
Fig. 1 Trial structure of the experiment. (A) Word presentation during the reading phase. (B) Tasks
for both groups in the response phase (TP, ‘true’-probe; FP, ‘false’-probe; TS, true sentence; FS, false
sentence; SS, same sentence; DS, different sentence).
probe and had to conduct a ‘truth validation task’, pressing the right
(left) button if the preceding sentence was true (false). In the ‘no
evaluation control group’, participants saw either the same (50%) or
another sentence from the same set of sentences (Table 1) as probe and
had to answer the question ‘Is this the sentence that you’ve just seen?’
by pressing the ‘true’ (‘false’) button if the probe sentence matched
(did not match) the preceding sentence. Non-matching probe sentences differed from the presented sentence with regard to the first
or last word (e.g. for ‘Seawater is salty.’ non-matching probes were
either ‘Ice-cream is salty.’ or ‘Seawater is sweet.’), ensuring that participants had to attend to all parts of the sentences.3
For both groups, trials of the probe word identification task and of
the additional mindset induction task (truth evaluation or sentence
comparison) were randomly intermixed, making it impossible to predict during the reading phase which task was required by the probe.
This ensured similar processing of all sentences during reading in both
tasks, allowing us to make use of all trials for ERP analyses of sentence
processing.
In all trials, the presentation of the probes was terminated by the
response. Incorrect responses were followed by the word ‘Fehler’
(English. ‘Error’) shown in red for 1500 ms. Reactions occurring
prior to probe onsets were followed by a message alerting participants
to wait for the probes before responding (2500 ms). Participants completed 16 practice trials prior to the experiment and repeated practice
until reaching 85% accuracy and an average reaction time <1500 ms.
On average, participants took two rounds of practice to meet criteria.
Total time-on-task was 60 min.
ERP recording and preprocessing
After application of the electrodes, participants were seated in an electrically shielded room with dimmed light. To reduce movement artifacts, the head was positioned on a chin rest 90 cm from a 17 inch
monitor. Participants rested their hands on a response device similar to
a computer mouse, responding with their left and right index fingers.
Continuous EEG activity was recorded using an ActiveTwo head cap
and the ActiveTwo BioSemi system (BioSemi, Amsterdam, The
Netherlands). Signals were recorded from 64 positions including all
3
Note that because stimulus format differed between sentence presentation (rapid serial visual presentation) and
sentence comparison (full sentence shown), the sentence comparison task could not be performed on the basis of
simple visual matching.
standard locations of the 10/10 system using active electrodes in an
elastic cap. Eye movements were recorded with electrodes affixed to the
right and left external canthi (hEOG, horizontal electrooculogram)
and at the supra- and infraorbital ridges of the left eye (vEOG, vertical
electrooculogram). Two additional electrodes were placed on the left
and right mastoid. As is usual for BioSemi systems, two additional electrodes
(CMS, common mode sense; DRL, driven right leg) were used as reference and ground electrodes during recording. Bioelectric signals were
amplified and digitized at a sampling rate of 256 Hz and stored using ActiView
software (BioSemi).
Prior to ERP analysis, EEG data were re-referenced off-line to the
linked mastoids, and a high-pass filter (0.1 Hz, to remove low-frequency
drifts) and a notch filter (peak at the powerline frequency of 50 Hz)
were applied. Eye artifacts were corrected using blind component separation (SOBI; Joyce et al., 2004), which has been shown to be superior to other
artifact correction procedures (Kierkels et al., 2006). Rejection of
non-EOG artifacts was accomplished using individualized peak-to-peak amplitude criteria based on visual and semi-automatic inspection
implemented in BESA software (www.BESA.de). To remove high frequency noise, ERPs were 30 Hz low-pass filtered prior to statistical
analysis and graphical display. Grand-average ERPs were generated
separately for both groups as were the mean ERPs across participants
in each condition.
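Two of the preprocessing steps named above, re-referencing to the linked mastoids and peak-to-peak artifact rejection, can be illustrated with a simplified sketch. It is not the BESA or EEGLAB procedure used in the study. The function names and the 100 µV threshold are assumptions, since the text reports individualized criteria without giving values.

```python
def rereference_to_linked_mastoids(channel, left_mastoid, right_mastoid):
    """Subtract the mean of the two mastoid signals, sample by sample."""
    return [
        x - (left + right) / 2.0
        for x, left, right in zip(channel, left_mastoid, right_mastoid)
    ]

def peak_to_peak(epoch):
    """Difference between the largest and the smallest sample of an epoch."""
    return max(epoch) - min(epoch)

def reject_epochs(epochs, threshold_uv):
    """Keep only epochs whose peak-to-peak amplitude does not exceed the threshold."""
    return [epoch for epoch in epochs if peak_to_peak(epoch) <= threshold_uv]

# Illustrative values in microvolts (made up for the example).
cz = rereference_to_linked_mastoids([10.0, 12.0, 14.0], [2.0, 2.0, 2.0], [4.0, 4.0, 4.0])
clean = reject_epochs([[0.0, 5.0, -5.0], [0.0, 80.0, -60.0]], threshold_uv=100.0)
print(cz, clean)
```

In practice the threshold would be set per participant, as described in the text, and filtering and ocular correction would precede this step.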
Data analysis
Behavioral data
To examine evidence for truth validation processes in the behavioral
data, reaction times (RTs) of correct responses in the probe word
identification task were analyzed. Within-subject factors were truth
value of the assertion in the reading phase (true vs false), negation
status of the assertion (absent vs present) and probe word on the
screen (‘true’ vs ‘false’); between-subjects factor was mindset, either
evaluative or non-evaluative (‘truth evaluation’ vs ‘no evaluation control’). Truth evaluation processes are reflected by a Truth × Probe
interaction, indicating compatibility effects of the Simon type between
a task-irrelevant stimulus feature (truth value) and a response feature
(‘true’ key).
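The compatibility logic behind this interaction can be sketched as a simple difference score: mean RT on incompatible trials minus mean RT on compatible trials, using correct responses only. The trial records and field names below are hypothetical. The ANOVA interaction term tests the same contrast, so this is an illustration of the idea rather than the reported analysis.

```python
from statistics import mean

def is_compatible(sentence_true, probe):
    """A trial is compatible when the probe word names the sentence's truth value."""
    return (probe == "true") == sentence_true

def compatibility_effect(trials):
    """Mean RT (ms) of incompatible minus compatible trials, correct responses only."""
    compatible = [
        t["rt"] for t in trials
        if t["correct"] and is_compatible(t["sentence_true"], t["probe"])
    ]
    incompatible = [
        t["rt"] for t in trials
        if t["correct"] and not is_compatible(t["sentence_true"], t["probe"])
    ]
    return mean(incompatible) - mean(compatible)

# Made-up trials for illustration only.
trials = [
    {"sentence_true": True,  "probe": "true",  "rt": 450, "correct": True},
    {"sentence_true": False, "probe": "false", "rt": 470, "correct": True},
    {"sentence_true": True,  "probe": "false", "rt": 540, "correct": True},
    {"sentence_true": False, "probe": "true",  "rt": 560, "correct": True},
    {"sentence_true": True,  "probe": "false", "rt": 900, "correct": False},  # excluded
]
print(compatibility_effect(trials))
```

A positive value indicates that the task-irrelevant truth value of the sentence facilitated matching responses and interfered with mismatching ones.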
ERP data
ERPs are reported starting at the onset of the last word of each sentence. Because trials of both tasks were randomly intermixed, participants could not predict the to-be-executed task before the onset of the
probes. Consequently, all sentences followed by correct responses in
the subsequent task were included in the ERP analyses, regardless of
the actual task. Analyses of variance comprised the factors truth, negation and mindset.
For N400 analyses, mean amplitudes on midline electrodes Fz, Cz
and Pz were calculated in a time window from 300 to 400 ms after
onset of the last word. In addition, visual inspection was conducted to
identify differences between true and false sentences that were
independent of sentence negation. Since there was no a priori hypothesis about possible effects and scalp distributions beyond the
N400 time window, differences between true and false sentences were
analyzed on a selection of 12 frontal, central and parietal electrodes
separately for both negation conditions and both groups; time
windows were based on visual inspection.
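A mean-amplitude measure of the kind used for the N400 window can be sketched as follows. The 256 Hz sampling rate and the 300–400 ms window come from the text; the function name, the epoch start of −100 ms and the synthetic data are assumptions for illustration.

```python
def mean_amplitude(epoch, srate_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Average the samples whose latency lies inside [win_start_ms, win_end_ms]."""
    selected = []
    for i, value in enumerate(epoch):
        latency_ms = epoch_start_ms + i * 1000.0 / srate_hz
        if win_start_ms <= latency_ms <= win_end_ms:
            selected.append(value)
    return sum(selected) / len(selected)

SRATE_HZ = 256            # sampling rate reported in the text
EPOCH_START_MS = -100.0   # hypothetical pre-stimulus interval, not stated in the text

# Synthetic single-channel epoch (arbitrary values, for illustration only).
demo_epoch = [float(i % 7) for i in range(300)]
n400 = mean_amplitude(demo_epoch, SRATE_HZ, EPOCH_START_MS, 300.0, 400.0)
print(n400)
```

The same function applies to the later time windows by changing the window boundaries.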
ERP data were analyzed using EEGLAB (Delorme and Makeig, 2004)
and BESA software, and visualized using Bplot (http://besa.de/updates/
tools/bplot.php). Statistical analysis of behavioral and ERP data was
conducted using SPSS 15. Degrees of freedom are provided uncorrected; whenever necessary, P-values are Greenhouse–Geisser corrected
to account for possible violations of the sphericity assumptions.
Fig. 2 Average RTs in the probe word identification task depending on truth value, response type,
negation and mindset condition.
RESULTS
Behavioral data
Average RTs in the probe word identification task for all combinations
of the four factors (truth, negation, probe, mindset) are shown in
Figure 2. An ANOVA revealed a significant Truth × Probe × Mindset
interaction, F(1,28) = 68.5, P < 0.01, partial η² = 0.71. To follow up on
this finding, separate ANOVAs were conducted for each mindset
group. As expected, a significant Truth × Probe interaction was obtained in the truth evaluation group, F(1,14) = 78.9, P < 0.01, partial
η² = 0.85, indicating compatibility effects between task-irrelevant sentence truth and probe word: participants in this group identified the
probe words ‘true’ and ‘false’ faster when they matched the truth content of the presented sentence (M = 459 ms) than when they did not
(M = 550 ms). No compatibility effects emerged in the no evaluation
control group, F < 1. Thus, matches or mismatches between the truth
content of the sentence and the response prompt did not facilitate or
interfere with responding in the probe word identification task in this
group, indicating that the truth value of the sentences was not evaluated in an unconditionally automatic fashion.4
Furthermore, the Negation × Mindset interaction was also significant, F(1,28) = 11.7, P < 0.01, partial η² = 0.30. ANOVAs per mindset
group indicated that participants in the evaluation group responded
slightly more slowly to probes following negated sentences, F(1,14) = 18.7,
P < 0.01, partial η² = 0.57; this was not the case for the no evaluation
control group, F < 1.
4
To check whether the Truth × Probe × Mindset interaction was stable across the experiment, we compared
reaction times of the first presentation of each sentence with the last (fourth) presentation of each sentence. The
Truth × Probe × Mindset × Presentation (first or last) interaction was far from being significant (P > 0.64).
Thus, truth evaluation processes in the evaluative mindset condition occurred already at the first presentation of a
sentence in the experiment and did not depend on recognizing or recalling the truth value of sentences from
previous occurrences.
ERP data
Visual inspection
Largest effects were found on midline electrodes (Figure 3). After the
N1-P2 complex, there was a negative deflection starting around 220 ms
after the onset of the last word with no latency differences between
mindset groups, sentence negation or sentence truth. The amplitudes
to the last word started to differentiate around 300 ms. Most obviously,
false sentence endings elicited more negative amplitudes than true
endings in affirmative sentences, with the reversed pattern in negated
sentences. Thus, there was an increased N400, best seen over
centroparietal electrodes, whenever the last word (i.e. the sentence
object) semantically mismatched the subject of the sentence.
Although both groups showed this pattern, it was more pronounced
in the truth evaluation group. In addition, negated sentences (regardless of being true or false) elicited a slightly increased N400 relative to
affirmative sentences. Finally, following the N400 time window, there
was a more negative waveform for false sentences in the truth evaluation group only. This negative trend was seen for affirmative and
negated sentences alike, but it occurred earlier in time and was stronger
in amplitude for affirmative sentences. This negativity will be referred
to as ‘late negativity’.
Statistical analysis of the N400 time window
Visual inspection was confirmed by statistical analyses using mean
amplitudes on midline electrodes as dependent variable. For affirmative sentences, N400 amplitudes were larger for false than for true ones,
whereas for negated sentences, N400 amplitudes were smaller for false
than for true ones (Truth × Negation), F(1,28) = 64.4, P < 0.001, partial η² = 0.70. Thus, N400 amplitudes were elevated for sentences containing a semantic mismatch between subject and object, irrespective
of their corresponding truth value. This effect was further qualified by
a Truth × Negation × Mindset interaction, F(1,28) = 16.0, P < 0.01,
partial η² = 0.36, indicating that the influence of semantic mismatch
on N400 amplitudes was stronger in the truth evaluation group,
F(1,14) = 76.0, P < 0.001, partial η² = 0.84, compared with the no evaluation control group, F(1,14) = 7.7, P < 0.01, partial η² = 0.36. Separate
ANOVAs for each group on the same midline electrodes (Fz, Cz, Pz)
showed that the difference between true and false sentences was significant for sentences with and without negation (evaluation group/
sentences without negation: F(1,14) = 50.98, P < 0.01; evaluation
group/negated sentences: F(1,14) = 31.5, P < 0.01; no evaluation control group/sentences without negation: F(1,14) = 10.92, P < 0.03; no
evaluation control group/negated sentences: F(1,14) = 7.97, P < 0.097;
Electrode × Truth interaction for negated sentences in no evaluation
control group: F(2,28) = 3.22, P = 0.062; indicating that the truth
effect was significant on electrode Pz, P < 0.04, marginally significant
on Cz, P < 0.06, and not significant on Fz). Furthermore, negated sentences overall elicited somewhat stronger N400 amplitudes than affirmative sentences, F(1,28) = 5.4, P < 0.05, partial η² = 0.16. There
was no general N400 difference between both groups (main effect
group: F < 1).
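The partial eta squared values reported with these F ratios can be cross-checked with the standard identity partial η² = (F · df_effect) / (F · df_effect + df_error). The sketch below applies it to several values from this section; the function name is an assumption, and the check only confirms internal consistency of the reported statistics.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (F, df_effect, df_error) triples as reported in the N400 analysis.
reported = [
    (64.4, 1, 28),  # Truth x Negation
    (16.0, 1, 28),  # Truth x Negation x Mindset
    (76.0, 1, 14),  # semantic mismatch effect, truth evaluation group
    (5.4, 1, 28),   # main effect of negation
]
for f_value, df1, df2 in reported:
    print(f_value, round(partial_eta_squared(f_value, df1, df2), 2))
```

The recomputed values agree with the effect sizes given in the text to two decimal places.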
Statistical analyses of the late negativity
Statistical analyses confirmed visual inspection by a significant Time
Window (mean amplitude 500–800 ms vs 800–1000 ms) × Truth × Negation × Mindset interaction [F(1,28) = 5.25, P < 0.03]. Statistics
were conducted on a selection of frontal, central and parietal electrodes
(Fp1, Fpz, Fp2, F1, Fz, F2, C1, Cz, C2, P1, Pz, P2), grouped by the
factors ANTERIORITY (4 levels; frontopolar, frontal, central, parietal)
and LATERALITY (3 levels; left, middle, right). To further elucidate
the nature of this complex interaction, true and false sentences were
compared separately for both time windows, both negation levels and
for both groups on the same set of electrodes. More negative ERPs for
false sentences were found in the truth evaluation group only: in the
early time window for sentences without negation, F(1,14) = 8.36,
P < 0.012, and in the late time window for negated sentences,
F(1,14) = 11.52, P < 0.01.
Fig. 3 ERPs on midline electrodes locked to the onset of the last word separated by truth value, negation and mindset condition. Grey-shaded areas indicate time windows for the late negativity.
DISCUSSION
On a behavioral level, our findings provide further evidence of truth
evaluation processes as indicated by compatibility effects between the
truth value of sentences and the required response type in the probe
word identification task (facilitation/interference for matching/mismatching combinations of truth value and response). This compatibility effect, however, occurred only for the group with an evaluative
mindset (truth evaluation as an additional task), not for the no evaluation control group (sentence comparison as an additional task).
Thus, our findings further support theories postulating that processing
of truth values is goal dependent (e.g., Dodd and Bradshaw, 1980;
Green and Brock, 2000; Schul et al., 2004). Apparently, evaluating
the truth value of a proposition does not automatically occur whenever
a sentence is read and understood, but has to be regarded as a conditionally automatic process, the occurrence of which depends on the
requirements of the current task and resulting mindset.
Our results can be seen as a conceptual replication of Richter et al.’s
(2009) findings, given that participants were put in an evaluative
mindset by intermixing the probe word identification task with an
additional truth evaluation task. By manipulating negations orthogonally to the truth/falsity of sentences, an alternative interpretation of the
truth effect in terms of semantic matches/mismatches was effectively
ruled out in our study. However, we found no evidence for fully automatic or unconditional validation of sentence truth values in the no
evaluation control group, although participants in that group also
needed to process the sentences in order to successfully perform on
the intermixed sentence comparison task. Apparently, an evaluative
task that encourages participants to classify sentences (or parts of sentences; Richter et al., 2009) as true or false seems to be a prerequisite
for the emergence of truth validation effects, suggesting an interpretation of this effect in terms of a goal-dependent, conditionally automatic process. Since this conclusion rests on the interpretation of a
null finding in the no evaluation control group, we conducted an a
posteriori power analysis to evaluate the reliability of this result. The
power for detecting compatibility effects of similar magnitude as in the
truth evaluation group (d = 2.29) also in the no evaluation control
group was perfect, 1 − β = 1.00. Similarly, the power to detect an
effect of d = 0.80 (a large effect according to Cohen) was still fairly
high, 1 − β > 0.90 (power analyses were conducted with G*Power 3;
Erdfelder et al., 2007). Given these results, it seems unlikely that the
null effect in the no evaluation control group is due to a lack of power.
We cannot rule out, however, that a medium or small truth effect in
the no evaluation control group went undetected in our study due to
the somewhat small sample size.
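The effect size entered into this power analysis can be related to the behavioral result. The text does not state how d was computed; the sketch below assumes it is the within-subject d_z = t / √n, with t = √F for the Truth × Probe interaction in the truth evaluation group and n = 15 participants per group (inferred from the 14 error degrees of freedom). Under that assumption the reported value is reproduced.

```python
import math

def cohens_dz_from_f(f_value, n):
    """d_z = t / sqrt(n), with t = sqrt(F) for a within-subject effect with 1 df."""
    return math.sqrt(f_value) / math.sqrt(n)

# F(1,14) = 78.9 from the truth evaluation group; n = 15 is an inference.
d = cohens_dz_from_f(78.9, 15)
print(round(d, 2))
```

The sketch covers only the effect size; the power values themselves come from G*Power and are not recomputed here.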
In light of our results, the findings of Richter et al. (2009) should be
reinterpreted as indicating conditionally automatic rather than fully
automatic processing of the truth value of sentences during reading.
Having participants evaluate the correctness of the spelling of words
might have induced the evaluative mindset necessary for the occurrence of truth validation processes with regard to the sentences in their
study. Here, we could assess the same effects in the absence of any
evaluative task by looking at irrelevant S–R compatibilities of sentence
truth values. However, omitting an evaluative mindset apparently also
eliminated truth validation processes.
These conclusions were further corroborated by the ERP findings of
our study. To our knowledge, this study is the first to provide evidence
for a specific ERP marker of truth evaluation for simple propositions.
False sentences were accompanied by a late negativity in the time
window of 500–800 ms after sentence completion for affirmative sentences, and in the time range between 800 and 1000 ms for negated
sentences. This delay for negated sentences fits nicely with previous
studies showing that negated sentences take longer to process than
affirmative ones (Smith et al., 1974; Fischler et al., 1983, 1985; Rapp,
2008; Richter et al., 2009). Although comparable in many respects, the
study by Fischler et al. (1983) did not report ERP components separating true from false assertions, unlike the present study. The authors
suggested that this might have been due to the task that was used in
their study, which required participants to respond immediately after
the decision, such that EEG components related to the response and to
response preparation might have concealed truth/falsity-related ERP
differences. This problem was avoided in the present study by delaying
responses after reading a sentence for 1500 ms (Figure 1).
Importantly, an elevated late negativity for false sentences occurred
only if an evaluative mindset was induced by intermixing the word
decision task with an additional truth evaluation task. Replicating the
pattern of findings for the behavioral data, there was no evidence for
an increased late negativity following false sentences in the no evaluation control group. This indicates that truth evaluation processes do
not occur spontaneously and in the absence of an explicit evaluative
mindset, but rather reflect a conditionally automatic, goal-dependent
process.
An alternative explanation for the group differences in truth validation processes obtained in the behavioral and EEG data assumes that
the two tasks differed in the degree to which semantic processing was
required. According to this account, truth validation processes occur
whenever sentences are processed semantically, even in the absence of
an evaluative mindset.5 We cannot rule out this alternative explanation
of our findings. One result of our study, however, speaks against the
assumption that the two groups differed substantially with regard to
the amount of semantic processing: previous studies that investigated
EEG correlates of semantic processing found that semantic compared
with non-semantic processing conditions increased N400 amplitudes
(e.g. Holcomb, 1988; Chwilla et al., 1995). In our study, however, there
was no difference in the overall magnitude of the N400 between the
two groups, which speaks against the hypothesis that the two tasks
induced different levels of semantic processing.
Furthermore, our results replicate and extend previous findings regarding the processing of mismatches on a word level (Fischler et al.,
1983, 1984, 1985; Hagoort et al., 2004; Nieuwland and Kuperberg,
2008). Similar to earlier findings, we found evidence for an automatic
detection of mismatches between the words denoting the subject and
the object or predicate of a sentence, indicated by an elevated N400 for
sentences containing a mismatch. This mismatch effect was observed
for true as well as false sentences, and was present in both mindset
groups, indicating that the analysis of word meaning and of semantic
relations between words within a sentence occurs automatically.
It should be noted that our study did not reveal any effects of
sentence truth on the N400, neither in the evaluation group nor in
the no evaluation control group. Although this result fits nicely with what
was found in previous studies (e.g. Fischler et al., 1983), there are two
studies that did report elevated N400 amplitudes for false sentences. A
study by Hagoort et al. (2004) found elevated N400 amplitudes during
reading of words that rendered sentences either false (‘Dutch trains are
white.’) or meaningless (‘Dutch trains are sour.’), compared with reading of words that resulted in a true sentence (‘Dutch trains are
yellow.’). Since this study did not vary sentence truth orthogonally to
negations, however, the resulting effect can also be explained in terms
of simple mismatches between the subject (‘Dutch trains’) and the
predicate (‘white’) of a sentence. Based on previously reported findings
and our own results, the same elevated N400 would probably also be
obtained if a negation were added to a false sentence that would render
the false sentence true (e.g. ‘Dutch trains are not white.’). The findings
by Hagoort et al. (2004) thus do not provide unambiguous evidence
for effects of truth validation processes on N400 amplitudes.
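The confound described above can be made explicit with a small sketch. It rests on a simplifying assumption: the truth value of a simple sentence follows from whether subject and predicate match and whether a negation is present. The function and variable names are hypothetical and serve only to illustrate the design logic.

```python
from itertools import product


def sentence_is_true(words_match: bool, negated: bool) -> bool:
    """Truth value under the simplifying assumption stated above:
    a matching predicate gives a true affirmative sentence, and a
    negation flips the truth value."""
    return words_match != negated


def truth_confounded_with_match(cells) -> bool:
    """True if truth and word-level match coincide in every design cell."""
    return all(sentence_is_true(match, negated) == match for match, negated in cells)


# Affirmative-only design (no negation), e.g. 'Dutch trains are yellow/white.'
affirmative_only = [(match, False) for match in (True, False)]

# Fully crossed design: match/mismatch x negation absent/present.
fully_crossed = list(product((True, False), repeat=2))

print(truth_confounded_with_match(affirmative_only))  # True: truth cannot be separated from match
print(truth_confounded_with_match(fully_crossed))     # False: truth and match are dissociated
```

Only in the fully crossed design can an N400 effect be attributed to sentence truth rather than to a word-level mismatch.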
In a more recent study, however, Nieuwland and Kuperberg (2008)
orthogonally varied (i) the truth/falsity of sentences and (ii) whether a
sentence contained a negation or not. In a standard condition with
simple sentences that were clearly true or false (e.g. ‘Bulletproof vests
are [not] safe/dangerous.’), they found only a standard mismatch effect
on N400 amplitudes. Specifically, regardless of whether a sentence
contained a negation or not, N400 amplitudes were elevated for sentences that contained mismatching pairs of words (i.e. for ‘Bulletproof vests are dangerous’ but also for ‘Bulletproof vests are not dangerous.’). No effect of the truth value of sentences was obtained for the
N400 in this condition, which is perfectly comparable with our results.
5 Differences in semantic processing also provide an alternative explanation of why semantic mismatch effects in the
N400 were stronger in the truth evaluation group.
In yet another condition, Nieuwland and Kuperberg (2008) presented more complex sentences that were introduced by phrases indicating some kind of exceptional or non-default circumstances
(e.g. ‘With proper equipment, scuba diving is [not] dangerous/
safe.’). For these sentences, an increased N400 was obtained for false
sentences regardless of whether they contained a negation or not.
Simultaneously, the mismatch effect was no longer obtained in this
condition. This pattern of findings is opposite to what was previously
reported in the literature (presence of a falsity effect, absence of a
mismatch effect in the N400). Apparently, adding a qualifying statement at the beginning of the sentence (‘with proper equipment’) provides a signal to the reader that a situation is described in which
standard matches/mismatches do not hold but are reversed (with
proper equipment, scuba diving is no longer dangerous but should
become safe instead). We thus presume that warning the reader at
the beginning of a sentence that matches/mismatches between words
do not apply due to exceptional circumstances results in more flexible
processing, taking the sentence context into account already during
word processing: match/mismatch relations between words are
recoded and combined with negation information, leading to an influence of truth/falsity on N400 responses.
Thus, although the findings by Nieuwland and Kuperberg (2008)
suggest that the N400 can indeed reflect truth validation processes, this
effect seems to be restricted to the reading of complex sentences in
which a deviation from normality is introduced at the beginning of the
sentence. For simple sentences stating straightforward subject/object
relations, negations apparently have no influence on the N400,
which typically reflects only mismatches on a word level, regardless
of the truth of the sentence. This intriguing finding that flexible processing of sentences may introduce effects of more complex processes
into early ERP components marks an interesting avenue for future
research.
In sum, our study provided behavioral and electrophysiological evidence for truth validation processes. Going beyond previous findings,
we identified a specific EEG marker reflecting the truth vs falsity of
propositions that was apparent in an elevated late negativity for false
propositions. This truth evaluation process, however, critically depended on the current task demands and did not occur in an unconditionally automatic fashion. On the other hand, our study clearly
suggests that the obtained truth validation effects in our evaluative
mindset condition were in fact conditionally automatic, as sentence
truth affected processing and responding despite being completely irrelevant and distracting for the task at hand (i.e. it affected classification
responses referring to the target words ‘true’ and ‘false’ that were unrelated to the truth values of the previously presented sentences).
Clearly, further research is needed to identify the exact conditions
that are required for the emergence of truth validation processes
during reading. To fully explore the automaticity conditions under
which truth evaluation takes place, different indicators of automaticity
have to be varied systematically (e.g. resource dependence, levels of
awareness of differences in truth values, the goal to suppress processing
of truth values). Using simple probes to assess compatibility effects
between the truth/falsity of sentences and responses provides an elegant and flexible way to assess behavioral and electrophysiological indicators of truth processing without having to induce specific
evaluative behavioral tasks.
Conflict of Interest
None declared.
REFERENCES
Balota, D.A., Flores d’Arcais, G.B., Rayner, K., editors (1990). Comprehension Processes in
Reading. Hillsdale, NJ: Erlbaum.
Chwilla, D.J., Brown, C.M., Hagoort, P. (1995). The N400 as a function of the level of
processing. Psychophysiology, 32, 274–85.
Delorme, A., Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial
EEG dynamics including independent component analysis. Journal of Neuroscience
Methods, 134(1), 9–21.
Dodd, D.H., Bradshaw, J.M. (1980). Leading questions and memory - pragmatic constraints. Journal of Verbal Learning and Verbal Behavior, 19(6), 695–704.
Erdfelder, E., Faul, F., Lang, A.G., Buchner, A. (2007). G*Power 3: A flexible statistical
power analysis program for the social, behavioral, and biomedical sciences. Behavior
Research Methods, 39(2), 175–91.
Fischler, I., Bloom, P.A., Childers, D.G., Arroyo, A.A., Perry, N.W. (1984). Brain potentials
during sentence verification - late negativity and long-term memory strength.
Neuropsychologia, 22(5), 559–68.
Fischler, I., Bloom, P.A., Childers, D.G., Roucos, S.E., Perry, N.W. (1983). Brain potentials
related to stages of sentence verification. Psychophysiology, 20(4), 400–9.
Fischler, I., Childers, D.G., Achariyapaopan, T., Perry, N.W. (1985). Brain potentials during
sentence verification - automatic aspects of comprehension. Biological Psychology, 21(2),
83–105.
Friederici, A.D., Pfeifer, E., Hahne, A. (1993). Event-related brain potentials during natural
speech processing: Effects of semantic, morphological and syntactic violations. Cognitive
Brain Research, 1(3), 183–92.
Green, M.C., Brock, T.C. (2000). The role of transportation in the persuasiveness of public
narratives. Journal of Personality and Social Psychology, 79(5), 701–21.
Hagoort, P., Hald, L., Bastiaansen, M., Petersson, K.M. (2004). Integration of word
meaning and world knowledge in language comprehension. Science, 304, 438–41.
Hahne, A., Friederici, A.D. (1999). Electrophysiological evidence for two steps in syntactic
analysis: Early automatic and late controlled processes. Journal of Cognitive Neuroscience,
11(2), 194–205.
Hahne, A., Friederici, A.D. (2002). Differential task effects on semantic and syntactic
processes as revealed by ERPs. Cognitive Brain Research, 13(3), 339–56.
Holcomb, P.J. (1988). Automatic and attentional processing: an event-related brain potential analysis of semantic priming. Brain and Language, 35(1), 66–85.
Joyce, C.A., Gorodnitsky, I.F., Kutas, M. (2004). Automatic removal of eye movement and
blink artifacts from EEG data using blind component separation. Psychophysiology,
41(2), 313–25.
Kierkels, J.J., van Boxtel, G.J., Vogten, L.L. (2006). A model-based objective evaluation
of eye movement correction in EEG recordings. IEEE Transactions on Biomedical
Engineering, 53(2), 246–53.
Kutas, M., Hillyard, S.A. (1980). Reading senseless sentences: brain potentials reflect
semantic incongruity. Science, 207(4427), 203–5.
Kutas, M., Hillyard, S.A. (1984). Brain potentials during reading reflect word expectancy
and semantic association. Nature, 307(5947), 161–3.
Moors, A., Spruyt, A., De Houwer, J. (2010). In search of a measure that qualifies as
implicit: Recommendations based on a decompositional view of automaticity.
In: Gawronski, B., Payne, B.K., editors. Handbook of Implicit Social Cognition:
Measurement, Theory, and Applications. New York: Guilford, pp. 19–37.
Nieuwland, M.S., Kuperberg, G.R. (2008). When the truth is not too hard to handle:
An event-related potential study on the pragmatics of negation. Psychological Science,
19(12), 1213–18.
Rapp, D.N. (2008). How do readers handle incorrect information during reading? Memory
& Cognition, 36(3), 688–701.
Richter, T., Schroeder, S., Wöhrmann, B. (2009). You don’t have to believe everything you
read: Background knowledge permits fast and efficient validation of information.
Journal of Personality and Social Psychology, 96(3), 538–58.
Schul, Y., Mayo, R., Burnstein, E. (2004). Encoding under trust and distrust: The
spontaneous activation of incongruent cognitions. Journal of Personality and Social
Psychology, 86(5), 668–79.
Smith, E.E., Shoben, E.J., Rips, L.J. (1974). Structure and process in semantic memory - a
featural model for semantic decisions. Psychological Review, 81(3), 214–41.
APPENDIX A
Table A1 List of sentences that were presented in the experiment; original sentences
were presented in German.

No.  Subject       Truth  Negation absent     Negation present
1    Africa        true   is a continent      is not a PLANET
1    Africa        false  is a PLANET         is not a continent
1    Saturn        true   is a planet         is not a CONTINENT
1    Saturn        false  is a CONTINENT      is not a planet
2    Alcohol       true   makes you drunk     makes you not HEALTHY
2    Alcohol       false  makes you HEALTHY   makes you not drunk
2    Medicine      true   makes you healthy   makes you not DRUNK
2    Medicine      false  makes you DRUNK     makes you not healthy
3    Aspirin       true   is a medicine       is not a CURRENCY
3    Aspirin       false  is a CURRENCY       is not a medicine
3    Euro          true   is a currency       is not a MEDICINE
3    Euro          false  is a MEDICINE       is not a currency
4    Beer          true   is a drink          is not a CANDY
4    Beer          false  is a CANDY          is not a drink
4    Haribo        true   is a candy          is not a DRINK
4    Haribo        false  is a DRINK          is not a candy
5    Berlin        true   is in Germany       is not in RUSSIA
5    Berlin        false  is in RUSSIA        is not in Germany
5    Moscow        true   is in Russia        is not in GERMANY
5    Moscow        false  is in GERMANY       is not in Russia
6    Blood         true   is red              is not WHITE
6    Blood         false  is WHITE            is not red
6    Milk          true   is white            is not RED
6    Milk          false  is RED              is not white
7    Chess         true   is a game           is not a DANCE
7    Chess         false  is a DANCE          is not a game
7    Waltz         true   is a dance          is not a GAME
7    Waltz         false  is a GAME           is not a dance
8    Down          true   are soft            are not HARD
8    Down          false  are HARD            are not soft
8    Stones        true   are hard            are not SOFT
8    Stones        false  are SOFT            are not hard
9    Elephants     true   are large           are not SMALL
9    Elephants     false  are SMALL           are not large
9    Mice          true   are small           are not LARGE
9    Mice          false  are LARGE           are not small
10   Five          true   is a number         is not a COLOR
10   Five          false  is a COLOR          is not a number
10   Yellow        true   is a color          is not a NUMBER
10   Yellow        false  is a NUMBER         is not a color
11   France        true   is a country        is not a HUMAN
11   France        false  is a HUMAN          is not a country
11   Socrates      true   is a human          is not a COUNTRY
11   Socrates      false  is a COUNTRY        is not a human
12   Ice cream     true   is sweet            is not SALTY
12   Ice cream     false  is SALTY            is not sweet
12   Sea water     true   is salty            is not SWEET
12   Sea water     false  is SWEET            is not salty
13   Iron          true   can rust            can not FLY
13   Iron          false  can FLY             can not rust
13   Ravens        true   can fly             can not RUST
13   Ravens        false  can RUST            can not fly
14   Tennis        true   is a sport          is not a BEVERAGE
14   Tennis        false  is a BEVERAGE       is not a sport
14   Wine          true   is a beverage       is not a SPORT
14   Wine          false  is a SPORT          is not a beverage
15   The Atlantic  true   is an ocean         is not a DESERT
15   The Atlantic  false  is a DESERT         is not an ocean
15   The Sahara    true   is a desert         is not an OCEAN
15   The Sahara    false  is an OCEAN         is not a desert
16   The night     true   is dark             is not BRIGHT
16   The night     false  is BRIGHT           is not dark
16   The sun       true   is bright           is not DARK
16   The sun       false  is DARK             is not bright

The Truth column indicates whether the sentences in a row are true or false. Upper case words indicate a mismatch
between the sentence subject and the last word.