Expressive power of “now” and “then” operators

Igor Yanovich
July 9, 2014
Abstract Natural language provides motivation for studying modal backwards-looking operators such as “now”, “then” and “actually” that evaluate their
argument formula at some previously considered point instead of the current
one. This paper investigates the expressive power over models of both propositional and first-order basic modal language enriched with such operators.
Having defined an appropriate notion of bisimulation for first-order modal
logic, I show that backwards-looking operators increase its expressive power
quite mildly, contrary to beliefs widespread among philosophers of language
and formal semanticists. That in turn presents a strong argument for the use
of operator-based systems in the semantics of natural language, instead of
systems with explicit quantification over worlds and times that have become
a de-facto standard for such applications. The popularity of such explicit-quantification systems is shown to be based on the misinterpretation of a
claim by [Cresswell, 1990], which led many philosophers and linguists to assume (wrongly) that introducing “now” and “then” is expressively equivalent
to explicitly quantifying over worlds and times.
Keywords “now” operator, backwards-looking operators, bisimulation,
first-order modal logic, hybrid logic
The purpose of this paper is to study the expressive power that is added
to modal logic by the introduction of now and then operators.1 Logically
speaking, it is actually not a particularly exciting subject. Once we apply
relatively familiar techniques from the modern logical toolkit, it turns out
Igor Yanovich
Universität Tübingen, Institute of Linguistics
Wilhelmstraße 19, Tübingen, 72074 Germany
E-mail: [email protected]
1 From here on, I talk only about expressive power over models. For the application to
natural language, that is arguably a more important kind of expressivity than expressivity
over frames.
that now and then add extra expressive power only for first-order modal
logic, as opposed to propositional modal logic.2 Moreover, in most cases that
power is only added when models have an infinite number of individuals. Thus
for a pure logician, the interest of this paper would mainly lie in the notion of
bisimulation appropriate for first-order (=quantified) modal logic, cf. Def. 8
and Thm. 2.
But from the applied point of view, the systems with now and then are
extremely important because of their role in the philosophy of language and in
formal semantics. In those areas, it is often taken as a proven fact that modal
logic with now and then is as expressive as a first-order many-sortal logic
with explicit quantification over times and worlds. But this near-consensus is
very different from the actual mathematical state of affairs, as we will show
below. Once the wrongful assumption is corrected, there are consequences for
how linguists and philosophers of language might want to go about analyzing
natural language phenomena. In particular, when the expressive power of now
and then is properly characterized, we become able to see more advantages
of using operator-based systems for modality and temporality.
1 Introduction
Certain expressions of natural language prompted philosophers and linguists
to introduce now and then operators which could shift the interpretation of
an embedded subformula to a point (that is, world or time) introduced by
a higher modal operator. Those expressions may be called backwards-looking
operators.3 From the early days of formal semantics, it has become accepted
that when such operators are added to quantified (that is, first-order) modal
logic, its expressive power increases, as was shown by [Kamp, 1971]. But by
how much? Since [Cresswell, 1990], it has also become accepted in the fields
of formal semantics and philosophy of language that quantified modal logic
enriched with now and then is as expressive as a full many-sortal first-order
logic with unrestricted explicit quantification over worlds and times. And that
formal understanding (or, in fact, misunderstanding, as we will show) in turn
led philosophers and especially linguists to widely popularize the use of such
explicit-quantification systems.
Explicit-quantification systems have become a de-facto standard to such
an extent that it is hard to find contemporary semantic work that would use
modal and temporal operators rather than explicit quantifiers over times and
worlds. This sometimes leads to curious results: for instance, [Percus, 2000]
is an important and well-cited4 work that is dedicated to formulating a binding theory for explicit world variables assumed to populate syntactic representations for natural language. The main content of that binding theory is
as follows: verbs and adverbial modifiers combine with world variables that
must be bound by the closest higher modal operator. Importantly, in a system
where world variables are only manipulated implicitly through modal operators and backwards-looking operators, no such constraint would be needed:
in the absence of an extra operator, such an interpretation would be the default. This underscores what kind of problems one runs into upon accepting
without argument a more expressive system than one needs: in a more expressive system, more things can go wrong, and moreover, more additional
constraints on the workings of the system are needed. But given the common
misinterpretation of Cresswell’s result, this problem has flown under the radar
of philosophers and linguists because it was assumed that there is no difference in expressive power between operator-based and explicit-quantification
systems.
2 Quantified modal logic, also called first-order modal logic, is related to (propositional)
modal logic the same way as first-order logic is related to propositional logic.
3 The term “backwards-looking operators” is due to [Saarinen, 1978].
4 Currently with about 200 citations in the Google Scholar web service, which is a large
number for a linguistic article.
What I aim to achieve in this paper is to bring the debate about explicit-quantification vs. operator modal systems for natural language semantics back
onto a solid logical ground. The reading of [Cresswell, 1990] that led analysts of
natural language to adopt explicit-quantification systems is based on a misunderstanding. Cresswell never proved the result that the subsequent linguistic
and philosophical literature took him to have proven. He added now and then
operators not to the basic modal language, but to a language with universal
modality — which few if any philosophers and linguists would posit for natural
language (or, at least, for everyday natural language). In that language, then
indeed increases expressivity up to full many-sortal FOL (first-order logic). But
as is well-known to modal logicians, universal modality is a very powerful operator itself, cf. [Goranko and Passy, 1992], [Blackburn and Seligman, 1995],
[Blackburn and Seligman, 1998], a.o. So Cresswell’s increase in expressive power
happens to a system that is already far more powerful than what is currently
assumed for natural language (which we will discuss further in Section 7). It
is thus improper to apply Cresswell’s results to ordinary investigations of the
properties of natural language.
But what happens if we analyze the expressive power of now and then
properly, namely adding them to the basic modal language ML? (That language would be agreed upon as a proper basis for analysis of natural language.)
The current paper answers this question. In the propositional case, no extra
power is added by then. In the quantified modal logic case, there is indeed an
increase in power, but it is tiny. ML with then operators added is the least
expressive language of the hybrid family, and when identity is in the language
of quantified modal logic, then’s extra power only manifests itself in models
with an infinite number of individuals. Far from going all the way to many-sortal first-order logic, a system with now and then is perhaps the mildest of
the known systems expanding basic modal logic.
What does this mean for applied logicians, such as philosophers and linguists, who use modal logic to analyze natural language? The bottom line is
that an operator-based system is arguably superior to the explicit-quantification
systems that have become the standard in the field. Operator-based systems
are less expressive, but the expressivity they carry is already enough to account for the relevant natural language phenomena. Such systems are thus
more restrictive and more predictive — the properties which linguists and
philosophers value in formal systems. This does not necessarily mean that one
should just throw explicit-quantification systems out of the window: it can be
that for some purposes, they would be more intuitive to use. But what the
results presented in this paper show is that such systems are far from innocent,
and thus ought to be used with due caution.
The plan of the paper is as follows. In Section 2, I introduce important
datapoints from natural language that motivate the introduction of backwards-looking operators. I also demonstrate how one would translate such natural
language sentences into a formal language using both an operator-based and
an explicit-quantification-based system. This section thus provides the applied
motivation for the logical study to follow. In Section 3, I introduce languages
Cr and Cr^FO resulting from the addition of generalized backwards-looking
operators to the basic modal language ML and its quantified version ML^FO.
This section provides the definitions for the formal systems whose expressivity
we will be studying. In Section 4, I provide a truth-preserving translation
from the fragment of propositional Cr that only features genuinely backwards-looking then, into ML. The existence of such a translation shows that then
operators do not actually increase expressive power in the propositional case.
Section 5 turns to the case of first-order modal logic with now and then.
We introduce a notion of bisimulation appropriate to quantified ML^FO, and
with its help prove that Cr^FO is strictly more expressive than ML^FO, but at the same
time that extra expressivity only kicks in in a limited number of cases,
in particular, when the domains of individuals are infinite. Section 6 closes
the logical part of the paper: in it, I situate the Cr languages within the
family of hybrid languages. It turns out that Cr is the mildest member of
that clan. Finally, in Section 7 we return to the applied use of backwards-looking operators, discussing how the expressivity claim by [Cresswell, 1990]
was misinterpreted in the linguistic and philosophical literature, and what the
practical consequences of learning the actual expressivity results may be.
2 Backwards-looking operators of natural language, and their
formal analysis
In this section, I introduce natural language examples of the kind used to
motivate the introduction of now and then operators. I then show how those
examples can be analyzed with such operators, and also, alternatively, in a
full many-sortal first-order logic with explicit quantification over worlds and
times.
Mary [who is reading now] came. (1)
In 1, “now” is embedded within a relative clause and signals that the embedded predicate “is reading” should be evaluated at the current time, outside
of the scope of the matrix past tense. Assuming operator now with appropriate
semantics, we can represent the sentence with P(come(m) ∧ now(read(m))),
where P is the past operator. There is also an equivalent logical representation
for 1 without now: read(m) ∧ P(come(m)), but while semantically correct,
it does not respect the constituent structure of the original natural language
sentence.
(One day in the future,) everyone [now alive] will be dead (2)
Unlike for 1, for 2 there is no translation into quantified modal logic without
a now operator. Or, rather, strictly speaking we will only be able to prove that
after we derive the results in Section 5, but even in the absence of a formal
proof, it has been widely accepted as fact for decades that 2 is inexpressible
unless we add now (or allow explicit quantification over times).5
It was the case that everyone [then alive] would all be dead one day (3)
Everyone [actually tall] might have been short (4)
You might have considered yourself short while [actually being tall] (5)
While “now” in 1 and 2 shifts evaluation back to the matrix time, “then”
in 3 shifts it to the moment introduced by the higher past tense. In 4, the word
“actually” forces the predicate “tall” to be evaluated at the actual world. In 5,
the same word returns the evaluation to the counterfactual world introduced
by the higher “might have been” operator, similarly to how “then” in 3 refers
back to the past moment introduced by a higher operator.
In all of the cases above, natural language expressions “now”, “then” and
“actually” shift the evaluation index to some index that was used while evaluating the higher levels of the sentence. This can be the matrix index as in 1, 2
and 4, or an index introduced by a higher temporal or modal operator, as in 3
and 5. In both cases, those words may be said to be looking back into the series
of indices introduced earlier, and shifting the interpretation of their argument
formula to one of the previously used indices and away from the current one.
In an operator-based system, we would account for such natural language
expressions as follows. We would introduce a family of logical backwards-looking operators then_i (which is exactly what we do formally in the next
section). The idea would be that then_i would shift the interpretation to the
i-th evaluation index from the ones that were used earlier. The special case
of now would be defined as then_0 that would always go back to the initial
evaluation time. Then we would analyze 2 as follows:
[[now]] = λP_{et}. now(P) (6)
[[alive now]] = λx_e. now(alive(x)) (7)
[[everyone now alive will be dead]] = F(∀x : now(alive(x)) → dead(x)) (8)
5 Contrary to that, [Verkuyl, 2008, pp. 130-132] argues that now is semantically superfluous in modal logic. But Verkuyl does not discuss any quantificational examples like 2 which
pose a true expressivity problem, and only considers sentences such as 1 for which now is
indeed semantically superfluous.
In an explicit-quantification system, there are certain design choices to be
made when implementing “now” and “then”. In most current systems used
by formal semanticists (cf. [Percus, 2000] for an example), predicates such as
“alive” have special syntactically represented slots for time and world variables. Usually such a slot would be filled by a postulated covert constituent
introducing an explicit time or world variable. There are several options as to
what exactly an adverb such as “now” (for the temporal case) or “actually”
(in the world-variable case) would be doing in this setup. If “now” or “then”
would modify the predicate after it has combined with the covert variable, it
would have to force abstraction over that variable. Such a system would look
as follows:6
∧t φ := expression that denotes the temporal intension of φ (9)
[[now]] = λP_{et}. ∧t P(t_0), where t_0 must refer to the current time (10)
[[alive now]] = λx_e. λt_3.(alive(x)(t_3))(t_0) = λx_e. alive(x)(t_0) (11)
[[2]] = ∃t_1 t_0 : (∀x : (alive(x)(t_0) → dead(x)(t_1))) (12)
So what “now” does on such an analysis is essentially erasing any effect of
the explicit syntactic time variable that combines with “alive”: the result is as
if there was never such a variable in the first place.
Another version of the explicit-quantification story would go as follows:
we would say that “now” directly denotes a temporal variable, and that it
occupies the same syntactic slot that is normally occupied by covert explicit
temporal variables. This would result in the following analysis:
[[now]] = t_0, where t_0 must refer to the current time (13)
[[alive now]] = λx_e. alive(x)(t_0) (14)
[[2]] = ∃t_1 t_0 : (∀x : (alive(x)(t_0) → dead(x)(t_1))) (15)
An immediate problem with this analysis is that empirically “now” seems
to be a syntactic adjunct rather than an argument. For instance, it may occur
both on the left and on the right of the modified expression: “everyone now
alive” and “everyone alive now” are both OK. So some syntax-semantics interface story would have to be told about how “now” would always fit into
the proper slot for temporal variables — but let’s assume for the sake of the
argument that such a story may somehow be told.
Comparing the operator-based and the explicit-quantification lines of analysis, what can we say? Both lines would have to make a stipulation about the
fact that now must specifically refer to the current time, so on this count the
two are equal. Beyond that, the operator story works right away (assuming the
operators are properly defined, of course; we will define a formal system
with them in the next section). The explicit-quantification story cannot
stop yet, though: it has to introduce a number of syntactic assumptions.7
6 Perhaps more in the spirit of current explicit-quantification systems, in particular of the
branch of formal semantics called LF semantics, would be the following alternative. There
would be no intension operator ∧t; now would take as arguments functions from times;
there will be a rule of freely applying λt_i operators; and finally, a constraint would force
the explicit variable next to “alive” to be bound by the closest λ-restrictor specifically when
“alive” is modified by “now”, but not otherwise. The problem with such an account is that
this last constraint is, so to speak, non-compositional: the variables on predicates like “alive”
should generally not be subject to the closest-binder requirement, and only when we know
that that predicate is in the scope of “now”, would a different constraint be imposed.
This difference in the amount of additional work is actually related to
a genuine difference in the expressive power of the two underlying formal
systems. A system with backwards-looking operators, as we will show below,
is a very mild logical system. It only increases the expressivity of basic first-order modal logic by a tiny bit. As a consequence, there are plenty of things
that we cannot express in such a system — but the good news is that for
analysis of natural language, we do not even want to express them. In contrast
to that, the explicit-quantification system is very powerful. And as a result
of that, we need to constrain its behavior in order to tailor it to the observed
facts. Hence all the additional constraints on the binding theory of world
and time variables, and a fair number of further issues to be resolved. For
instance, why doesn’t natural language have operators that have meaning ∃ti ,
without any connection to the current evaluation time? After all, it is often
assumed in that line of theorizing that we have freely applying λti operators...
If natural language is as expressive as a full many-sortal first-order logic, then
the absence of this, and many other kinds of meanings, is mysterious, and
needs to be explained with yet further constraints.
What this comparison tries to demonstrate is ultimately that expressive
power matters. Normally, a linguist or a philosopher of language would not
be interested in the issues of expressivity, and for a good reason: if natural
language demands a very expressive system, then we as analysts cannot help it,
and would have to adopt it. That would just be an empirical fact about human
language. But with backwards-looking operators, we have a different kind of
case: the now-standard tools for treating them are vastly more expressive than
is actually required by natural language data. In such a case, moving back to
a less expressive system may give us greater explanatory adequacy.
So from the applied point of view, the task of this paper is to develop
the logical theory that shows why and how exactly the operator-based story
is more restrictive than the explicit-quantification story. Sections 3-6 below
will take care of that logical part, and then in Section 7, we will return to
the application to natural language semantics. I tried my best to make the
logical part accessible for a linguist or a philosopher with applied interests in
mind (to the possible frustration for the logician readers, for which I apologize; textbook-level explanations have only been included for those topics
that are not widely known among linguists and philosophers.) But in case I
failed nevertheless, the main technical results are informally reformulated at
the beginning of Section 7 for the reader’s convenience.
7 In fairness, some of those assumptions would be “independently justified”, in the sense
that for many intensional phenomena of natural language, we would need similar ones anyway. For example, adding an extra binding-theoretic constraint specifically for time variables
in the scope of “now” and “then” is not such a wild idea if we already adopted a number
of such constraints anyway. But if we do not have to introduce any of this apparatus for
restricting the enormous expressive power of the full FOL in the first place, that’s a different
story.
3 Adding backwards-looking operators to ML
In this section, we define languages that formalize backwards-looking operators, Cr and Cr^FO (for Cresswell, as our system is very close to his system with “now”, “then” and “actually”). Cr and Cr^FO result from enriching
the basic modal language ML and its quantified counterpart ML^FO with
backwards-looking operators. It should be clear how to add such operators to
other underlying modal languages (e.g., languages having more than one ♦).
Definition 1 (The syntax of Cr)
Let PROP be a non-empty set of propositional variables, e.g. p, q, ...
Then wffs of Cr are:
φ := PROP | ⊤ | ¬φ | φ ∧ ψ | ♦φ | then_k(φ), where k ∈ N.
⊥, ∨, →, and □ are defined as usual, and now := then_0.
Formulas of ML are evaluated in a Kripke model (consisting of the domain W of points, an accessibility relation R, and a valuation function V)
at a point from W. Informally, backwards-looking operators then_k shift the
evaluation of their argument formula back to some point considered earlier.
In order to return to such points, we need to store them, and we do that in
denumerable evaluation sequences ρ of points from the domain W of model M.
Formulas of Cr are evaluated at pointed sequences ⟨ρ, i⟩, where the i-th member of ρ, also written ρ(i), functions as the current evaluation point in standard
Kripke semantics. We call ρ_1 and ρ_2 n-variants (in symbols, ρ_1 ∼_n ρ_2) if for
any m ≠ n we have ρ_1(m) = ρ_2(m).
Definition 2 (The semantics of Cr)
For Kripke model M = ⟨W_M, R_M, V_M⟩, sequence ρ from W_M, and i ∈ N,
M, ⟨ρ, i⟩ |=_Cr q iff ρ(i) ∈ V_M(q)
M, ⟨ρ, i⟩ |=_Cr ⊤ always
M, ⟨ρ, i⟩ |=_Cr ¬φ iff it is not the case that M, ⟨ρ, i⟩ |=_Cr φ
M, ⟨ρ, i⟩ |=_Cr φ ∧ ψ iff M, ⟨ρ, i⟩ |=_Cr φ and M, ⟨ρ, i⟩ |=_Cr ψ
M, ⟨ρ, i⟩ |=_Cr ♦φ iff there is ρ′ ∼_{i+1} ρ s.t. ρ(i) R ρ′(i+1) and M, ⟨ρ′, i+1⟩ |=_Cr φ
M, ⟨ρ, i⟩ |=_Cr then_k(φ) iff M, ⟨ρ, k⟩ |=_Cr φ
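The clauses of Definition 2 can be read as a recursive evaluation procedure. The following is a minimal sketch for finite models, not part of the original system: the tuple encoding of formulas, the function name holds, and the representation of ρ as a Python dict holding only the members that are actually consulted are all assumptions made for illustration.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("p", name), ("top",), ("not", f), ("and", f, g), ("dia", f), ("then", k, f)

def holds(model, rho, i, phi):
    """Truth of phi at the pointed sequence <rho, i>, clause by clause."""
    W, R, V = model
    op = phi[0]
    if op == "p":
        return rho[i] in V.get(phi[1], set())
    if op == "top":
        return True
    if op == "not":
        return not holds(model, rho, i, phi[1])
    if op == "and":
        return holds(model, rho, i, phi[1]) and holds(model, rho, i, phi[2])
    if op == "dia":
        # Try every (i+1)-variant of rho whose (i+1)-th member is R-accessible
        # from the current point rho(i); the old rho(i) stays stored.
        return any(
            holds(model, {**rho, i + 1: w}, i + 1, phi[1])
            for w in W
            if (rho[i], w) in R
        )
    if op == "then":
        # Only the pointer moves; the sequence itself is unchanged.
        return holds(model, rho, phi[1], phi[2])
    raise ValueError(f"unknown operator: {op!r}")

# A three-point chain w0 -> w1 -> w2 in which p holds only at w1.
model = ({"w0", "w1", "w2"}, {("w0", "w1"), ("w1", "w2")}, {"p": {"w1"}})
p = ("p", "p")

dia_dia_then1_p = ("dia", ("dia", ("then", 1, p)))  # looks back to the first step
dia_dia_p = ("dia", ("dia", p))                     # plain ML formula
dia_then3_p = ("dia", ("then", 3, p))               # consults rho(3)

looks_back = holds(model, {0: "w0"}, 0, dia_dia_then1_p)
plain = holds(model, {0: "w0"}, 0, dia_dia_p)
free_a = holds(model, {0: "w0", 3: "w1"}, 0, dia_then3_p)
free_b = holds(model, {0: "w0", 3: "w2"}, 0, dia_then3_p)
```

In this toy model the formula with then_1 comes out true at w0 while the plain formula with two diamonds comes out false, and the value of the formula with then_3 changes with the initial ρ(3) even though ρ(0) is held fixed.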
The non-modal clauses of our semantics do exactly the same job as the corresponding standard clauses, with ρ(i) playing the part of the current point.
The clause for ♦, in addition to the standard truth conditions, also produces
“side effects”: it writes down the R-accessible point to which ♦ shifts the
evaluation as the next member of the sequence, and stores the previous evaluation point for future use. As these side effects only affect then-operators,
the following easily follows:
Proposition 1 For all φ ∈ Cr that are also in ML,
M, ⟨ρ, i⟩ |=_Cr φ iff M, ρ(i) |=_ML φ
then operators shift the pointer to a different member of ρ. When the shift
is to a point stored earlier, then_k functions as a genuine backwards-looking
operator. But if we never “overwrote” ρ(k) while evaluating our formula by
the time we encounter then_k, the point that we shift to is determined by the ρ
we started with. Evaluation sequences thus work pretty much like assignment
functions, and we can think of then_k operators as implicitly introducing a
variable over points. When then_k retrieves a previously stored ρ(k), the implicit variable is bound by a higher ♦. When then_k accesses ρ(k) determined
by the initial evaluation sequence, the implicit variable is free.
A formula of Cr is a sentence iff, evaluated at ⟨ρ, 0⟩, it only depends on the
point ρ(0). In other words, the truth of a sentence of Cr is semantically relative
to a single point, while the truth of a non-sentence is relative to multiple points.
In yet other words, a sentence of Cr would have no implicit free variables
over points.8 Note that it is crucial that we restrict our attention to formulas
evaluated at ⟨ρ, 0⟩: whether a Cr formula features an implicit free variable
depends on the initial index. Thus ♦♦then_1(p) would not use points not
introduced by ♦s when evaluated at ⟨ρ, 0⟩, but it would do so when evaluated
at ⟨ρ, 2⟩: in that case the point ρ(1) would not have been overwritten by the
clause for ♦.
Summing up, ♦then_3(p) is not a sentence, and ♦♦then_1(p) is. We will
write Cr_sent for the sentence fragment of Cr. For a sentence φ, we say φ is
true at ρ in M if φ is true at ⟨ρ, 0⟩ in M. When ρ(0) = w, we can also say that
sentence φ is true at w, as we do for ML. When the context makes it clear
which model is to be used, we may suppress it.
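The text defines sentencehood semantically and only remarks that a syntactic definition is easy to give. One possible syntactic test is sketched below. It is an assumption of this sketch rather than the paper's own definition, and it is conservative: it tracks which members of ρ have been written by a ♦ on the way down (plus the starting index) and rejects any then_k whose target was never written. The tuple encoding of formulas and the name is_sentence are hypothetical.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("p", name), ("top",), ("not", f), ("and", f, g), ("dia", f), ("then", k, f)

def is_sentence(phi, i=0, written=frozenset({0})):
    """Conservative check that phi, evaluated at index i, never reads a
    member of the sequence other than those in `written`."""
    op = phi[0]
    if op in ("p", "top"):
        return True
    if op == "not":
        return is_sentence(phi[1], i, written)
    if op == "and":
        return is_sentence(phi[1], i, written) and is_sentence(phi[2], i, written)
    if op == "dia":
        # The diamond writes member i+1 and moves the pointer there.
        return is_sentence(phi[1], i + 1, written | {i + 1})
    if op == "then":
        k = phi[1]
        # then_k is bound only if member k was written on the way here.
        return k in written and is_sentence(phi[2], k, written)
    raise ValueError(f"unknown operator: {op!r}")

p = ("p", "p")
dia_then3_p = ("dia", ("then", 3, p))
dia_dia_then1_p = ("dia", ("dia", ("then", 1, p)))

verdict_then3 = is_sentence(dia_then3_p)
verdict_then1 = is_sentence(dia_dia_then1_p)
# Starting from index 2 instead of 0, member 1 is never written.
verdict_then1_from_2 = is_sentence(dia_dia_then1_p, i=2, written=frozenset({2}))
```

The check agrees with the two examples in the text and with the remark about evaluating at index 2. Because it is purely syntactic, it may reject formulas that are sentences in the semantic sense, for instance a then_3 applied to ⊤.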
A standard technique in modal logic is to relate the modal language we
are working with to the first-order logic whose domain is points/worlds/times.
That technique uses the so-called standard translation which maps modal operators to FOL operators in a specially defined language. E.g., for propositional
variables p_i in modal logic, the corresponding language will have corresponding
1-place predicates P_i over worlds. The points of a Kripke model may be referenced in L0 using individual variables x_i, and the accessibility relation is represented by a 2-place predicate R. (See, e.g., [Blackburn and van Benthem, 2007,
Sec. 2.2] or [Blackburn et al., 2001, Ch. 2.4] for an introduction.) The standard
translation of standard modal logic ML essentially shows that in a sense, ML
is just a special notation for a particular fragment of FOL: the formulas that
may be output by the translation are a very small subset of the full FOL. In
particular, such FOL formulas will all feature exactly one free variable (corresponding to the initial evaluation world), and all new variables x_j will always
be introduced using a link to an already introduced point x_i, by a construction
∃x_j : x_i R x_j, corresponding to ♦.
8 It is usual to give a syntactic definition of a sentence, where a sentence is a formula
without free variables. Semantically, such a formula does not depend on the assignment of
values to variables. It is easy to give a purely syntactic definition of a Cr sentence, but I
find that the semantic definition in the main text makes the intuition behind the notion
more prominent.
To study the properties of our system Cr, we can easily extend the standard
translation:
ST_i(p) = P(x_i)
ST_i(⊤) = ⊤
ST_i(¬φ) = ¬ST_i(φ)
ST_i(φ ∧ ψ) = ST_i(φ) ∧ ST_i(ψ)
ST_i(♦φ) = ∃x_{i+1}(x_i R x_{i+1} ∧ ST_{i+1}(φ))
ST_i(then_k(φ)) = ST_k(φ)
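The extended translation is a straightforward recursion on the formula, with the index i tracking which point variable is current. A minimal sketch follows; the tuple encoding of formulas, the function name st, and the convention of writing the predicate for a variable p as its upper-case name are assumptions made for illustration.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("p", name), ("top",), ("not", f), ("and", f, g), ("dia", f), ("then", k, f)

def st(phi, i=0):
    """Render ST_i(phi) as a string of first-order syntax."""
    op = phi[0]
    if op == "p":
        return f"{phi[1].upper()}(x{i})"
    if op == "top":
        return "⊤"
    if op == "not":
        return f"¬{st(phi[1], i)}"
    if op == "and":
        return f"({st(phi[1], i)} ∧ {st(phi[2], i)})"
    if op == "dia":
        # Introduce x_{i+1}, linked to the current point x_i by R.
        return f"∃x{i + 1} (x{i} R x{i + 1} ∧ {st(phi[1], i + 1)})"
    if op == "then":
        # Only the current index changes; no new variable is introduced.
        return st(phi[2], phi[1])
    raise ValueError(f"unknown operator: {op!r}")

p = ("p", "p")
bound_case = st(("dia", ("dia", ("then", 1, p))))
free_case = st(("dia", ("then", 3, p)))
```

For ♦♦then_1(p) the result mentions x1 inside the scope of its own quantifier, while for ♦then_3(p) the variable x3 occurs free alongside x0, which matches the informal description of then_k as an implicit variable that is either bound by a higher ♦ or left free.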
The standard translation for ML only requires two variables over points.9
However, the translation for Cr may require any finite number of variables.
The corresponding fragment of FOL is thus much greater for Cr than for ML.
However, we will see in the next section that despite appearances, for any
sentence of Cr there is an equivalent ML formula, so standard translations of
Cr sentences are equivalent to formulas in the two-variable fragment of FOL.
We now turn to the quantified language Cr^FO: it is the quantified version
that is needed to adequately model meanings of NL sentences such as 1-5.
Syntactically, Cr^FO is obtained by using a supply of individual variables x_k
and n-place relation symbols q instead of just propositional variables (which
are retained as 0-place relation symbols), and quantifier ∀ over individuals. In
addition to that, we may or may not want to add an existence predicate E
and identity of individuals.
Definition 3 (The syntax of Cr^FO)
Let {PRED_n}, for finite n, be a collection of sets PRED_n, each containing
n-ary predicate symbols, with at least one PRED_n non-empty; let VAR be
an infinite supply of individual variables x_0, x_1, ..., also written as x, y, z, ...;
and let E be the optional existence predicate.
Then the wffs of Cr^FO are defined as follows:
φ := q(x_0, ..., x_{n−1}) | ∀xφ | ⊤ | ¬φ | φ ∧ φ | ♦φ | then_k(φ), where
q ∈ PRED_n, and k ∈ N.
Optionally, Ex and x_i = x_j may be wffs.
9 That only two variables are needed for the standard translation of ML was noted by
[Gabbay, 1981]. The case of two-variable logics is special. See, e.g., [Grädel and Otto, 1999]
on semantically two-variable logics and corresponding two-pebble games.
There are many design options when it comes to the semantics of quantified modal logic (see [Fitting and Mendelsohn, 1998], [Blackburn et al., 2007,
Ch. 9]). For domain semantics, I choose varying domain semantics wherein
there are no restrictions on the relations of individual domains at different
points — the most general setting possible. For quantifiers, I use untensed
quantifiers ∀ (also called possibilist), which range over all individuals in the
model regardless of which points they exist at. In the special case when the
language has existence predicate E, we can also define tensed, or actualist,
quantifiers ∀_tensed using untensed ∀ and E: ∀_tensed xφ := ∀x(Ex → φ) (see
[Cresswell, 1991]). Of course, tensed quantifiers can also be defined as primitive.10
A first-order Kripke model with varying domains for Cr^FO is a structure
⟨W, R, {δ_w}_{w∈W}, {V_w}_{w∈W}⟩, where each δ_w is the individual domain of the point
w ∈ W, and V_w is a valuation relative to w, that is, a function from predicate
symbols in PRED_n to n-ary relations over point w’s individual domain δ_w.
For convenience, we also define the domain D of all individuals as ⋃_{w∈W} δ_w.
Formulas of Cr^FO are evaluated in a first-order Kripke model M at a pointed
sequence ⟨ρ, i⟩ relative to an assignment h of individuals to individual variables. For the interpretation of the modal component the presence of the assignment function h does not make a difference: we just pass it down. We call
two assignment functions h and h′ x-variants, h ∼_x h′, iff they agree on all
variables but x. Instead of Cr’s propositional variable clause M, ⟨ρ, i⟩ |= q, we
have two clauses for predicates and for the universal quantifier. The defined
semantics is bivalent: any q is false of a tuple if it contains individuals not
existing at the current point. All omitted clauses are as for Cr.
Definition 4 (Varying domain semantics for Cr^FO)
M, h, ⟨ρ, i⟩ |=_{Cr^FO} q(x̄) iff ⟨h(x_1), ..., h(x_n)⟩ ∈ V_{ρ(i)}(q)
M, h, ⟨ρ, i⟩ |=_{Cr^FO} ∀xφ iff ∀h′ s.t. h′ ∼_x h, we have M, h′, ⟨ρ, i⟩ |= φ
M, h, ⟨ρ, i⟩ |=_{Cr^FO} E(x) iff h(x) ∈ δ_{ρ(i)}
M, h, ⟨ρ, i⟩ |=_{Cr^FO} x_i = x_j iff h(x_i) = h(x_j)
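To see the first-order clauses at work, the sketch below evaluates a formula of the shape of (8) in a two-point model. Everything concrete here is an assumption for illustration: the tuple encoding, the function names, the toy model, and the identification of the future operator F with ♦ over a relation that links the earlier time t0 to the later time t1.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("pred", name, vars), ("forall", x, f), ("E", x), ("eq", x, y),
#   ("top",), ("not", f), ("and", f, g), ("dia", f), ("then", k, f)

def imp(a, b):
    """a -> b, defined from the primitives as not(a and not b)."""
    return ("not", ("and", a, ("not", b)))

def holds_fo(model, h, rho, i, phi):
    """Truth of phi at <rho, i> under assignment h (possibilist quantifier)."""
    W, R, delta, V = model
    op = phi[0]
    if op == "pred":
        args = tuple(h[x] for x in phi[2])
        return args in V[rho[i]].get(phi[1], set())
    if op == "forall":
        D = set().union(*delta.values())  # all individuals of the model
        return all(holds_fo(model, {**h, phi[1]: d}, rho, i, phi[2]) for d in D)
    if op == "E":
        return h[phi[1]] in delta[rho[i]]
    if op == "eq":
        return h[phi[1]] == h[phi[2]]
    if op == "top":
        return True
    if op == "not":
        return not holds_fo(model, h, rho, i, phi[1])
    if op == "and":
        return holds_fo(model, h, rho, i, phi[1]) and holds_fo(model, h, rho, i, phi[2])
    if op == "dia":
        return any(
            holds_fo(model, h, {**rho, i + 1: w}, i + 1, phi[1])
            for w in W
            if (rho[i], w) in R
        )
    if op == "then":
        return holds_fo(model, h, rho, phi[1], phi[2])
    raise ValueError(f"unknown operator: {op!r}")

# Toy model: a and b are alive at t0 and dead at t1; c is alive at t1.
W = {"t0", "t1"}
R = {("t0", "t1")}
delta = {"t0": {"a", "b"}, "t1": {"a", "b", "c"}}
V = {
    "t0": {"alive": {("a",), ("b",)}},
    "t1": {"alive": {("c",)}, "dead": {("a",), ("b",)}},
}
model = (W, R, delta, V)

alive = ("pred", "alive", ("x",))
dead = ("pred", "dead", ("x",))
with_now = ("dia", ("forall", "x", imp(("then", 0, alive), dead)))
without_now = ("dia", ("forall", "x", imp(alive, dead)))

result_with_now = holds_fo(model, {}, {0: "t0"}, 0, with_now)
result_without_now = holds_fo(model, {}, {0: "t0"}, 0, without_now)
```

In this model the version with now (that is, then_0) is true at t0, because the restriction on the quantifier is checked back at t0, whereas the version without it is false, because c is alive but not dead at t1.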
As Cr^FO has both explicit variables over individuals and implicit then-variables over points, we have two notions of sentencehood: a wff φ of Cr^FO
is a then-sentence iff, evaluated at ⟨ρ, 0⟩, it only depends on the point ρ(0).
(The notion of then-sentence is thus parallel to the notion of sentence for
Cr.) Furthermore, a Cr^FO then-sentence is a Cr^FO sentence iff it has no free
individual variables. We say that a then-sentence is true at ρ if ⟨ρ, 0⟩ makes
it true. We may suppress M and h for brevity when that is safe to do. We will
write Cr^FO_sent for the then-sentence fragment of Cr^FO.
10 Another design choice is whether to add any sort of counterpart theory, cf. [Lewis, 1968].
Counterpart theory is often used to identify individuals at different points when point domains are disjoint, but can be added to any other kind of domain semantics as well. I
will refrain from discussing counterpart theories altogether. See [Fara, 2008] and references
therein for combining a counterpart theory with ‘now’ and ‘actually’.
Extending the standard translation of Cr to a translation of Cr^FO into a
corresponding two-sorted first-order language is straightforward.
Now the stage is set. In the next section, we will show that the sentence
fragment Cr_sent is expressively equivalent to ML by building an effective
truth-preserving translation. Then in Section 5, we will define bisimulations
for ML^FO, and on the basis of that show that Cr^FO is genuinely more expressive than ML^FO. Finally, in Section 6, we will show what place Cr^FO
occupies in the expressive hierarchy of hybrid languages. Put together, we will
have a theory of just how much expressive power “now” and “then” operators add to a logic, and why that additional expressive power only arises in
quantified modal logic and crucially depends on models with infinite domains
of individuals.
4 Eliminating then-operators in the propositional case
Special cases of eliminating backwards-looking operators in propositional modal
systems have been discussed in the literature, cf. [Kamp, 1971], [Meyer, 2009].
In this section, we provide a truth-preserving translation from the sentence
fragment Cr_sent of Cr into its underlying language ML (or, actually, two
such translations). The existence of such translations shows that when we add
“now” and “then” to model natural language operators, in the propositional
case it does not actually increase the expressive power of modal logic. As long
as we do not have quantification over individuals, no increase in expressive
power occurs — unlike in the case when instead of “now” and “then”, we
introduce explicit quantification over worlds and times. (Of course, if we consider non-sentences of Cr, they are relative to more than a single point, and
standard ML cannot express such meanings. But such cases are not what
is usually taken to justify the linguistic and philosophical practice of using
explicit quantification over worlds and times.)
What “bound” then_k-operators in a Cr sentence do is shift the evaluation
back to ρ(k) introduced by some higher ♦. We will provide two translations
allowing us to eliminate then_k by bringing its argument to be in the immediate
scope of the relevant ♦. One translation is local, and works by “floating” then_k
up one operator at a time until it reaches the level where we can eliminate
then_k. The other translation is global, introducing at the level of the “binder”
♦ two disjoined cases, one for when φ is true at ρ(k), another for when it is
false. Both translations are complex: the first one in the worst case involves
length increase exponential in the length of the translated sentence; the second
involves length blow-up exponential in the number of then_k φ subformulas. We
start with the local translation as it allows us to better illustrate the working
of the Cr system. The reader not interested in such illustrations may safely
skip to Thm. 1 and its second proof (the global translation).
For the first, local translation, we want to “float” each then_k ψ into the
immediate scope of the ♦ that introduces the ρ(k) at which then_k ψ is to be
evaluated. Our first task then is to determine how we can transform Cr
formulas while preserving their truth. Unlike in standard modal logic, with
then-operators safety for substitution is determined relative to the evaluation
sequence and the syntactic context in which substitution occurs. Thus φ may
be safe to substitute for ψ in wff ξ_1, but not in wff ξ_2. For a simple example,
consider ♦ then_1 p and then_1 p. If both are evaluated at ⟨ρ, 0⟩, we can substitute then_1 p with just p in the first formula, but not in the second. So we will
need to get a handle on such cases where substitution is safe.
Many substitutions, however, are always safe. It is easy to check that all
ML validities define valid substitutions in Cr: no matter the context, (p∨q)∧r
is always equivalent to (p∧r)∨(q∧r) in Cr. Similarly, the following equivalences
hold regardless of the context, as can be easily checked from the truth clauses
for Cr:
¬ then_k(φ) ⇔ then_k(¬φ)   (16)
then_k(φ ∧ ψ) ⇔ (then_k φ) ∧ (then_k ψ)   (17)
then_l(then_k(φ)) ⇔ then_k(φ)   (18)
But none of those allows us to “float” a then_k operator past a ♦. What
we need is to determine when the following semi-equivalence (∼) is valid:
♦(then_k(φ) ∧ ψ) ∼ then_k(φ) ∧ ♦ψ   (19)
In some cases, e.g. 20, a substitution that instantiates the schema in 19
results in an equivalent formula. But in other cases, e.g. 21, it does not. In
fact, the left formula in 21 is a sentence, while the right one is not: evaluated
at ⟨ρ, 0⟩, it depends on ρ(1), not only on ρ(0).
♦♦(then_0(p) ∧ q) = ♦(then_0(p) ∧ ♦q)   (20)
♦(then_1(p) ∧ q) ≠ then_1(p) ∧ ♦q   (21)
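The contrast between 20 and 21 can be checked mechanically on small models. The following is a minimal sketch, not part of the formal development: it assumes the Cr truth clauses in the form used in the proof of Lemma 1 (♦ evaluated at index i stores the chosen successor at position i + 1 and continues at index i + 1, while then_k continues at index k of the same sequence). The tuple encoding of formulas and all function names are illustrative choices.

```python
from itertools import product

def holds(f, R, V, rho, i):
    """Truth of a propositional Cr formula f at the pointed sequence (rho, i)."""
    op = f[0]
    if op == 'atom':
        return rho[i] in V[f[1]]
    if op == 'not':
        return not holds(f[1], R, V, rho, i)
    if op == 'and':
        return holds(f[1], R, V, rho, i) and holds(f[2], R, V, rho, i)
    if op == 'dia':
        # rho' differs from rho at most at position i+1, which now holds successor v
        return any(holds(f[1], R, V, rho[:i + 1] + (v,) + rho[i + 2:], i + 1)
                   for v in R[rho[i]])
    if op == 'then':
        # then_k shifts evaluation back to position k of the same sequence
        return holds(f[2], R, V, rho, f[1])
    raise ValueError(op)

p, q = ('atom', 'p'), ('atom', 'q')
dia = lambda f: ('dia', f)
then = lambda k, f: ('then', k, f)
conj = lambda f, g: ('and', f, g)

lhs20, rhs20 = dia(dia(conj(then(0, p), q))), dia(conj(then(0, p), dia(q)))
lhs21, rhs21 = dia(conj(then(1, p), q)), conj(then(1, p), dia(q))

def all_cases(points=(0, 1)):
    """Every model on two points, paired with every sequence of length 3."""
    pairs = list(product(points, repeat=2))
    subsets = [set(), {0}, {1}, {0, 1}]
    for bits in product([False, True], repeat=len(pairs)):
        R = {w: {v for (u, v), b in zip(pairs, bits) if b and u == w} for w in points}
        for Vp, Vq in product(subsets, repeat=2):
            for rho in product(points, repeat=3):
                yield R, {'p': Vp, 'q': Vq}, rho

def agree(f, g):
    """True if f and g have the same truth value at index 0 in every case."""
    return all(holds(f, R, V, rho, 0) == holds(g, R, V, rho, 0)
               for R, V, rho in all_cases())

def depends_on_position(f, pos):
    """True if changing only rho[pos] can change the truth value of f at index 0."""
    for R, V, rho in all_cases():
        for x in (0, 1):
            rho2 = rho[:pos] + (x,) + rho[pos + 1:]
            if holds(f, R, V, rho, 0) != holds(f, R, V, rho2, 0):
                return True
    return False
```

On two-point models, `agree(lhs20, rhs20)` holds while `agree(lhs21, rhs21)` fails, and `depends_on_position` separates the sentence on the left of 21 from the non-sentence on its right. An exhaustive check on two points is of course only evidence, not a proof of 20.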
For our purposes, the following simple case where the schema in 19 works
will suffice:
Lemma 1 Let Cr sentence ξ contain an occurrence of ♦(then_k(φ) ∧ ψ), where
(1) φ contains no then operators, and (2) for index i at which ♦(then_k(φ) ∧ ψ)
would be interpreted in ξ, k ≠ (i + 1). Let ξ′ be the result of substituting that
occurrence with then_k(φ) ∧ ♦ψ. Then if ξ′ is a sentence, it is equivalent to ξ.
Proof Suppose that ♦(then_k(φ) ∧ ψ) is true at ⟨ρ, i⟩. Then first, there is a
point v s.t. ρ(i)Rv and ψ is true at ⟨ρ′, i + 1⟩, where ρ′ ∼_{i+1} ρ and ρ′(i + 1) = v. Second,
φ is true at ρ′(k), and as it contains no then operators, its truth does not
depend on the rest of ρ′. As k ≠ (i + 1), we have ρ′(k) = ρ(k), so φ is also true at ρ(k), making the first
conjunct of then_k(φ) ∧ ♦ψ true at ⟨ρ, i⟩. The existence of v makes the second
conjunct true as well.
The other direction is just as easy. ⊣
For the translation, we’ll need one more simple fact: when then_k φ is in a
sentence, and it would be evaluated at index k when the sentence is evaluated
at some ⟨ρ, 0⟩, then then_k can be safely eliminated: it simply shifts the interpretation index from k to itself. Now we are ready to prove the following by
building a local translation that floats each then_k up one step at a time until
it is eliminated:
Theorem 1 (Translation from Cr_sent into ML)
For each sentence ξ ∈ Cr, there is a ξ′ ∈ ML such that M, ⟨ρ, 0⟩ |=_Cr ξ
iff M, ρ(0) |=_ML ξ′.
Moreover, ξ′ can be effectively computed from an arbitrary ξ ∈ Cr_sent.
Proof of Thm. 1 (local translation) We define a (local, bottom-up) translation from an arbitrary ξ to ξ′. When ξ is a sentence, all then_k operators return
evaluation to a point introduced by a higher ♦. If we float each then_k(φ) up
into the immediate scope of that ♦, we can then safely eliminate then_k.
We start with an arbitrarily chosen then_k that has no other then_l in its
scope, and float it one ♦ up. After each such step, we check if we can eliminate
the moved then_k because we reached the immediate scope of the ♦ that then_k
referred back to. If the check is positive, we eliminate that occurrence of then_k.
(Before the first step, of course, we need to check if we can eliminate any then_k
right away.)
Fix such a then_k φ where φ does not contain then-operators. That then_k φ
would be within subformula ♦(... then_k φ ...) where the scope of ♦ is a non-modal formula. (If there is no such ♦, then our k = 0, and we can eliminate
then_k right away.) If in (... then_k φ ...), our then_k φ is embedded under another then_l, we apply the equivalences for negation and then-distribution, 16
and 17, to get to a configuration then_l then_k φ. At this point, we apply 18
to eliminate then_l. If there are more higher then operators, we repeat the
procedure until we transform ♦(... then_k φ ...) so that there are no other then_l
between then_k and ♦.
Then we normalize the resulting non-modal formula into the disjunctive
normal form, treating all then-formulas as propositional variables. After that,
we apply modal equivalence ♦(φ ∨ ψ) ⇔ ♦φ ∨ ♦ψ, and obtain a subformula
where then_k φ may only be embedded under ¬ or ∧. For ¬, we apply 16.
Embedding under ∧ does not prevent us from applying Lemma 1; in
fact, if then_k φ is not embedded under ∧, we need to apply the equivalence
ψ ⇔ (ψ ∧ ⊤) to create a subformula that meets the conditions of Lemma 1.
Finally, we move then_k φ over ♦ using that lemma. If then_k may now be
eliminated, we do that, and otherwise we repeat.
Reapplying the same procedure, we can always select a then-formula to
be moved one ♦ up, so eventually we will be able to eliminate all of them,
obtaining an ML formula ξ′ as desired. ⊣
Here is how the translation works in one particular case:
♦♦p ∧ then_0(q ∨ ¬then_1 r)
♦♦p ∧ (then_0 q ∨ then_0 ¬then_1 r)          (17) & (16)
♦♦p ∧ (then_0 q ∨ then_0 then_1 ¬r)          (16)
♦♦(p ∧ (then_0 q ∨ then_1 ¬r))               (18)
♦♦((p ∧ then_0 q) ∨ (p ∧ then_1 ¬r))         → DNF
♦(♦(p ∧ then_0 q) ∨ ♦(p ∧ then_1 ¬r))        ♦(φ ∨ ψ) ⇔ ♦φ ∨ ♦ψ
♦(♦(p ∧ then_0 q) ∨ (then_1 ¬r ∧ ♦p))        Lemma 1
♦(♦(p ∧ then_0 q) ∨ (¬r ∧ ♦p))               then_1 elimination
♦((then_0 q ∧ ♦p) ∨ (¬r ∧ ♦p))               Lemma 1
♦(then_0 q ∧ ♦p) ∨ ♦(¬r ∧ ♦p)                ♦(φ ∨ ψ) ⇔ ♦φ ∨ ♦ψ
((then_0 q) ∧ ♦♦p) ∨ ♦(¬r ∧ ♦p)              Lemma 1
(q ∧ ♦♦p) ∨ ♦(¬r ∧ ♦p)                       then_0 elimination
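The first and last lines of this derivation can be compared by brute force. The sketch below is only a sanity check under stated assumptions: the Cr clauses for ♦ and then_k are taken in the form used in the proof of Lemma 1, the input sentence is bracketed as ♦♦(p ∧ then_0(q ∨ ¬then_1 r)) (the bracketing shown at the step justified by 18), and the tuple encoding and helper names are illustrative.

```python
from itertools import product

def holds(f, R, V, rho, i):
    """Truth of a propositional Cr formula at (rho, i); clauses assumed as in Lemma 1."""
    op = f[0]
    if op == 'atom':
        return rho[i] in V[f[1]]
    if op == 'not':
        return not holds(f[1], R, V, rho, i)
    if op == 'and':
        return holds(f[1], R, V, rho, i) and holds(f[2], R, V, rho, i)
    if op == 'or':
        return holds(f[1], R, V, rho, i) or holds(f[2], R, V, rho, i)
    if op == 'dia':
        return any(holds(f[1], R, V, rho[:i + 1] + (v,) + rho[i + 2:], i + 1)
                   for v in R[rho[i]])
    if op == 'then':
        return holds(f[2], R, V, rho, f[1])
    raise ValueError(op)

p, q, r = ('atom', 'p'), ('atom', 'q'), ('atom', 'r')
neg = lambda f: ('not', f)
conj = lambda f, g: ('and', f, g)
disj = lambda f, g: ('or', f, g)
dia = lambda f: ('dia', f)
then = lambda k, f: ('then', k, f)

# Input sentence (bracketing is an assumption of this sketch).
source = dia(dia(conj(p, then(0, disj(q, neg(then(1, r)))))))
# Last line of the derivation: (q ∧ ♦♦p) ∨ ♦(¬r ∧ ♦p).
target = disj(conj(q, dia(dia(p))), dia(conj(neg(r), dia(p))))

def has_then(f):
    """True if f contains some then_k operator."""
    return f[0] == 'then' or any(has_then(g) for g in f[1:] if isinstance(g, tuple))

def agree_on_two_point_models(f, g, points=(0, 1)):
    """Compare f and g at index 0 on every model with two points and atoms p, q, r."""
    pairs = list(product(points, repeat=2))
    subsets = [set(), {0}, {1}, {0, 1}]
    for bits in product([False, True], repeat=len(pairs)):
        R = {w: {v for (u, v), b in zip(pairs, bits) if b and u == w} for w in points}
        for Vp, Vq, Vr in product(subsets, repeat=3):
            V = {'p': Vp, 'q': Vq, 'r': Vr}
            for w in points:
                rho = (w, w, w)  # positions 1 and 2 are overwritten by the diamonds
                if holds(f, R, V, rho, 0) != holds(g, R, V, rho, 0):
                    return False
    return True
```

Here `target` is a plain ML formula (`has_then(target)` is false), and `agree_on_two_point_models(source, target)` confirms that the two formulas agree at every starting point of every two-point model.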
The complexity of the translation is high because of the normalization step
which in the worst case leads to exponential blow-up of the formula length. If,
on the other hand, we apply a global translation to be defined below instead
of the local translation above, there would be guaranteed exponential formula
growth, but only in the number of then_k φ subformulas:
Proof of Thm. 1 (global translation) Consider Cr sentence ...♦(ψ(then_k φ))...,
where (1) φ is a formula of ML; (2) ψ(then_k φ) is a formula of Cr (thus possibly containing more then operators); and (3) the shown ♦ is the one that
then_k φ refers back to.
When that sentence is evaluated, the ♦ introduces the point ρ(k). As φ
contains no then-operators, it is ρ(k) alone that determines whether then_k φ
will amount to ⊤ or ⊥; the rest of the sequence is irrelevant. We thus have
two cases, one where then_k φ = ⊤ and another where then_k φ = ⊥ relative to
ρ(k). In the first case, ψ(then_k φ) amounts to ψ(⊤), and in the second case,
it amounts to ψ(⊥). The original sentence is thus equivalent to the following:
...♦((φ ∧ ψ(⊤)) ∨ (¬φ ∧ ψ(⊥)))....
As in the local translation case, for an arbitrary Cr sentence ξ, we can
repeat this procedure, each time selecting then_k φ with φ not containing other
then operators, and eventually we obtain ξ′ ∈ ML. ⊣
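The case split in this proof can be sketched as a small program. This is an illustration under assumptions rather than a definitive implementation: formulas are nested tuples, the Cr clauses for ♦ and then_k are taken in the form used in the proof of Lemma 1, the input sentence is bracketed as ♦♦(p ∧ then_0(q ∨ ¬then_1 r)), and every name below is hypothetical. A step budget guards the loop, since the sketch does not itself establish termination.

```python
from itertools import product

TOP, BOT = ('top',), ('bot',)

def kids(f):
    """(position, subformula) pairs of f; operator names and indices are skipped."""
    return [(n, g) for n, g in enumerate(f) if isinstance(g, tuple)]

def has_then(f):
    return f[0] == 'then' or any(has_then(g) for _, g in kids(f))

def get_at(f, path):
    for n in path:
        f = f[n]
    return f

def put_at(f, path, new):
    if not path:
        return new
    n = path[0]
    return f[:n] + (put_at(f[n], path[1:], new),) + f[n + 1:]

def find_target(f, i, path, scope):
    """Some then_k(phi) with then-free phi, as (k, phi, path to the scope of its binder)."""
    if f[0] == 'then':
        k, g = f[1], f[2]
        return find_target(g, k, path + (2,), scope) if has_then(g) else (k, g, scope[k])
    for n, g in kids(f):
        if f[0] == 'dia':  # this diamond introduces rho(i+1), and g is its scope
            hit = find_target(g, i + 1, path + (n,), {**scope, i + 1: path + (n,)})
        else:
            hit = find_target(g, i, path + (n,), scope)
        if hit is not None:
            return hit
    return None

def subst(f, i, k, phi, val):
    """Replace then_k(phi) by val wherever it still refers to the binder's rho(k)."""
    if f[0] == 'then':
        if f[1] == k and f[2] == phi:
            return val
        return ('then', f[1], subst(f[2], f[1], k, phi, val))
    if f[0] == 'dia':
        if i + 1 == k:  # rho(k) is rebound below this diamond: leave untouched
            return f
        return ('dia', subst(f[1], i + 1, k, phi, val))
    return f[:1] + tuple(subst(g, i, k, phi, val) if isinstance(g, tuple) else g
                         for g in f[1:])

def global_translation(xi, max_steps=1000):
    """Repeat the split (phi and psi[TOP]) or (not phi and psi[BOT]) at the binder."""
    for _ in range(max_steps):
        hit = find_target(xi, 0, (), {0: ()})
        if hit is None:
            return xi
        k, phi, sp = hit
        psi = get_at(xi, sp)
        xi = put_at(xi, sp, ('or', ('and', phi, subst(psi, k, k, phi, TOP)),
                                   ('and', ('not', phi), subst(psi, k, k, phi, BOT))))
    raise RuntimeError('step budget exhausted')

def holds(f, R, V, rho, i):
    """Cr truth at (rho, i); clauses for the diamond and then_k are assumed."""
    op, a = f[0], f[1:]
    if op in ('top', 'bot'):
        return op == 'top'
    if op == 'atom':
        return rho[i] in V[a[0]]
    if op == 'not':
        return not holds(a[0], R, V, rho, i)
    if op == 'and':
        return all(holds(g, R, V, rho, i) for g in a)
    if op == 'or':
        return any(holds(g, R, V, rho, i) for g in a)
    if op == 'dia':
        return any(holds(a[0], R, V, rho[:i + 1] + (v,) + rho[i + 2:], i + 1)
                   for v in R[rho[i]])
    return holds(a[1], R, V, rho, a[0])  # op == 'then'

def agree(f, g, points=(0, 1)):
    """Compare f and g at index 0 on every model with two points and atoms p, q, r."""
    pairs, subsets = list(product(points, repeat=2)), [set(), {0}, {1}, {0, 1}]
    for bits in product([False, True], repeat=len(pairs)):
        R = {w: {v for (u, v), b in zip(pairs, bits) if b and u == w} for w in points}
        for Vp, Vq, Vr in product(subsets, repeat=3):
            V = {'p': Vp, 'q': Vq, 'r': Vr}
            if any(holds(f, R, V, (w, w, w), 0) != holds(g, R, V, (w, w, w), 0)
                   for w in points):
                return False
    return True

p, q, r = ('atom', 'p'), ('atom', 'q'), ('atom', 'r')
source = ('dia', ('dia', ('and', p, ('then', 0, ('or', q, ('not', ('then', 1, r)))))))
result = global_translation(source)
```

The sketch replaces, in one step, all copies of the selected then_k φ that refer to the same binder, which matches the count of operations "per distinct then_k φ subformula" used in the complexity remark. The output is not simplified, so `result` is larger than the hand-simplified formula one would write down, but it contains no then operators and agrees with `source` on all two-point models.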
Here is an example of the global translation applied to the same Cr sentence as above (with some simplification steps added for readability):
♦♦p ∧ then_0(q ∨ ¬then_1 r)
    then_1 elimination:
♦[(r ∧ ♦p ∧ then_0(q ∨ ¬⊤)) ∨ (¬r ∧ ♦p ∧ then_0(q ∨ ¬⊥))]
    simplifying using propositional validities:
♦[(r ∧ ♦p ∧ then_0 q) ∨ (¬r ∧ ♦p)]
    then_0 elimination:
(q ∧ ♦[(r ∧ ♦p ∧ ⊤) ∨ (¬r ∧ ♦p)]) ∨ (¬q ∧ ♦[(r ∧ ♦p ∧ ⊥) ∨ (¬r ∧ ♦p)])
    simplifying using propositional validities:
(q ∧ ♦♦p) ∨ (¬q ∧ ♦[¬r ∧ ♦p])
Each elimination of then_k φ leads to a roughly 2-fold increase of the substituted ψ(then_k φ), and we need as many such operations as there are distinct
then_k φ subformulas in ξ. The translation is thus exponential in the number
of then-operators. It would be interesting to know whether a less-than-exponential translation can
be given, but very simple translations are unlikely to exist.11
To sum up, as non-sentences of Cr contain implicit free variables over
points, it is trivial to find a Cr formula that cannot be expressed in ML. For
instance, p ∧ then_1(¬p) can distinguish a model that contains a p-point and
a non-p point not connected by the accessibility relation, while ML cannot
do that, as a simple bisimulation argument can show. But if we only consider
the fragment Cr_sent, where then_k operators only work as genuine backwards-looking
operators, then Thm. 1 shows that the extra operators do not increase the
range of meanings that the language can express.
In the next section we will see that once we move from Cr to Cr^FO, that
will change: adding backwards-looking operators to quantified (i.e. first-order)
modal logic leads to a genuine increase in expressivity even within the sentence
fragment.
5 Bisimulation for quantified modal logic
Since [Kamp, 1971], it is known that backwards-looking operators are not eliminable in quantified modal logic: Kamp presents a sentence of ML^FO + now
that has no equivalent now-less sentence. Our task in this section is not to just
prove that Cr^FO is more expressive than ML^FO (Kamp’s proof is sufficient
for that), but rather to pin down the exact amount of new expressivity which
then_k operators bring in when added to quantified modal logic.
We will use a standard tool that modern modal logic uses for studying
the expressivity of modal languages: bisimulations. A bisimulation corresponding
to a particular modal language L is a relation between the domains of two
L-models such that if two points are bisimilar, then they are indistinguishable by any formula of L. Bisimulation may be informally thought of as a relaxed version of isomorphism. Two isomorphic models cannot be distinguished
no matter what. Two bisimilar models, under a fixed notion of bisimulation,
cannot be distinguished by a particular logical language, though in a more
expressive language we may be able to tell them apart.
Thus with a suitable notion of bisimulation in hand, it becomes easy to
prove expressivity results. For instance, after we show that all bisimilar points,
under a fixed notion of bisimulation, are indistinguishable by language A, it
suffices to show that language B can distinguish some of such points to prove
that B is more expressive.
11 See [ten Cate, 2005, Prop. 3.3.3], who shows that there is no polynomial normalization
for hybrid @-operators, close cousins of our then-operators (cf. Sect. 6 on the relation
between the two kinds).
For a textbook-level review of bisimulations for propositional modal logic,
see [Blackburn et al., 2001, Ch. 2]. Defining appropriate notions of bisimulation for richer propositional modal languages has become a routine step
in modal-logical model-theoretic investigations (cf., e.g., [Areces et al., 2001],
[ten Cate, 2005], [Areces et al., 2011]). But what we need to pin down the
difference between ML^FO and Cr^FO is a notion of bisimulation for a first-order modal language, and to my knowledge, such a notion so far has not
been introduced in the literature. It will thus be worth spending some time
on how exactly we can arrive at the right notion. Consider standard propositional bisimulation first (the reader interested in the new results may skip
directly to Def. 7 and 8, and Thm. 2):
Definition 5 (Bisimulation for ML)
A bisimulation E between two Kripke models M and N is a non-empty
relation in W^M × W^N with the following properties:
Propositional Harmony: If wEw′, then for any propositional symbol p,
M, w |= p iff N, w′ |= p
Zig: If wEw′ and wR^M v, then ∃v′(w′R^N v′ ∧ vEv′)
Zag: If wEw′ and w′R^N v′, then ∃v(wR^M v ∧ vEv′)
Points w ∈ M and v ∈ N are called bisimilar if there is a bisimulation
E such that wEv. Models M and N are called bisimilar if there exists a
bisimulation between them.
It is not hard to see why any two bisimilar points must be indistinguishable
in ML. Suppose that we need to find out whether we are at w ∈ M or at
v ∈ N, with w bisimilar to v, and that our only way of getting information is
by testing ML formulas for truth at our current point. If we check the truth
of propositional formulas, by Propositional Harmony and easy induction the
results will be the same at w and v, so that doesn’t help. Now suppose we are
actually at w, and we check if ♦φ is true. If it is, there is some accessible w′
where φ is true. But then by Zig, in N there is also an accessible v′ bisimilar
to w′ where φ is true. By induction on φ, we will never find out whether we
are at w or at v. Thus ML is invariant under bisimulation. (Again, consider
[Blackburn et al., 2001, Ch. 2] for formal proofs.)
Bisimulation is much more relaxed than isomorphism. E.g., the following
models are bisimilar, though clearly not isomorphic:
Example 1 Bisimilar, but not isomorphic models
[Figure: model M has a single point w that sees itself; model N has two points v1 and v2, with arrows between them in both directions.]
ML cannot distinguish M and N of Ex. 1, but first-order logic can: the
formula ∃u_2(u_1 R u_2 ∧ u_1 ≠ u_2) is false at w and is true at v1 and v2. So while
ML is invariant over bisimulations, its corresponding FO language is not. The
corresponding language is thus more expressive.
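For finite models, the largest relation satisfying Propositional Harmony, Zig and Zag can be computed by starting from all harmonious pairs and discarding violations until nothing changes. The following is a minimal sketch of that standard refinement procedure, applied to Ex. 1 under the assumption that w is reflexive in M (Zag forces w to have some successor, and w is the only point of M). The dictionary encoding and function names are illustrative.

```python
def largest_bisimulation(RM, VM, RN, VN, props):
    """Largest relation between two finite Kripke models satisfying Harmony, Zig and Zag.

    RM, RN map each point to its set of successors; VM, VN map each
    propositional symbol to the set of points where it is true.
    """
    E = {(w, v) for w in RM for v in RN
         if all((w in VM[p]) == (v in VN[p]) for p in props)}
    changed = True
    while changed:
        changed = False
        for (w, v) in list(E):
            # Zig: every successor of w is matched by some successor of v
            zig = all(any((w2, v2) in E for v2 in RN[v]) for w2 in RM[w])
            # Zag: every successor of v is matched by some successor of w
            zag = all(any((w2, v2) in E for w2 in RM[w]) for v2 in RN[v])
            if not (zig and zag):
                E.discard((w, v))
                changed = True
    return E

def has_distinct_successor(R, w):
    """The first-order property 'there is u2 with u1 R u2 and u1 != u2', at u1 = w."""
    return any(v != w for v in R[w])

# Ex. 1, with w assumed reflexive; there are no propositional symbols.
RM = {'w': {'w'}}
RN = {'v1': {'v2'}, 'v2': {'v1'}}
E = largest_bisimulation(RM, {}, RN, {}, props=[])
```

With these inputs `E` relates w to both v1 and v2, so the models are bisimilar, while `has_distinct_successor` is false at w and true at v1 and v2. If the loop at w is dropped, Zag fails and the computed relation is empty.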
What should the notion of bisimulation appropriate for ML^FO look like?
It is clear that Zig and Zag from the propositional case should be preserved.
It is also clear that instead of requiring Propositional Harmony, we need to
at the very least require “FOL harmony”: any bisimilar points should have
the same non-modal theories (that is, they should make true exactly the same
sets of formulas without modal operators). This leads us to the notion of
FOL bisimulation. As we will see shortly, this notion is not yet quite adequate,
but nevertheless it is useful as a first approximation:
Definition 6 (FOL bisimulation)
A FOL bisimulation E between two first-order Kripke models M and N is
a non-empty relation in W^M × W^N with the following properties:
FOL Harmony: If wEw′, then for any φ ∈ FOL,
M, w |= φ iff N, w′ |= φ
Zig: If wEw′ and wR^M v, then ∃v′(w′R^N v′ ∧ vEv′)
Zag: If wEw′ and w′R^N v′, then ∃v(wR^M v ∧ vEv′)
Note that φ may contain free variables. Thus for any tuple ā of individuals
at point w, at any bisimilar w′ there should be a corresponding tuple b̄ making
precisely the same non-modal formulas true. However, FOL bisimulation does
not ensure that such corresponding tuples would make the same sets of modal
formulas true.
Example 2 Mismatch of individuals
[Figure: two models. In M, w sees v; at w, q: a and ¬q: b; at v, q: a and ¬q: b. In N, w′ sees v′; at w′, q: c and ¬q: d; at v′, q: d and ¬q: c.]
Consider relation E = {⟨w, w′⟩, ⟨v, v′⟩} between M and N from Ex. 2. At
all four points in the two models, the non-modal formulas ∃xq(x) and ∃x¬q(x)
are true, and it is easy to see that FOL harmony is satisfied. Furthermore, Zig
and Zag are also satisfied. Relation E is thus a FOL bisimulation. But the
ML^FO formula ∃x(q(x) ∧ ♦q(x)) is true at w in M, but false at w′ in N.
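The claims about Ex. 2 are easy to confirm by direct evaluation. The sketch below assumes the reading of the diagram on which a is q at both w and v, while c is q only at w′ and d is q only at v′; it also assumes that all individuals of a model exist at each of its points. The evaluator covers only the ML^FO connectives needed here, and its encoding is an illustrative choice.

```python
def holds(f, model, w, h):
    """Truth of an ML^FO formula at point w under assignment h (constant domain D)."""
    R, D, V = model
    op = f[0]
    if op == 'q':
        return h[f[1]] in V[w]
    if op == 'not':
        return not holds(f[1], model, w, h)
    if op == 'and':
        return holds(f[1], model, w, h) and holds(f[2], model, w, h)
    if op == 'dia':
        return any(holds(f[1], model, v, h) for v in R[w])
    if op == 'ex':
        return any(holds(f[2], model, w, {**h, f[1]: d}) for d in D)
    raise ValueError(op)

# Each model is (successor map, individual domain, extension of q at each point).
M = ({'w': {'v'}, 'v': set()}, {'a', 'b'}, {'w': {'a'}, 'v': {'a'}})
N = ({"w'": {"v'"}, "v'": set()}, {'c', 'd'}, {"w'": {'c'}, "v'": {'d'}})

qx = ('q', 'x')
some_q = ('ex', 'x', qx)                          # ∃x q(x)
some_not_q = ('ex', 'x', ('not', qx))             # ∃x ¬q(x)
tracking = ('ex', 'x', ('and', qx, ('dia', qx)))  # ∃x (q(x) ∧ ♦q(x))
```

The two non-modal sentences come out true at all four points, whereas `tracking` is true at w and false at w′: no single individual of N is q both at w′ and at its successor.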
FOL bisimulation ensures that the internal FOL-theories of bisimilar points
are the same, but it does not require that “harmony between individuals” holds
across points. That is why we could easily distinguish between M and N from
Ex. 2: we used the fact that for a at w, there is no corresponding a′ at w′
which would satisfy exactly the same modal formulas in one free individual
variable.
To define a proper notion of bisimulation for ML^FO, we need to make
sure that for each tuple of individuals at a point, at a bisimilar point there
is a corresponding tuple which makes exactly the same ML^FO formulas true.
It suffices that the correspondent make exactly the same non-modal formulas
true at each modal path:
Definition 7 (Modal paths)
A modal path is a finite string of diamonds from the language. For w_1,
w_2 points in model M, a non-empty path π = ♦_{i_1}...♦_{i_n} leads from w_1 to w_2
(in symbols, w_1 π w_2) iff there exist points v_{i_1}, ..., v_{i_{n−1}} s.t. w_1 R_{i_1} v_{i_1} ∧ ... ∧
v_{i_{n−1}} R_{i_n} w_2. For the empty modal path Λ, by definition, ∀w : wΛw.
Definition 8 (FOL path bisimulation)
A FOL path bisimulation E between two first-order Kripke models M and
N is a non-empty relation in W^M × W^N with the following properties:
FOL path harmony: (i) If wEw′, then for any finite tuple ā in D^M,
there is b̄ in D^N such that for any modal path π, if ∃v ∈ W^M to which path π
leads from w, then ∃v′ ∈ W^N such that w′πv′ and for any formula φ ∈ FOL,
M, v |= φ[ā] iff N, v′ |= φ[b̄]. Similarly for any b̄ at w′ in N. We write ā ↭ b̄ for
such correspondent tuples. (ii) When ā ↭ b̄ at w and w′, it must be possible
to extend those tuples to corresponding (ā, a_1) ↭ (b̄, b_1).
Zig: If wEw′ and wR^M v, then ∃v′(w′R^N v′ ∧ vEv′).
Zag: If wEw′ and w′R^N v′, then ∃v(wR^M v ∧ vEv′).
Returning to Ex. 2, we can see that w and w′ are FOL-bisimilar, but not
FOL-path-bisimilar. There is no individual at w′ that could be a FOL-path-harmony correspondent of a from w: c is no good because there is no point
accessible by the path ♦ where c satisfies q(x), while d does not satisfy q(x)
at the empty path Λ.
Theorem 2 If E is a FOL path bisimulation between M and N, and wEw′
for w ∈ M, w′ ∈ N, then for any φ of ML^FO, there is a tuple ā such that
M, w |= φ[ā] iff there exists a tuple b̄ such that N, w′ |= φ[b̄].
Proof Suppose towards contradiction that there exists φ such that there is ā
for which M, w |= φ[ā], but for all b̄, N, w′ ⊭ φ[b̄]. We fix some FOL-path-correspondent b̄ of ā, and thus have a pair of correspondents only one of which
makes φ true. The proof goes by gradually disassembling φ so that we can
finally derive a contradiction at the level of non-modal formulas. There are
the following cases: φ = ¬φ′, φ = (φ′ ∧ φ″), φ = ∀xφ′, and φ = ♦φ′.
If φ = ¬φ′, we have ā for which M, w ⊭ φ′[ā], but for its correspondent
b̄ that we fixed, N, w′ |= φ′[b̄]. Exchanging the roles for ā and b̄, we can now
consider φ′.
For φ = φ′ ∧ φ″, we have M, w |= φ′ ∧ φ″[ā], but N, w′ ⊭ φ′ ∧ φ″[b̄] where
ā ↭ b̄. That means that either N, w′ ⊭ φ′[b̄′] for a restriction b̄′ of b̄ to the
individuals substituted into φ′, or similarly N, w′ ⊭ φ″[b̄″].
- We can show that the restrictions ā′ and b̄′ must be correspondents just
as ā and b̄. Suppose that is not so, and ā′ and b̄′ do not correspond. Then by definition of FOL
path harmony, there are some π and (non-modal) ψ such that M, w |= πψ[ā′], but
N, w′ ⊭ πψ[b̄′]. Without loss of generality, let ā′ be the initial segment of ā,
and let there be n elements in the non-ā′ part of ā. Then we can build formula
ξ := ψ ∧ (p(x_1) ∨ ¬p(x_1)) ∧ ... ∧ (p(x_n) ∨ ¬p(x_n)). As we only added tautologies
to ψ, we have that M, w |= πξ[ā], but N, w′ ⊭ πξ[b̄]. But that is contrary to
the assumption that ā ↭ b̄. Thus all restrictions of corresponding tuples are also
FOL-path-correspondents.
- Returning to φ′ ∧ φ″, we note that either there are correspondent restrictions ā′ and b̄′ of ā and b̄ which disagree on φ′, or similarly for φ″. We
then consider φ′ and φ″.
If φ = ∀xφ′, we have M, w |= ∀xφ′[ā], but N, w′ ⊭ ∀xφ′[b̄]. We pick some
extension (b̄, b_1) such that N, w′ ⊭ φ′[(b̄, b_1)]. By clause (ii) of FOL path
harmony, we should be able to extend ā to some (ā, a_1) that is correspondent
to (b̄, b_1). As we have M, w |= φ′[(ā, a_1)] for any a_1, we now consider φ′, (ā, a_1),
and (b̄, b_1).
When φ = ♦φ′, we move to φ′ and R-accessible v ∈ M and v′ ∈ N thanks
to the Zig and Zag conditions.
And finally, when φ is non-modal, and we have M, w |= φ[ā], but N, w′ ⊭
φ[b̄], that directly contradicts FOL path harmony given that ā ↭ b̄. ⊣
It follows from Thm. 2 that when two points are FOL-path-bisimilar, then
they are indistinguishable in ML^FO.
Thm. 2’s converse does not hold in the general form: as is well-known, the
converse fails for propositional ML, and that result carries over to ML^FO.
Thus there can be ML^FO-models that are indistinguishable in the language,
but nevertheless not FOL-path-bisimilar.12
Note that whether two models are bisimilar depends on the particular
language used. E.g., M and N from Ex. 3 are FOL-path-bisimilar if identity
is not in the language, and are not FOL-path-bisimilar if identity is included.
12 However, we can provide an analogue of the Hennessy-Milner theorem that states that
the converse holds for a particular class of models. For ML, that is the class of image-finite
models: those where every point has only a finite number of R-successors for each R. The
propositional proof shows that the relation of modal equivalence is itself a bisimulation in
this case. The condition of image-finiteness allows the following argument to come through:
suppose that w and w′ are modally equivalent, but for some v : wRv, there is no modally equivalent v′ : w′Rv′.
Then for each u′_i : w′Ru′_i, there is φ_i s.t. v |= φ_i, but u′_i ⊭ φ_i. As the set of all u′_i is finite,
we can build formula ♦⋀_i φ_i, which is true at w thanks to the existence of v, but is false at
w′. This is contrary to assumption.
In the case of ML^FO, we need not only the assumption of finiteness for successor points, but
also for domains of individuals. The argument for individuals would be along the following
lines. Suppose that there are indistinguishable w and w′ where ā at w has no correspondent
b̄ at w′. Then we collect all pairs of π and φ that witness that a particular b̄ does not
correspond to ā, and as there is only a finite number of distinct b̄s, we can collect them into
one large formula ∃x̄ ⋀_i π_i φ_i(x̄). At w, tuple ā ensures that this formula is true, but at w′
by construction there is no b̄ that would witness that. But then w and w′ have different
ML^FO theories, contrary to assumption. When the number of distinct tuples is not finite,
we cannot gather all π and φ into a single formula, hence the converse to Thm. 2 would not
hold in such a case.
Example 3 FOL-path-bisimilar models distinguishable by Cr^FO
[Figure: two models. In M, w sees u and v; at w, q: a, b, c and ¬q: d; at u, q: a, b, c and ¬q: d; at v, q: a, b and ¬q: c, d. In N, w′ sees u′ and v′; at w′, q: a′, b′, c′ and ¬q: d′; at u′, q: a′, c′ and ¬q: b′, d′; at v′, q: a′, b′ and ¬q: c′, d′.]
When there is no identity in the language, we have individuals of just
three kinds at both w and w′, and it’s easy to check that M, w and N, w′ are
FOL-path-bisimilar:
a, b; a′:        q(x) ∧ ♦q(x) ∧ ¬♦¬q(x)
c; b′, c′:       q(x) ∧ ♦q(x) ∧ ♦¬q(x)
d; d′:           ¬q(x) ∧ ♦¬q(x) ∧ ¬♦q(x)
But if identity were in the language, then, e.g., a and a′ would not have
been FOL-path-harmony correspondents: formula ∃y∃z(x ≠ y ∧ x ≠ z ∧ y ≠
z ∧ ♦(q(x) ∧ q(y) ∧ q(z))) is made true by a, but not by a′.
Assuming a language without identity, FOL-path-bisimilar w and w′ from
Ex. 3 cannot be distinguished by ML^FO by Thm. 2. From the fact that Cr^FO
sentence 22 is true at w and false at w′, we immediately derive the expressivity
result in Prop. 2.
♦∀x(now(q(x)) → q(x))   (22)
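The truth values of 22 at w and w′ can be confirmed by direct evaluation. The sketch below treats now as then_0, takes the Cr clauses for ♦ and then_k in the form used in the proof of Lemma 1, reads the extensions of q off Ex. 3, and assumes that every individual of a model exists at each of its points. The encoding and names are illustrative.

```python
def holds(f, model, rho, i, h):
    """Truth of a Cr^FO formula at the pointed sequence (rho, i) under assignment h."""
    R, D, V = model
    op = f[0]
    if op == 'q':
        return h[f[1]] in V[rho[i]]
    if op == 'imp':
        return (not holds(f[1], model, rho, i, h)) or holds(f[2], model, rho, i, h)
    if op == 'all':
        # untensed quantifier: ranges over the whole domain D
        return all(holds(f[2], model, rho, i, {**h, f[1]: d}) for d in D)
    if op == 'dia':
        return any(holds(f[1], model, rho[:i + 1] + (v,) + rho[i + 2:], i + 1, h)
                   for v in R[rho[i]])
    if op == 'then':
        return holds(f[2], model, rho, f[1], h)
    raise ValueError(op)

# Each model is (successor map, individual domain, extension of q at each point).
M = ({'w': {'u', 'v'}, 'u': set(), 'v': set()},
     {'a', 'b', 'c', 'd'},
     {'w': {'a', 'b', 'c'}, 'u': {'a', 'b', 'c'}, 'v': {'a', 'b'}})
N = ({"w'": {"u'", "v'"}, "u'": set(), "v'": set()},
     {"a'", "b'", "c'", "d'"},
     {"w'": {"a'", "b'", "c'"}, "u'": {"a'", "c'"}, "v'": {"a'", "b'"}})

qx = ('q', 'x')
now = lambda f: ('then', 0, f)
sentence22 = ('dia', ('all', 'x', ('imp', now(qx), qx)))  # ♦∀x(now(q(x)) → q(x))
```

Evaluating `sentence22` at the one-element sequences `('w',)` and `("w'",)` gives true and false respectively: u preserves q for a, b and c together, whereas u′ loses b′ and v′ loses c′.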
Proposition 2 ML^FO ⊊ Cr^FO_sent.
Proof For ML^FO ⊂ Cr^FO_sent, every ML^FO formula is in Cr^FO_sent.
As w and w′ in Ex. 3 are FOL-path-bisimilar, and 22 is true at w, but not
at w′, by Thm. 2 we have ML^FO ≠ Cr^FO_sent. ⊣
Thus propositional Cr_sent is as expressive as ML (Thm. 1), but quantified
Cr^FO_sent is strictly more expressive than ML^FO (Prop. 2). We will now connect
those two results, showing why the addition of backwards-looking operators
leads to greater expressivity over points only through greater expressivity over
individuals.
We say that tuple ā at w in M has property φ ∈ Cr^FO iff for some assignment h, M, h, w |= φ[ā]. Formula φ defines the set of tuples of individuals
which have property φ. Similarly we can talk about properties of points, tuples
of points, and of ordered pairs of a tuple of points and a tuple of individuals.
As all formulas of ML^FO are also formulas of Cr^FO, all properties expressible in ML^FO are trivially expressible in Cr^FO. But in addition to those,
Cr^FO may also express properties of tuples of individuals relative to more
than one point. For instance, the subformula now(q(x)) → q(x) of 22 defines
a two-point property of individuals that are either ¬q(x) at the now-point,
or q(x) at the current point. Using that property, we can distinguish w and
w′ from Ex. 3. If we start with w as the now-point, it is possible to choose a
♦-accessible point so that a set of individuals defined by now(q(x)) → q(x) is
equal to the whole domain of individuals. But if we start with w′ in N as the
now-point, there is no choice of a ♦-successor which would allow that. That
is precisely why Cr^FO may distinguish between w and w′ with the formula
22.13
In fact, we can show that only such then_k φ may increase expressivity
where φ contains free individual variables. That is demonstrated by Prop. 3,
an easy generalization of Thm. 1.
Proposition 3 For ξ ∈ Cr^FO_sent, if every then_k(φ) in ξ has no free individual
variables, then ξ has an equivalent ML^FO formula.
Proof We adapt the global translation eliminating then-operators used in the
second proof of Thm. 1.
Recall that in the global then-eliminating translation, we exploited the
fact that when the ♦ that then_k φ (with φ ∈ ML) refers back to introduces
point ρ(k), the truth of then_k φ depends only on that ρ(k). Given that fact, we
were able to translate ...♦(ψ(then_k φ))... into ...♦[(φ ∧ ψ(⊤)) ∨ (¬φ ∧ ψ(⊥))]....
To adapt that translation to our case here, we only note that when φ ∈
ML^FO is a closed formula, the truth of then_k φ only depends on ρ(k), just as
in the propositional case. We can therefore apply the same procedure. ⊣
It is thus no coincidence that sentence 22, which we used to prove that
Cr^FO_sent is more expressive than ML^FO, included subformula now(q(x)) with
a free individual variable. By Prop. 3, no sentence of Cr^FO without such a
subformula can express a meaning not expressible by ML^FO.
Though Cr^FO_sent is more expressive than ML^FO, it is easy to see that for
many pairs of models adding then-operators to the language does not allow us
to distinguish models indistinguishable in ML^FO. For instance, in Ex. 4 K is
not distinguishable from L in either ML^FO or Cr^FO_sent without identity, and K
and M are not distinguishable by either even if identity is in the language. (Of
course, if we consider non-sentences of Cr, there would be formulas satisfiable
in M, but not in K.)
Example 4
[Figure: three models. K has a single point w, with q: a. L has a single point u, with q: a1, a2. M has two points, v with q: a3 and v′ with q: a4.]
13 This informal characterization of the difference which introducing then-operators makes
is similar to the one given by [Meyer, 2009].
I finish this section with a result underscoring the fact that the extra expressive power of Cr^FO is a relatively mild addition to ML^FO: it turns out
that for languages with identity, then can make a difference only in models
with infinite domains.
Proposition 4 If the basic language has identity, and ML^FO model M has a
finite individual domain, then for any ξ ∈ Cr^FO_sent there exists ξ′ ∈ ML^FO such
that for any ρ and h, M, h, ⟨ρ, 0⟩ |=_{Cr^FO} ξ iff M, h, ρ(0) |=_{ML^FO} ξ′. (Note that
ξ′ is relative to a specific M.)
Proof The proof is a modification of the global translation of Thm. 1 and
Prop. 3.
Suppose we fixed then_k φ(x) to be eliminated, where φ does not contain
other then operators, and there is one free individual variable x. (We consider
cases with more variables below.) When we consider ...♦ψ(then_k φ(x))... where
♦ is the one which then_k refers back to, there are two possibilities: either x
within φ remains free in ψ(then_k φ(x)), or it is bound within ψ.
If x remains free, h(x) is not altered as we go down the formula from ψ to
φ, and therefore we simply apply the global translation step as in Prop. 3.
The interesting case is when x gets bound within ψ, namely when we
are dealing with ψ = (...∀x(... then_k φ(x))...). At the ⟨ρ, k⟩ at which ψ gets
evaluated, the formula φ(x) will be true for some x and false for others. When
we use then_k, we can “refer back” to those truth values for φ(x) at ⟨ρ, k⟩. To
get rid of then_k, we “record” the individuals that make φ true at ⟨ρ, k⟩ at the
top level of ψ, and “export” variables referring to them for further use down
in ψ. Then for the quantifier ∀x, instead of quantifying directly into φ(x), we
provide two cases: one where the relevant individual is one of those that made
φ true at ⟨ρ, k⟩, and the other where it didn’t. Here is how we do this:
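Let the number of individuals in model M be n. We define an abbreviation
$\exists_{\varphi,m,x}$ as standing for the following:
\[
\exists x_0 \ldots \exists x_{m-1}\, \big( (x_0 \neq x_1 \wedge \ldots \wedge x_{m-2} \neq x_{m-1}) \wedge (\varphi(x_0) \wedge \ldots \wedge \varphi(x_{m-1})) \wedge \neg\exists x_m (x_m \neq x_0 \wedge \ldots \wedge x_m \neq x_{m-1} \wedge \varphi(x_m)) \big)
\]
Here the first conjunct ranges over all pairs of variables, so that $x_0, \ldots, x_{m-1}$ are pairwise distinct.
In words, $\exists_{\varphi,m,x}$ records that there are exactly m distinct individuals that
make φ true at the current index, and exports those individuals for future use
in variables $x_0, \ldots, x_{m-1}$.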
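Now we translate our $\ldots\Diamond(\ldots\forall x : \ldots\mathrm{then}_k\,\varphi(x)\ldots)\ldots$ as follows:
\[
\ldots\Diamond\Big( \bigvee_{0 \le i \le n} \exists_{\varphi,i,x} \Big[ \ldots\forall x : \Big( \big(\bigvee_{0 \le j < i} x = x_j\big) \to \ldots\top\ldots \Big) \wedge \Big( \big(\bigwedge_{0 \le j < i} x \neq x_j\big) \to \ldots\bot\ldots \Big) \Big] \Big)\ldots
\]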
The quantifier ∀x in our formula still checks the truth of its scope for each
x. If x is equal to one of the exported $x_j$, which made φ true at $\langle\rho, k\rangle$, we substitute ⊤
for $\mathrm{then}_k\,\varphi(x)$. If x does not belong to that group, we substitute ⊥. In each
case, we get exactly what we would have got if we evaluated $\mathrm{then}_k\,\varphi(x)$ in its
original place, but without $\mathrm{then}_k$. However, we can only do that if we have
a finite, and known, number of individuals in the model: otherwise the huge
disjunction that we need to build would not be finite either.
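The one-variable step can be checked mechanically on small models. The following Python sketch is only an illustration under assumptions that are not part of the proof: it uses constant-domain models with a single unary predicate q, takes the sample sentence ♦∀x(now(q(x)) → q(x)), and records the q-individuals at the point of evaluation, which is the point that now looks back to. The names `Model`, `holds_with_now` and `holds_translated` are hypothetical.

```python
from itertools import permutations


class Model:
    """Finite constant-domain model with one unary predicate q (assumed toy setup)."""

    def __init__(self, access, domain, q):
        self.access = access  # point -> set of R-accessible points
        self.domain = domain  # list of individuals, shared by all points
        self.q = q            # point -> set of individuals that are q there


def holds_with_now(m, w):
    """Direct evaluation of <>Ax(now(q(x)) -> q(x)) at w; 'now' looks back to w."""
    return any(
        all((x not in m.q[w]) or (x in m.q[u]) for x in m.domain)
        for u in m.access[w]
    )


def holds_translated(m, w):
    """Evaluation of the now-free counterpart, built in the shape of the translation."""
    n = len(m.domain)
    for i in range(n + 1):                    # disjunction over 0 <= i <= n
        for xs in permutations(m.domain, i):  # Ex_0 ... Ex_{i-1}, pairwise distinct
            if any(x not in m.q[w] for x in xs):
                continue                      # some x_j is not q at w
            if any(x in m.q[w] for x in m.domain if x not in xs):
                continue                      # a further individual is q: not exactly i
            for u in m.access[w]:             # the diamond
                # Ax: recorded individuals get (True -> q(x)), i.e. q(x);
                # all others get (False -> q(x)), which is trivially true.
                if all((x not in xs) or (x in m.q[u]) for x in m.domain):
                    return True
    return False


if __name__ == "__main__":
    m = Model(
        access={"w": {"u"}, "u": set()},
        domain=["a", "b"],
        q={"w": {"a"}, "u": {"a", "b"}},
    )
    print(holds_with_now(m, "w"), holds_translated(m, "w"))
```

The loop over `range(n + 1)` is where finiteness matters: with an infinite domain there would be no bound n on the number of individuals to record, which mirrors the remark that the disjunction would not be finite.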
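For $\mathrm{then}_k\,\varphi$ with multiple free variables bound within ψ, the disjunctions get even more complex, as we need to “store” in new variables not
just single individuals, but tuples that make φ true. I illustrate for the 2-variable case. Let our ψ be $(\ldots\forall x\ldots\forall y\ldots\mathrm{then}_k\,\varphi(x, y)\ldots)$. We define an abbreviation $\exists_{\varphi,\langle m, l_0, \ldots, l_{m-1}\rangle, x, y}$. Number m records the number of x for which
there is some y with which they make φ true. Numbers $l_0, \ldots, l_{m-1}$ record
for each such x the exact number of ys that make φ true in a pair with
that x. The abbreviated operator then is defined as follows:
\[
\begin{aligned}
&\exists x_0 \exists y_{0,0} \ldots \exists y_{0,l_0-1} \;\ldots\; \exists x_{m-1} \exists y_{m-1,0} \ldots \exists y_{m-1,l_{m-1}-1} : \\
&\quad (x_0 \neq x_1 \wedge \ldots) \wedge (y_{0,0} \neq y_{0,1} \wedge \ldots \wedge y_{0,l_0-2} \neq y_{0,l_0-1}) \wedge \ldots \wedge (y_{m-1,0} \neq y_{m-1,1} \wedge \ldots \wedge y_{m-1,l_{m-1}-2} \neq y_{m-1,l_{m-1}-1}) \\
&\quad \wedge\, [\varphi(x_0, y_{0,0}) \wedge \ldots \wedge \varphi(x_0, y_{0,l_0-1}) \wedge \neg\exists y_{0,l_0} : (y_{0,l_0} \neq y_{0,0} \wedge \ldots \wedge y_{0,l_0} \neq y_{0,l_0-1} \wedge \varphi(x_0, y_{0,l_0}))] \wedge \ldots \\
&\quad \wedge\, \neg\exists x_m : ((x_m \neq x_0 \wedge \ldots) \wedge \exists z : \varphi(x_m, z))
\end{aligned}
\]
What this operator does
is record all and only pairs that make φ true as $\langle x_0, y_{0,0}\rangle, \ldots, \langle x_0, y_{0,l_0-1}\rangle$ and
so forth. We translate $\ldots\Diamond(\ldots\forall x : \ldots\forall y\ldots\mathrm{then}_k\,\varphi(x, y)\ldots)\ldots$ as follows: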
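\[
\ldots\Diamond\Big( \bigvee_{0 \le i,\, i_0, \ldots, i_{i-1} \le n} \exists_{\varphi,\langle i, i_0, \ldots, i_{i-1}\rangle, x, y} : \Big( \ldots\forall x : \bigwedge_{0 \le j < i} \Big( x = x_j \to \big( \ldots\forall y : \big( (\bigvee_{0 \le k < i_j} y = y_{j,k}) \to \ldots\top\ldots \big) \wedge \big( (\bigwedge_{0 \le k < i_j} y \neq y_{j,k}) \to \ldots\bot\ldots \big) \big) \Big) \wedge \Big( \big(\bigwedge_{0 \le j < i} x \neq x_j\big) \to \ldots\forall y : (\ldots\bot\ldots) \Big) \Big) \Big)\ldots
\]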
It should be clear how to modify the translation step for any particular
number of variables in φ bound from within ψ. The resulting formula will be
a daunting but truth-preserving substitute for the original $\psi(\ldots\mathrm{then}_k\,\varphi\ldots)$. ⊣
From Prop. 4 we derive a simple corollary that shows that then operators
only change the expressivity (of the sentence fragment of the language) for
models with infinite domains:
Corollary 1 If the language has identity, then finite FOL-path-bisimilar models M and N cannot be distinguished by any $\xi \in \mathrm{Cr}^{FO}_{sent}$.
Proof Suppose there is such ξ which is true at M, w, but false at N, w′ with
w bisimilar to w′. Let n be the cardinality of the greater of $D_M$ and $D_N$. By
Prop. 4, we can build $\xi' \in \mathrm{ML}^{FO}$ equivalent to ξ in M and N. But from
Thm. 2, there can be no such ξ′. ⊣
What about infinite models and $\mathrm{Cr}^{FO}$ with identity? Ex. 5 shows that
with an infinite number of individuals, $\mathrm{Cr}^{FO}_{sent}$ is more expressive than $\mathrm{ML}^{FO}$.
Example 5 FOL-path-bisimilar models for $\mathrm{ML}^{FO}$ with identity that
can be distinguished by $\mathrm{Cr}^{FO}_{sent}$
[Diagram: model M has points w and u with w → u; at both, q : $a_0, a_1, \ldots$ and ¬q : $b_0, b_1, \ldots$. Model N has points w′, $u'_0$ and $u'_1$ with w′ → $u'_0$ and w′ → $u'_1$; at w′, q : $c_0, \ldots, d_0, \ldots$ and ¬q : $e_0, \ldots$; at $u'_0$, q : $c_0, \ldots$ and ¬q : $d_0, \ldots, e_0, \ldots$; at $u'_1$, q : $d_0, \ldots$ and ¬q : $c_0, \ldots, e_0, \ldots$.]
At both w and u in M, there is an infinite number of a-s being q, and of b-s
being ¬q. Thus all individuals that are q at w are also q at u. At w′, there are
three infinite sets of individuals: c-s, d-s and e-s. All c-s and d-s make ♦q(x)
true, but there is no point accessible from w′ where both c-s and d-s are q
at the same time. But as the extensions of properties q and ¬q at $u'_0$ and $u'_1$
are infinite, we cannot register the difference between u and $u'_0$ or $u'_1$ using
$\mathrm{ML}^{FO}$: those points are FOL-path-bisimilar. At the same time, the familiar
$\mathrm{Cr}^{FO}$ formula 22, namely ♦∀x(now(q(x)) → q(x)), is true at w, but false at
FOL-path-bisimilar w′.
6 Cr languages and hybrid languages
The Cr languages with then operators that we introduced are close cousins
to hybrid languages that have received considerable attention in the literature
since the early 1990s. Basic hybrid language HL is the basic modal language
enriched with nominals: propositional variables of a special sort that may
only be true at a single point in any model. Nominals i, j, ... can be used as
terms, just as propositional variables can, or they can be bound by different
hybrid operators. We provide below the syntax and semantics for language
HL(@, ↓). Languages HL(@) and HL(↓) feature only one of the hybrid operators defined below. In addition, we will use ML + @ + ↓ to refer to the
language that is like HL(@, ↓) except that nominals do not occur as atoms in
its formulas. For an introduction to these and other hybrid languages, see
[Blackburn and Seligman, 1995], a.o.
Definition 9 (The syntax of HL(@, ↓))
For PROP a set of propositional variables, and NOM a set of nominal
variables, and i ∈ NOM, the wffs of HL(@, ↓) are:
φ := PROP | NOM | ⊤ | ¬φ | φ ∧ ψ | ♦φ | @i.φ | ↓i.φ
Formulas of HL(@, ↓) are evaluated in a Kripke model M at a point w relative to an assignment g of points to nominal variables. The nominal assignment
function g may be viewed as a storage device for references to points: ↓i stores
the current point as the value of variable i; @i retrieves the value recorded in
i from the storage to evaluate the argument formula at that value. An atomic
occurrence of i tests whether the current point is the one stored in i. We say
that ↓i binds the occurrences of @i and i in its scope, and that non-bound
occurrences are free. An HL(@, ↓) formula is a sentence iff its truth does not
depend on g, which happens exactly when there are no free occurrences of @i
or i.
Definition 10 (The semantics of HL(@, ↓))
As usual, $g \sim_i g'$ iff for any j ≠ i we have g(j) = g′(j).
M, g, w |= q       iff w ∈ V(q)
M, g, w |= i       iff w = g(i)
M, g, w |= ⊤       always
M, g, w |= ¬φ      iff it is not the case that M, g, w |= φ
M, g, w |= φ ∧ ψ   iff M, g, w |= φ and M, g, w |= ψ
M, g, w |= ♦φ      iff there is w′ s.t. wRw′ and M, g, w′ |= φ
M, g, w |= ↓i.φ    iff for g′ s.t. $g' \sim_i g$ and g′(i) = w, M, g′, w |= φ
M, g, w |= @i.φ    iff M, g, g(i) |= φ
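The clauses of Definition 10 translate almost line by line into a recursive evaluator. The Python sketch below is a minimal illustration for finite models, not an implementation taken from the literature; the tuple encoding of formulas and the name `holds` are assumptions made here for the example.

```python
def holds(model, g, w, phi):
    """Truth of an HL(@, down-arrow) formula phi at point w under assignment g.

    model is a pair (R, V): R maps a point to the set of its successors,
    V maps a propositional variable to the set of points where it is true.
    Formulas are nested tuples (a hypothetical encoding):
      ("prop", q), ("nom", i), ("top",), ("not", f), ("and", f, h),
      ("dia", f), ("bind", i, f) for down-arrow i.f, ("at", i, f) for @i.f
    """
    R, V = model
    tag = phi[0]
    if tag == "prop":
        return w in V.get(phi[1], set())
    if tag == "nom":
        return g[phi[1]] == w          # raises KeyError if i is free and g lacks it
    if tag == "top":
        return True
    if tag == "not":
        return not holds(model, g, w, phi[1])
    if tag == "and":
        return holds(model, g, w, phi[1]) and holds(model, g, w, phi[2])
    if tag == "dia":
        return any(holds(model, g, v, phi[1]) for v in R.get(w, set()))
    if tag == "bind":
        g2 = dict(g)
        g2[phi[1]] = w                 # g' differs from g at most on i, and g'(i) = w
        return holds(model, g2, w, phi[2])
    if tag == "at":
        return holds(model, g, g[phi[1]], phi[2])
    raise ValueError(f"unknown connective: {tag!r}")


if __name__ == "__main__":
    model = ({"w": {"u"}, "u": set()}, {"p": {"u"}})
    # down-arrow i. <>(p and not i): some successor satisfies p and differs from here
    phi = ("bind", "i", ("dia", ("and", ("prop", "p"), ("not", ("nom", "i")))))
    print(holds(model, {}, "w", phi))
```

Sentences can be evaluated with the empty assignment `{}`, since every nominal they contain is bound before it is looked up; formulas with free nominals need an assignment that covers them.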
It is easy to define polynomial truth-preserving translations between Cr
and ML + @ + ↓. From Cr to ML + @ + ↓, we only need to add $\downarrow i_k$ in the
scope of each ♦, and replace $\mathrm{then}_k$ with $@i_k$ referring back to the appropriate
↓. For example:
$\Diamond\Diamond(p \wedge \mathrm{then}_1\, q) \;\mapsto\; {\downarrow}i_0.\Diamond{\downarrow}i_1.\Diamond(p \wedge @i_1.q)$
In the other direction, we can replace every @i with a $\mathrm{then}_k$ referring back
to the ♦ in whose immediate scope the binder ↓i occurred. For example:
$\Diamond{\downarrow}i.((\Diamond p) \vee \Diamond(q \wedge @i.q)) \;\mapsto\; \Diamond((\Diamond p) \vee \Diamond(q \wedge \mathrm{then}_1\, q))$
Thus Cr and ML + @ + ↓ are essentially notational variants.
What distinguishes all full-fledged HL languages from Cr are atomic occurrences of nominals. We will now briefly discuss how the expressive power
of HL(@), HL(↓) and HL(@, ↓) relates to that of Cr.
First, Cr is clearly not more expressive than any HL language: Cr has
no formulas that have to be true at exactly one point in any model. So Cr
may be either strictly weaker than or expressively incomparable with any of the
HL languages. Moreover, as the sentence fragment of Cr is equivalent to
ML, which is the underlying language for all HL languages, $\mathrm{Cr}_{sent}$ is strictly
less expressive than any full HL language. At the same time, the sentence
fragments of HL(@) and HL(↓) also collapse to ML.
HL(↓) and Cr = ML + @ + ↓ are mutually expressively incomparable.
HL(↓) can store points; it can also test, with nominals used as atoms, whether
we are at a point that we previously stored. But HL(↓) cannot return the
evaluation to a previously stored point. As a result, it cannot express a
formula like $p \wedge \mathrm{then}_1\, q$.
As for HL(@), it is strictly more expressive than Cr. All free $\mathrm{then}_k$ operators may be replaced by corresponding $@i_k$ operators, but in addition to
those, HL(@) has atomic nominals.
HL(@, ↓) is more expressive than HL(@), and therefore than Cr. Moreover, even the sentence fragment of HL(@, ↓) is more expressive than the
sentence fragment $\mathrm{Cr}_{sent}$: the sentence ↓i.¬♦i of HL(@, ↓) defines R-irreflexivity
of the current point, and as any sentence of $\mathrm{Cr}_{sent}$ is equivalent to a formula of
ML, and no ML formula can define R-irreflexivity, the sentence fragment of
HL(@, ↓) is strictly more expressive than $\mathrm{Cr}_{sent}$. (Cf. bisimilar, and therefore
ML-indistinguishable, models from Ex. 1, only one of which has irreflexive
points.)
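The argument from bisimilarity can be replayed on a small pair of models. The sketch below does not reproduce Ex. 1; as a stand-in it assumes two hypothetical models, a single reflexive point and an irreflexive two-point cycle, computes the greatest bisimulation between them by iterated pruning, and then checks irreflexivity of a point, that is, the truth condition of the hybrid sentence ↓i.¬♦i. All function names are illustrative.

```python
def greatest_bisimulation(R1, V1, R2, V2):
    """Largest bisimulation between two finite Kripke models, by iterated pruning.

    R1, R2 map each point to the set of its successors;
    V1, V2 map each point to the frozenset of atoms true there.
    """
    # Start from all pairs that agree on atoms, then remove pairs violating back/forth.
    Z = {(x, y) for x in R1 for y in R2 if V1[x] == V2[y]}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(Z):
            forth = all(any((x2, y2) in Z for y2 in R2[y]) for x2 in R1[x])
            back = all(any((x2, y2) in Z for x2 in R1[x]) for y2 in R2[y])
            if not (forth and back):
                Z.discard((x, y))
                changed = True
    return Z


def irreflexive_here(R, w):
    """Truth condition of the sentence 'down-arrow i. not <> i' at w: w does not see w."""
    return w not in R[w]


if __name__ == "__main__":
    # Hypothetical stand-ins for a pair of bisimilar models.
    R1, V1 = {"a": {"a"}}, {"a": frozenset({"p"})}
    R2 = {"b0": {"b1"}, "b1": {"b0"}}
    V2 = {"b0": frozenset({"p"}), "b1": frozenset({"p"})}
    Z = greatest_bisimulation(R1, V1, R2, V2)
    print(("a", "b0") in Z, irreflexive_here(R1, "a"), irreflexive_here(R2, "b0"))
```

Since the two points are bisimilar, no ML formula separates them, yet the irreflexivity check gives different answers, which is the separation the hybrid sentence exploits.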
So what is the place of Cr in the family of modal/hybrid languages? As
[Areces et al., 2001] show, HL(@, ↓) is an important system, being the logic
for meanings invariant under taking generated submodels: any FOL formula
in that class is equivalent to a formula of HL(@, ↓). The FOL fragment corresponding to HL(@, ↓) is the bounded fragment, where the domain of quantification over points is restricted to R-accessible points.
Our simple analysis above shows that Cr is one of the less expressive
inhabitants of the group of modal/hybrid languages invariant over generated
submodels. Figure 1 is the “expressivity map” of that group. In the diagram,
A → B means that A is strictly less expressive than B.
[Diagram over ML, Cr = ML + @ + ↓, HL(@), HL(↓) and HL(@, ↓), with arrows leading from less expressive to more expressive languages.]
Fig. 1 Cr and its hybrid cousins.
And if we look only at the sentence fragments of the relevant languages
(i.e. without unbound then, @i and i), $\mathrm{Cr}_{sent}$, $\mathrm{HL}(@)_{sent}$ and $\mathrm{HL}({\downarrow})_{sent}$ all
collapse to the regular ML, while $\mathrm{HL}(@, {\downarrow})_{sent}$ remains more expressive.
7 Philosophical and linguistic consequences
Let me sum up the main formal results we derived above:
– When added to propositional modal logic, now and then operators that
are genuinely backwards-looking do not add any extra expressive power.14
14 When not genuinely backwards-looking, those operators essentially act as unbound hybrid @i operators.
– When added to first-order modal logic, now and then do bring in extra
expressivity.
– However, if we have identity of individuals in the logical language, the extra
expressivity only manifests itself in infinite models.
– Finally, the amount of extra expressivity is tiny: basic modal logic with
backwards-looking operators is a particularly mild case of a hybrid logic
invariant over generated submodels. Importantly, even the most expressive
such logic, namely HL(@, ↓), is vastly less expressive than full many-sorted
FOL with explicit quantification over worlds and times.
If the actual amount of expressivity added by now and then is so tiny, then
why do so many linguists and philosophers assume that such operators amount
to explicit quantification over worlds? This widespread erroneous assumption
is due to a misreading of [Cresswell, 1990], which we will now discuss.
Here is how the system of [Cresswell, 1990] is set up. Its formulas are evaluated at sequences of points. The initial member of the sequence is always
used as the current point, and operators $\mathrm{Ref}_k$ and $\mathrm{then}_k$ respectively store
the current point in, and retrieve it from, the k-th member of the sequence. Cresswell’s $\mathrm{Ref}_k$ is thus essentially hybrid $\downarrow i_k$ (which we introduced in the previous
section), and Cresswell’s $\mathrm{then}_k$ is $@i_k$. So if we add Cresswell’s operators to
the basic modal language, we get ML + ↓ + @, which, as we have shown above,
is a notational variant of our Cr.
Cresswell famously argues that when we add “now”, “then” and “actually”
operators to the language, that amounts to introducing “explicit quantification” over worlds and times. By that, he means that his system with $\mathrm{Ref}_k$ and
$\mathrm{then}_k$ is as expressive as the corresponding many-sorted FOL language. But we
have just seen that ML + ↓ + @ and Cr are even less expressive than HL(@, ↓),
which itself is equivalent to the bounded fragment of the correspondence FOL
language, not the whole thing! How come? Is our expressivity analysis wrong,
or is Cresswell’s?
In fact, both analyses are correct. They are not contradictory. The secret
is in the underlying language. We added then operators to the very basic
underlying language, namely ML. In contrast to that, Cresswell uses a richer
basic language: he includes in it the universal modality A, which he writes as □.
He defines □φ to be true when φ is true at every point in the model, without
any regard for R-accessibility (see his (15) on p. 8). In addition to the universal
modality, Cresswell also uses the operator L for the familiar, R-restricted □ of
ML.
But universal modality is a very powerful operator in the hybrid family, as
discussed, a.o., by [Goranko and Passy, 1992], [Blackburn and Seligman, 1995],
[Blackburn and Seligman, 1998]. Having universal modality A together with ↓
and @ (in other words, Cresswell’s Ref and then), we can easily define unrestricted quantification over points, which indeed brings us all the way up to
the full expressivity of FOL: ∀i.φ can be defined as ↓j.A(↓i.@j.φ). But importantly, “now” and “then” operators alone are not capable of that. It is the
universal modality A that is essential for making the language not invariant
under generated submodels.
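The effect of adding A can be seen on a three-point model. The following Python sketch is an illustration under assumptions of its own: a tuple encoding of formulas, a tag `"univ"` for A, and hypothetical helper names. It builds the sentence ↓j.A(↓i.@j.φ) for φ = @i.p, which says that p holds at every point of the model, and evaluates it in a model and in the submodel generated by the point of evaluation.

```python
def holds(model, g, w, phi):
    """Evaluator for a small hybrid language with the universal modality A ("univ")."""
    points, R, V = model
    tag = phi[0]
    if tag == "prop":
        return w in V.get(phi[1], set())
    if tag == "nom":
        return g[phi[1]] == w
    if tag == "not":
        return not holds(model, g, w, phi[1])
    if tag == "dia":
        return any(holds(model, g, v, phi[1]) for v in R.get(w, set()))
    if tag == "bind":
        return holds(model, {**g, phi[1]: w}, w, phi[2])
    if tag == "at":
        return holds(model, g, g[phi[1]], phi[2])
    if tag == "univ":                      # A: true iff the argument holds at every point
        return all(holds(model, g, v, phi[1]) for v in points)
    raise ValueError(f"unknown connective: {tag!r}")


def forall_points(i, phi, fresh="j_fresh"):
    """Unrestricted quantification over points: down j. A(down i. @j. phi)."""
    return ("bind", fresh, ("univ", ("bind", i, ("at", fresh, phi))))


def generated_submodel(model, root):
    """Restrict the model to the points reachable from root along R."""
    points, R, V = model
    reach, stack = {root}, [root]
    while stack:
        x = stack.pop()
        for y in R.get(x, set()):
            if y not in reach:
                reach.add(y)
                stack.append(y)
    sub_R = {x: R.get(x, set()) & reach for x in reach}
    sub_V = {q: pts & reach for q, pts in V.items()}
    return (reach, sub_R, sub_V)


if __name__ == "__main__":
    # v is not reachable from w and falsifies p.
    full = ({"w", "u", "v"}, {"w": {"u"}}, {"p": {"w", "u"}})
    sub = generated_submodel(full, "w")
    everywhere_p = forall_points("i", ("at", "i", ("prop", "p")))
    print(holds(full, {}, "w", everywhere_p), holds(sub, {}, "w", everywhere_p))
    print(holds(full, {}, "w", ("dia", ("prop", "p"))), holds(sub, {}, "w", ("dia", ("prop", "p"))))
```

The A-free formula ♦p receives the same value in both models, while the sentence built with A does not, because the unreachable point v is visible only to A.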
Cresswell himself does not stress that his claim is only valid for a language
with A. Unfortunately, that led to serious misinterpretation of his result (possibly because later researchers did not realize that Cresswell’s □ stands not
for the usual modal □, but for A).15
To give a typical example from the philosophical literature, [Recanati, 2007,
pp. 61-62] writes: “It has been established that the full expressive power of first-order logic is needed to deal with natural-language tenses. (See e.g. van Benthem 1977; Cresswell 1990: chs. 2-4.)”16 It is striking that Recanati’s remark
is made within a discussion of whether to use an ML-based or FOL-based system for translating natural language. Misinterpreting Cresswell’s claim leads
to the adoption of a wrong assumption crucial for Recanati’s argument. That
argument undoubtedly would have looked different had it been clear that
simply adding “now” and “then” does not automatically result in full FOL
expressivity.
In the linguistic literature, the misinterpretation of Cresswell’s claim also
led to widespread adoption of the belief that the full power of FOL is absolutely
required to account for natural language modal operators and tenses. This is
all the more striking given that linguists rarely assume that natural language
has true unrestricted quantification, without which Cresswell’s result becomes
inapplicable. For instance, [Schlenker, 2003] writes (p. 99) that “the full power
of quantification over times and worlds is needed to analyze temporal and
15 [Meyer, 2009] is another philosophical take on the problem of expressivity of “now”
and “then” operators. Meyer (p. 229) aims to show “that now and then are eliminable
in quantified tense logic, provided we endow it with enough quantificational structure.”
What he means by “enough quantificational structure” is including into the language a set
membership predicate and existential quantification over sets. With that much, we can easily
express Russell’s paradox: ∃s(s ∉ s).
Even if one does not worry about paradoxes, by adding quantification over sets we become able
to express second-order properties (e.g., distinguish the standard model of Peano arithmetic
from non-standard models), thus going a long way beyond the expressive resources of FOL.
Using set theory, Meyer can indeed eliminate “now” and “then” operators. One wonders,
however, if such a solution “deals with the problem in the most natural way”, as Meyer puts
it (p. 242). As we have seen, “now” and “then” operators actually make the basic modal
language only slightly more expressive, so it is hardly surprising that by going as far as
second-order expressivity we can become able to eliminate them.
Both Cresswell and Meyer aim to trivialize the contribution of “now” and “then” operators:
Cresswell argues that such operators increase the expressive power of the language up to
that of FOL, and Meyer argues that in a language that can express second-order properties,
“now” and “then” are redundant anyway. The framework we developed in this paper allows
us to understand the actual contribution of such operators without trying to fully reduce
them to some familiar, and vastly more expressive, system. Our model-theoretic analysis
allows us to distinguish cases where the addition of then makes a difference from cases
where it doesn’t.
16 The actual position of [van Benthem, 1977] is in fact nothing of the sort. As van Benthem
himself puts it (p. 436): “From a technical point of view, tense logics could be considered
to be sublogics of predicate logic. <...> But, as tense logics become stronger and stronger
(containing ever more exotic operators), predicate logic itself becomes a serious rival as
regards elegance and simplicity” (emphasis mine).
modal talk in English”, citing [Cresswell, 1990]. At the same time, on the very
next pages Schlenker notes (pp. 100-101) that it is hardly possible to find
natural language expressions that would express unrestricted quantification.
From this brief discussion, it should be clear how much damage the misinterpretation of Cresswell’s claim has done within both philosophy of language
and formal semantics. It should be stressed that the particular examples of
[Recanati, 2007] and [Schlenker, 2003] have been chosen not as the worst, but
as some of the best and most explicit attempts to reach a better understanding
of the modal and temporal expressivity of natural language. The misconstrued
version of Cresswell’s claim is a part of the two fields’ folklore by now, and
should not be attributed to any particular author personally. I hope that this
paper will make it easier to finally correct that mistake, and replace folklore
with solid logical arguments.
To sum up, though Cresswell’s claim is valid for the particular very expressive language he considers, it is invalid in the general case. Until one proves
that natural language requires unrestricted quantification over times or worlds,
one should not appeal to Cresswell’s claim. Moreover, if we consider more restricted underlying languages than Cresswell’s, the amount of expressive power
added by “now”, “then” and “actually” turns out to be pretty mild, as we saw
in Sections 5 and 6.
What, then, are the practical consequences of correcting this misreading of
Cresswell for a linguist or philosopher of language? Contrary to widespread
belief, systems with backwards-looking operators turn out to be genuinely less
expressive, and therefore more restrictive and predictive, than systems with
explicit quantification over worlds and times. That in turn means that unless
we find new arguments for adopting explicit-quantification systems, we might
want to try to live within the means of operator-based ones. That, of course,
is not to say that operator-based systems are inherently better: after all, their
expressive power ultimately depends on the kind of operators they feature,
and there are indeed sets of modal operators that make the system as expressive as many-sorted FOL (e.g., Cresswell’s system with universal modality
and what we would now call hybrid ↓ and @ is such an example). But
systems which only feature backwards-looking operators are in fact very mild
expressively.
Thus when we choose between analyses of natural language
sentences such as 23 on the one hand and 24 on the other (repeated here from 8
and 12), expressive power should be taken into account.
[[everyone now alive will be dead]] = $F(\forall x : \mathrm{now}(\mathit{alive}(x)) \to \mathit{dead}(x))$    (23)
[[everyone now alive will be dead]] = $\exists t_1 > t_0\,(\forall x : \mathit{alive}(x)(t_0) \to \mathit{dead}(x)(t_1))$    (24)
Similarly, when we are discussing whether there exist syntactically represented covert variables over times and worlds, we should consider not just
the fact that there is currently no syntactic evidence for their existence, but
also that on top of that the same interpretational work may be performed by
backwards-looking operators. And unlike the explicit-quantification systems,
operator-based ones may be defined very restrictively from the start, closely
matching the actual expressivity of natural language that has been observed.
Thus expressive power should be paid attention to not as a matter of
some axiom, but simply because when a formal language is less expressive,
it is more predictive. And as good practice within linguistics and philosophy
dictates, other things being equal, more predictive and restrictive systems are
to be preferred. The contribution of the current paper to the debate is then
that we have been able to pin down the logically exact amount to which a
system with now and then is more restrictive and predictive than systems
with explicit quantification over worlds and times.
References
[Areces et al., 2001] Areces, C., Blackburn, P., and Marx, M. (2001). Hybrid logics: Characterization, interpolation and complexity. The Journal of Symbolic Logic, 66(3):977–1010.
[Areces et al., 2011] Areces, C., Figueira, D., Figueira, S., and Mera, S. (2011). The expressive power of memory logics. Review of Symbolic Logic, 4(2):290–318.
[Blackburn et al., 2001] Blackburn, P., de Rijke, M., and Venema, Y. (2001). Modal Logic,
volume 53 of Cambridge Tracts in Theoretical Computer Science. Cambridge University
Press.
[Blackburn and Seligman, 1995] Blackburn, P. and Seligman, J. (1995). Hybrid languages.
Journal of Logic, Language and Information, 4:251–272.
[Blackburn and Seligman, 1998] Blackburn, P. and Seligman, J. (1998). What are hybrid
languages? In Kracht, M., de Rijke, M., Wansing, H., and Zakharyaschev, M., editors,
Advances in Modal Logic, volume 1, pages 41–62. CSLI Publications, Stanford.
[Blackburn and van Benthem, 2007] Blackburn, P. and van Benthem, J. (2007). Modal
logic: a semantic perspective. In [Blackburn et al., 2007], chapter 1. Elsevier.
[Blackburn et al., 2007] Blackburn, P., van Benthem, J. F., and Wolter, F., editors (2007).
Handbook of modal logic, volume 3 of Studies in logic and practical reasoning. Elsevier.
[Cresswell, 1990] Cresswell, M. (1990). Entities and Indices. Kluwer, Dordrecht.
[Cresswell, 1991] Cresswell, M. (1991). In defense of the Barcan formula. Logique et Analyse,
135-136:271–282.
[Fara, 2008] Fara, D. G. (2008). Relative-sameness counterpart theory. The Review of
Symbolic Logic, 1(2):167–189.
[Fitting and Mendelsohn, 1998] Fitting, M. and Mendelsohn, R. L. (1998). First-order
modal logic, volume 277 of Synthese library. Kluwer, Dordrecht.
[Gabbay, 1981] Gabbay, D. M. (1981). An irreflexivity lemma with applications to axiomatizations of conditions on linear frames. In Mönnich, U., editor, Aspects of Philosophical
Logic, pages 67–89. Reidel, Dordrecht.
[Goranko and Passy, 1992] Goranko, V. and Passy, S. (1992). Using the universal modality:
Gains and questions. Journal of Logic and Computation, 2(1):5–30.
[Grädel and Otto, 1999] Grädel, E. and Otto, M. (1999). On logics with two variables.
Theoretical Computer Science, 224:73–113.
[Kamp, 1971] Kamp, H. (1971). Formal properties of “now”. Theoria, 37:227–273.
[Lewis, 1968] Lewis, D. K. (1968). Counterpart theory and quantified modal logic. Journal
of Philosophy, 65(5):113–126.
[Meyer, 2009] Meyer, U. (2009). ‘now’ and ‘then’ in tense logic. Journal of Philosophical
Logic, 38(2):229–247.
[Percus, 2000] Percus, O. (2000). Constraints on some other variables in syntax. Natural
Language Semantics, 8:173–229.
[Recanati, 2007] Recanati, F. (2007). Perspectival Thought: A Plea for (Moderate) Relativism. Oxford University Press.
[Saarinen, 1978] Saarinen, E. (1978). Backward-looking operators in tense logic and in
natural language. In Hintikka, J., Niiniluoto, I., and Saarinen, E., editors, Essays on
Mathematical and Philosophical Logic, pages 341–367. Reidel, Dordrecht.
[Schlenker, 2003] Schlenker, P. (2003). A plea for monsters. Linguistics and Philosophy,
26:29–120.
[ten Cate, 2005] ten Cate, B. D. (2005). Model theory for extended modal languages. PhD
thesis, ILLC, University of Amsterdam.
[van Benthem, 1977] van Benthem, J. F. (1977). Tense logic and standard logic. Logique
et Analyse, 20:41–83.
[Verkuyl, 2008] Verkuyl, H. (2008). Binary tense, volume 187 of CSLI lecture notes. CSLI
Publications.