Expressive power of “now” and “then” operators

Igor Yanovich

July 9, 2014

Igor Yanovich
Universität Tübingen, Institute of Linguistics
Wilhelmstraße 19, Tübingen, 72074 Germany
E-mail: [email protected]

Abstract Natural language provides motivation for studying modal backwards-looking operators such as “now”, “then” and “actually” that evaluate their argument formula at some previously considered point instead of the current one. This paper investigates the expressive power over models of both propositional and first-order basic modal language enriched with such operators. Having defined an appropriate notion of bisimulation for first-order modal logic, I show that backwards-looking operators increase its expressive power quite mildly, contrary to beliefs widespread among philosophers of language and formal semanticists. That in turn presents a strong argument for the use of operator-based systems in the semantics of natural language, instead of systems with explicit quantification over worlds and times that have become a de-facto standard for such applications. The popularity of such explicit-quantification systems is shown to be based on the misinterpretation of a claim by [Cresswell, 1990], which led many philosophers and linguists to assume (wrongly) that introducing “now” and “then” is expressively equivalent to explicitly quantifying over worlds and times.

Keywords “now” operator, backwards-looking operators, bisimulation, first-order modal logic, hybrid logic

The purpose of this paper is to study the expressive power that is added to modal logic by the introduction of now and then operators.¹ Logically speaking, it is actually not a particularly exciting subject. Once we apply relatively familiar techniques from the modern logical toolkit, it turns out that now and then add extra expressive power only for first-order modal logic, as opposed to propositional modal logic.² Moreover, in most cases that power is only added when models have an infinite number of individuals. Thus for a pure logician, the interest of this paper would mainly lie in the notion of bisimulation appropriate for first-order (=quantified) modal logic, cf. Def. 8 and Thm. 2.

But from the applied point of view, the systems with now and then are extremely important because of their role in the philosophy of language and in formal semantics. In those areas, it is often taken as a proven fact that modal logic with now and then is as expressive as a first-order multi-sortal logic with explicit quantification over times and worlds. But this near-consensus is very different from the actual mathematical state of affairs, as we will show below. Once the mistaken assumption is corrected, there are consequences for how linguists and philosophers of language might want to go about analyzing natural language phenomena. In particular, when the expressive power of now and then is properly characterized, we become able to see more advantages of using operator-based systems for modality and temporality.

¹ From here on, I talk only about expressive power over models. For the application to natural language, that is arguably a more important kind of expressivity than expressivity over frames.

1 Introduction

Certain expressions of natural language prompted philosophers and linguists to introduce now and then operators which could shift the interpretation of an embedded subformula to a point (that is, world or time) introduced by a higher modal operator. Those expressions may be called backwards-looking operators.³ From the early days of formal semantics, it has become accepted that when such operators are added to quantified (that is, first-order) modal logic, its expressive power increases, as was shown by [Kamp, 1971]. But by how much?
Since [Cresswell, 1990], it has also become accepted in the fields of formal semantics and philosophy of language that quantified modal logic enriched with now and then is as expressive as a full many-sortal first-order logic with unrestricted explicit quantification over worlds and times. And that formal understanding (or, in fact, misunderstanding, as we will show) in turn led philosophers and especially linguists to widely popularize the use of such explicit-quantification systems.

Explicit-quantification systems have become a de-facto standard to such an extent that it is hard to find contemporary semantic work that would use modal and temporal operators rather than explicit quantifiers over times and worlds. This sometimes leads to curious results: for instance, [Percus, 2000] is an important and well-cited⁴ work that is dedicated to formulating a binding theory for explicit world variables assumed to populate syntactic representations for natural language. The main content of that binding theory is as follows: verbs and adverbial modifiers combine with world variables that must be bound by the closest higher modal operator. Importantly, in a system where world variables are only manipulated implicitly through modal operators and backwards-looking operators, no such constraint would be needed: in the absence of an extra operator, such interpretation would be the default.

² Quantified modal logic, also called first-order modal logic, is related to (propositional) modal logic the same way as first-order logic is related to propositional logic.

³ The term “backwards-looking operators” is due to [Saarinen, 1978].

⁴ Currently with about 200 citations in the Google Scholar web service, which is a large number for a linguistic article.

This underscores which kind of problems one runs into upon accepting without argument a more expressive system than one needs to: in a more expressive system, more things can go wrong, and moreover, more additional constraints on the workings of the system are needed. But given the common misinterpretation of Cresswell’s result, this problem has flown under the radar of philosophers and linguists because it was assumed that there is no difference in expressive power between operator-based and explicit-quantification systems.

What I aim to achieve in this paper is to bring the debate about explicit-quantification vs. operator modal systems for natural language semantics back onto a solid logical ground. The reading of [Cresswell, 1990] that led analysts of natural language to adopt explicit-quantification systems is based on a misunderstanding. Cresswell never proved the result that the subsequent linguistic and philosophical literature took him to have proven. He added now and then operators not to the basic modal language, but to a language with universal modality, which few if any philosophers and linguists would posit for natural language (or, at least, for everyday natural language). In that language, then indeed increases expressivity up to full many-sortal FOL (first-order logic). But as is well-known to modal logicians, universal modality is a very powerful operator itself, cf. [Goranko and Passy, 1992], [Blackburn and Seligman, 1995], [Blackburn and Seligman, 1998], among others. So Cresswell’s increase in expressive power happens to a system that is already far more powerful than what is currently assumed for natural language (which we will discuss further in Section 7). It is thus improper to apply Cresswell’s results to ordinary investigations of the properties of natural language.

But what happens if we analyze the expressive power of now and then properly, namely adding them to the basic modal language ML?
(That language would be agreed upon as a proper basis for analysis of natural language.) The current paper answers this question. In the propositional case, no extra power is added by then. In the quantified modal logic case, there is indeed an increase in power, but it is tiny. ML with then operators added is the least expressive language of the hybrid family, and when identity is in the language of quantified modal logic, then’s extra power only manifests itself in models with an infinite number of individuals. Far from going all the way to many-sortal first-order logic, a system with now and then is perhaps the mildest of the known systems expanding basic modal logic.

What does this mean for applied logicians, such as philosophers and linguists, who use modal logic to analyze natural language? The bottom line is that an operator-based system is arguably superior to the explicit-quantification systems that have become the standard in the field. Operator-based systems are less expressive, but the expressivity they carry is already enough to account for the relevant natural language phenomena. Such systems are thus more restrictive and more predictive, which are properties that linguists and philosophers value in formal systems. This does not necessarily mean that one should just throw explicit-quantification systems out of the window: it can be that for some purposes, they would be more intuitive to use. But what the results presented in this paper show is that such systems are far from innocent, and thus ought to be used with due caution.

The plan of the paper is as follows. In Section 2, I introduce important datapoints from natural language that motivate the introduction of backwards-looking operators. I also demonstrate how one would translate such natural language sentences into a formal language using both an operator-based and an explicit-quantification-based system.
This section thus provides the applied motivation for the logical study to follow. In Section 3, I introduce languages $Cr$ and $Cr^{FO}$ resulting from the addition of generalized backwards-looking operators to the basic modal language $ML$ and its quantified version $ML^{FO}$. This section provides the definitions for the formal systems whose expressivity we will be studying. In Section 4, I provide a truth-preserving translation from the fragment of propositional $Cr$ that only features genuinely backwards-looking then, into $ML$. The existence of such a translation shows that then operators do not actually increase expressive power in the propositional case. Section 5 turns to the case of first-order modal logic with now and then. We introduce a notion of bisimulation appropriate to $ML^{FO}$, and with its help prove that $Cr^{FO}$ is strictly more expressive, but at the same time that the extra expressivity only kicks in in a limited number of cases, in particular when the domains of individuals are infinite. Section 6 closes the logical part of the paper: in it, I situate the $Cr$ languages within the family of hybrid languages. It turns out that $Cr$ is the mildest member of that clan. Finally, in Section 7 we return to the applied use of backwards-looking operators, discussing how the expressivity claim by [Cresswell, 1990] was misinterpreted in the linguistic and philosophical literature, and what the practical consequences of learning the actual expressivity results may be.

2 Backwards-looking operators of natural language, and their formal analysis

In this section, I introduce natural language examples of the kind used to motivate the introduction of now and then operators. I then show how those examples can be analyzed with such operators, and also, alternatively, in a full many-sortal first-order logic with explicit quantification over worlds and times.

(1) Mary [who is reading now] came.

In (1), “now” is embedded within a relative clause and signals that the embedded predicate “is reading” should be evaluated at the current time, outside of the scope of the matrix past tense. Assuming an operator now with appropriate semantics, we can represent the sentence with $P(come(m) \wedge now(read(m)))$, where $P$ is the past operator. There is also an equivalent logical representation for (1) without now: $read(m) \wedge P(come(m))$, but while semantically correct, it does not respect the constituent structure of the original natural language sentence.

(2) (One day in the future,) everyone [now alive] will be dead

Unlike for (1), for (2) there is no translation into quantified modal logic without a now operator. Or, rather, strictly speaking we will only be able to prove that after we derive the results in Section 5, but even in the absence of a formal proof, it has been widely accepted as fact for decades that (2) is inexpressible unless we add now (or allow explicit quantification over times).⁵

(3) It was the case that everyone [then alive] would all be dead one day
(4) Everyone [actually tall] might have been short
(5) You might have considered yourself short while [actually being tall]

While “now” in (1) and (2) shifts evaluation back to the matrix time, “then” in (3) shifts it to the moment introduced by the higher past tense. In (4), the word “actually” forces the predicate “tall” to be evaluated at the actual world. In (5), the same word returns the evaluation to the counterfactual world introduced by the higher “might have been” operator, similarly to how “then” in (3) refers back to the past moment introduced by a higher operator.

In all of the cases above, natural language expressions “now”, “then” and “actually” shift the evaluation index to some index that was used while evaluating the higher levels of the sentence.
This can be the matrix index as in (1), (2) and (4), or an index introduced by a higher temporal or modal operator, as in (3) and (5). In both cases, those words may be said to be looking back into the series of indices introduced earlier, and shifting the interpretation of their argument formula to one of the previously used indices and away from the current one.

In an operator-based system, we would account for such natural language expressions as follows. We would introduce a family of logical backwards-looking operators $then_i$ (which is exactly what we do formally in the next section). The idea would be that $then_i$ would shift the interpretation to the $i$-th evaluation index from the ones that were used earlier. The special case of now would be defined as $then_0$ that would always go back to the initial evaluation time. Then we would analyze (2) as follows:

(6) $[\![\text{now}]\!] = \lambda P_{et}.\, now(P)$
(7) $[\![\text{alive now}]\!] = \lambda x_e.\, now(alive(x))$
(8) $[\![\text{everyone now alive will be dead}]\!] = F(\forall x : now(alive(x)) \to dead(x))$

⁵ Contrary to that, [Verkuyl, 2008, pp. 130-132] argues that now is semantically superfluous in modal logic. But Verkuyl does not discuss any quantificational examples like (2) which pose a true expressivity problem, and only considers sentences such as (1) for which now is indeed semantically superfluous.

In an explicit-quantification system, there are certain design choices to be made when implementing “now” and “then”. In most current systems used by formal semanticists (cf. [Percus, 2000] for an example), predicates such as “alive” have special syntactically represented slots for time and world variables. Usually such a slot would be filled by a postulated covert constituent introducing an explicit time or world variable. There are several options as to what exactly an adverb such as “now” (for the temporal case) or “actually” (in the world-variable case) would be doing in this setup.
If “now” or “then” would modify the predicate after it has combined with the covert variable, it would have to force abstraction over that variable. Such a system would look as follows:⁶

(9) $^{\wedge t}\varphi$ := expression that denotes the temporal intension of $\varphi$
(10) $[\![\text{now}]\!] = \lambda P_{et}.\, ^{\wedge t}P(t_0)$, where $t_0$ must refer to the current time
(11) $[\![\text{alive now}]\!] = \lambda x_e.\lambda t_3.(alive(x)(t_3))(t_0) = \lambda x_e.\, alive(x)(t_0)$
(12) $[\![(2)]\!] = \exists t_1 t_0 : (\forall x : (alive(x)(t_0) \to dead(x)(t_1)))$

So what “now” does on such an analysis is essentially erasing any effect of the explicit syntactic time variable that combines with “alive”: the result is as if there was never such a variable in the first place.

⁶ Perhaps more in the spirit of current explicit-quantification systems, in particular of the branch of formal semantics called LF semantics, would be the following alternative. There would be no intension operator $^{\wedge t}$; now would take as arguments functions from times; there would be a rule of freely applying $\lambda t_i$ operators; and finally, a constraint would force the explicit variable next to “alive” to be bound by the closest λ-restrictor specifically when “alive” is modified by “now”, but not otherwise. The problem with such an account is that this last constraint is, so to speak, non-compositional: the variables on predicates like “alive” should generally not be subject to the closest-binder requirement, and only when we know that that predicate is in the scope of “now” would a different constraint be imposed.

Another version of the explicit-quantification story would go as follows: we would say that “now” directly denotes a temporal variable, and that it occupies the same syntactic slot that is normally occupied by covert explicit temporal variables. This would result in the following analysis:

(13) $[\![\text{now}]\!] = t_0$, where $t_0$ must refer to the current time
(14) $[\![\text{alive now}]\!] = \lambda x_e.\, alive(x)(t_0)$
(15) $[\![(2)]\!] = \exists t_1 t_0 : (\forall x : (alive(x)(t_0) \to dead(x)(t_1)))$

An immediate problem with this analysis is that empirically “now” seems to be a syntactic adjunct rather than an argument. For instance, it may occur both on the left and on the right of the modified expression: “everyone now alive” and “everyone alive now” are both OK. So some syntax-semantics interface story would have to be told about how “now” would always fit into the proper slot for temporal variables, but let’s assume for the sake of the argument that such a story may somehow be told.

Comparing the operator-based and the explicit-quantification lines of analysis, what can we say? Both lines would have to make a stipulation about the fact that now must specifically refer to the current time, so on this count the two are equal. Beyond that, the operator story works right away (assuming the operators are properly defined, of course; we will define a formal system with them in the next section). The explicit-quantification story cannot stop yet, though: it has to introduce a number of syntactic assumptions.⁷

This difference in the amount of additional work is actually related to a genuine difference in the expressive power of the two underlying formal systems. A system with backwards-looking operators, as we will show below, is a very mild logical system. It only increases the expressivity of basic first-order modal logic by a tiny bit. As a consequence, there are plenty of things that we cannot express in such a system, but the good news is that for analysis of natural language, we do not even want to express them. In contrast to that, the explicit-quantification system is very powerful. And as the result of that, we need to constrain its behavior in order to tailor it to the observed facts. Hence all the additional constraints on the binding theory of world and time variables, and a fair number of further issues to be resolved.
For instance, why doesn’t natural language have operators that have meaning $\exists t_i$, without any connection to the current evaluation time? After all, it is often assumed in that line of theorizing that we have freely applying $\lambda t_i$ operators... If natural language is as expressive as a full many-sortal first-order logic, then the absence of this, and many other kinds of meanings, is mysterious, and needs to be explained with yet further constraints.

What this comparison tries to demonstrate is ultimately that expressive power matters. Normally, a linguist or a philosopher of language would not be interested in the issues of expressivity, and for a good reason: if natural language demands a very expressive system, then we as analysts cannot help it, and would have to adopt it. That would just be an empirical fact about human language. But with backwards-looking operators, we have a different kind of case: the now-standard tools for treating them are vastly more expressive than is actually required by natural language data. In such a case, moving back to a less expressive system may give us greater explanatory adequacy.

So from the applied point of view, the task of this paper is to develop the logical theory that shows why and how exactly the operator-based story is more restrictive than the explicit-quantification story. Sections 3-6 below will take care of that logical part, and then in Section 7, we will return to the application to natural language semantics. I tried my best to make the logical part accessible for a linguist or a philosopher with applied interests in mind (to the possible frustration of the logician readers, for which I apologize; textbook-level explanations have only been included for those topics that are not widely known among linguists and philosophers). But in case I failed nevertheless, the main technical results are informally reformulated at the beginning of Section 7 for the reader’s convenience.

⁷ In fairness, some of those assumptions would be “independently justified”, in the sense that for many intensional phenomena of natural language, we would need similar ones anyway. For example, adding an extra binding-theoretic constraint specifically for time variables in the scope of “now” and “then” is not such a wild idea if we already adopted a number of such constraints anyway. But if we do not have to introduce any of this apparatus for restricting the enormous expressive power of the full FOL in the first place, that’s a different story.

3 Adding backwards-looking operators to ML

In this section, we define languages that formalize backwards-looking operators, $Cr$ and $Cr^{FO}$ (for Cresswell, as our system is very close to his system with “now”, “then” and “actually”). $Cr$ and $Cr^{FO}$ result from enriching the basic modal language $ML$ and its quantified counterpart $ML^{FO}$ with backwards-looking operators. It should be clear how to add such operators to other underlying modal languages (e.g., languages having more than one $\Diamond$).

Definition 1 (The syntax of $Cr$) Let $PROP$ be a non-empty set of propositional variables, e.g. $p, q, \ldots$ Then the wffs of $Cr$ are:

$\varphi := PROP \mid \top \mid \neg\varphi \mid \varphi \wedge \psi \mid \Diamond\varphi \mid then_k(\varphi)$, where $k \in \mathbb{N}$.

$\bot$, $\vee$, $\to$, and $\Box$ are defined as usual, and $now := then_0$.

Formulas of $ML$ are evaluated in a Kripke model (consisting of the domain $W$ of points, an accessibility relation $R$, and a valuation function $V$) at a point from $W$. Informally, backwards-looking operators $then_k$ shift the evaluation of their argument formula back to some point considered earlier. In order to return to such points, we need to store them, and we do that in denumerable evaluation sequences $\rho$ of points from the domain $W$ of model $M$. Formulas of $Cr$ are evaluated at pointed sequences $\langle\rho, i\rangle$, where the $i$-th member of $\rho$, also written $\rho(i)$, functions as the current evaluation point in standard Kripke semantics.
We call $\rho_1$ and $\rho_2$ $n$-variants (in symbols, $\rho_1 \sim_n \rho_2$) if for any $m \neq n$ we have $\rho_1(m) = \rho_2(m)$.

Definition 2 (The semantics of $Cr$) For Kripke model $M = \langle W_M, R_M, V_M \rangle$, sequence $\rho$ from $W_M$, and $i \in \mathbb{N}$,

$M, \langle\rho, i\rangle \models_{Cr} q$ iff $\rho(i) \in V_M(q)$
$M, \langle\rho, i\rangle \models_{Cr} \top$ always
$M, \langle\rho, i\rangle \models_{Cr} \neg\varphi$ iff it is not the case that $M, \langle\rho, i\rangle \models_{Cr} \varphi$
$M, \langle\rho, i\rangle \models_{Cr} \varphi \wedge \psi$ iff $M, \langle\rho, i\rangle \models_{Cr} \varphi$ and $M, \langle\rho, i\rangle \models_{Cr} \psi$
$M, \langle\rho, i\rangle \models_{Cr} \Diamond\varphi$ iff there is $\rho' \sim_{i+1} \rho$ s.t. $\rho(i) R \rho'(i+1)$ and $M, \langle\rho', i+1\rangle \models_{Cr} \varphi$
$M, \langle\rho, i\rangle \models_{Cr} then_k(\varphi)$ iff $M, \langle\rho, k\rangle \models_{Cr} \varphi$

The non-modal clauses of our semantics do exactly the same job as the corresponding standard clauses, with $\rho(i)$ playing the part of the current point. The clause for $\Diamond$, in addition to the standard truth conditions, also produces “side effects”: it writes down the $R$-accessible point to which $\Diamond$ shifts the evaluation as the next member of the sequence, and stores the previous evaluation point for future use. As these side effects only affect then-operators, the following is immediate:

Proposition 1 For all $\varphi \in Cr$ that are also in $ML$, $M, \langle\rho, i\rangle \models_{Cr} \varphi$ iff $M, \rho(i) \models_{ML} \varphi$.

The then operators shift the pointer to a different member of $\rho$. When the shift is to a point stored earlier, $then_k$ functions as a genuine backwards-looking operator. But if we never “overwrote” $\rho(k)$ while evaluating our formula by the time we encounter $then_k$, the point that we shift to is determined by the $\rho$ we started with. Evaluation sequences thus work pretty much like assignment functions, and we can think of $then_k$ operators as implicitly introducing a variable over points. When $then_k$ retrieves a previously stored $\rho(k)$, the implicit variable is bound by a higher $\Diamond$. When $then_k$ accesses $\rho(k)$ determined by the initial evaluation sequence, the implicit variable is free.

A formula of $Cr$ is a sentence iff, evaluated at $\langle\rho, 0\rangle$, it only depends on the point $\rho(0)$.
In other words, the truth of a sentence of $Cr$ is semantically relative to a single point, while the truth of a non-sentence is relative to multiple points. In yet other words, a sentence of $Cr$ would have no implicit free variables over points.⁸ Note that it is crucial that we restrict our attention to formulas evaluated at $\langle\rho, 0\rangle$: whether a $Cr$ formula features an implicit free variable depends on the initial index. Thus $\Diamond\Diamond then_1(p)$ would not use points not introduced by $\Diamond$s when evaluated at $\langle\rho, 0\rangle$, but it would do so when evaluated at $\langle\rho, 2\rangle$: in that case the point $\rho(1)$ would not have been overwritten by the clause for $\Diamond$. Summing up, $\Diamond then_3(p)$ is not a sentence, and $\Diamond\Diamond then_1(p)$ is.

⁸ It is usual to give a syntactic definition of a sentence, where a sentence is a formula without free variables. Semantically, such a formula does not depend on the assignment of values to variables. It is easy to give a purely syntactic definition of a $Cr$ sentence, but I find that the semantic definition in the main text makes the intuition behind the notion more prominent.

We will write $Cr_{sent}$ for the sentence fragment of $Cr$. For a sentence $\varphi$, we say $\varphi$ is true at $\rho$ in $M$ if $\varphi$ is true at $\langle\rho, 0\rangle$ in $M$. When $\rho(0) = w$, we can also say that sentence $\varphi$ is true at $w$, as we do for $ML$. When the context makes it clear which model is to be used, we may suppress it.

A standard technique in modal logic is to relate the modal language we are working with to the first-order logic whose domain is points/worlds/times. That technique uses the so-called standard translation which maps modal operators to FOL operators in a specially defined language. E.g., for propositional variables $p_i$ in modal logic, the corresponding language will have corresponding 1-place predicates $P_i$ over worlds. The points of a Kripke model may be referenced in L0 using individual variables $x_i$, and the accessibility relation is represented by a 2-place predicate $R$.
(See, e.g., [Blackburn and van Benthem, 2007, Sec. 2.2] or [Blackburn et al., 2001, Ch. 2.4] for an introduction.) The standard translation of standard modal logic $ML$ essentially shows that in a sense, $ML$ is just a special notation for a particular fragment of FOL: the formulas that may be output by the translation are a very small subset of the full FOL. In particular, such FOL formulas will all feature exactly one free variable (corresponding to the initial evaluation world), and all new variables $x_j$ will always be introduced using a link to an already introduced point $x_i$, by a construction $\exists x_j : x_i R x_j$, corresponding to $\Diamond$.

To study the properties of our system $Cr$, we can easily extend the standard translation:

$ST_i(p) = P(x_i)$
$ST_i(\top) = \top$
$ST_i(\neg\varphi) = \neg ST_i(\varphi)$
$ST_i(\varphi \wedge \psi) = ST_i(\varphi) \wedge ST_i(\psi)$
$ST_i(\Diamond\varphi) = \exists x_{i+1}(x_i R x_{i+1} \wedge ST_{i+1}(\varphi))$
$ST_i(then_k(\varphi)) = ST_k(\varphi)$

The standard translation for $ML$ only requires two variables over points.⁹ However, the translation for $Cr$ may require any finite number of variables. The corresponding fragment of FOL is thus much greater for $Cr$ than for $ML$. However, we will see in the next section that despite appearances, for any sentence of $Cr$ there is an equivalent $ML$ formula, so standard translations of $Cr$ sentences are equivalent to formulas in the two-variable fragment of FOL.

We now turn to the quantified language $Cr^{FO}$: it is the quantified version that is needed to adequately model meanings of NL sentences such as (1)-(5). Syntactically, $Cr^{FO}$ is obtained by using a supply of individual variables $x_k$ and $n$-place relation symbols $q$ instead of just propositional variables (which are retained as 0-place relation symbols), and quantifier $\forall$ over individuals. In addition to that, we may or may not want to add an existence predicate $E$ and identity of individuals.
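The clauses of the extended standard translation $ST_i$ given earlier in this section are directly recursive, and a small program may make the index bookkeeping easier to follow. The following Python sketch is only an illustration under stated assumptions: the nested-tuple encoding of $Cr$ formulas, the function name `st`, and the ASCII rendering of the first-order output (`exists`, `&`, `~`, `TRUE`) are hypothetical choices made here, not part of the formal apparatus of the paper.

```python
# Hypothetical encoding of Cr formulas as nested tuples (an assumption of this sketch):
#   ("prop", "p"), ("top",), ("not", f), ("and", f, g), ("dia", f), ("then", k, f)

def st(phi, i=0):
    """Return ST_i(phi) as a string, one branch per clause of the translation."""
    tag = phi[0]
    if tag == "prop":
        # ST_i(p) = P(x_i)
        return f"{phi[1].upper()}(x{i})"
    if tag == "top":
        return "TRUE"
    if tag == "not":
        return f"~{st(phi[1], i)}"
    if tag == "and":
        return f"({st(phi[1], i)} & {st(phi[2], i)})"
    if tag == "dia":
        # ST_i(<>phi) introduces x_{i+1}, linked to x_i by R
        return f"exists x{i + 1} (x{i} R x{i + 1} & {st(phi[1], i + 1)})"
    if tag == "then":
        # ST_i(then_k(phi)) = ST_k(phi): only the index changes
        return st(phi[2], phi[1])
    raise ValueError(f"unknown connective: {tag!r}")


bound = ("dia", ("dia", ("then", 1, ("prop", "p"))))   # <><> then_1(p)
free = ("dia", ("then", 3, ("prop", "p")))             # <> then_3(p)
print(st(bound))  # exists x1 (x0 R x1 & exists x2 (x1 R x2 & P(x1)))
print(st(free))   # exists x1 (x0 R x1 & P(x3))
```

In the first output only x0 occurs free, while in the second x3 is also free; this mirrors the earlier observation that $\Diamond\Diamond then_1(p)$ is a sentence and $\Diamond then_3(p)$ is not.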
Definition 3 (The syntax of $Cr^{FO}$) Let $\{PRED_n\}$, for finite $n$, be a collection of sets $PRED_n$, each containing $n$-ary predicate symbols, with at least one $PRED_n$ non-empty; let $VAR$ be an infinite supply of individual variables $x_0, x_1, \ldots$, also written as $x, y, z, \ldots$; and let $E$ be the optional existence predicate. Then the wffs of $Cr^{FO}$ are defined as follows:

$\varphi := q(x_0, \ldots, x_{n-1}) \mid \forall x \varphi \mid \top \mid \neg\varphi \mid \varphi \wedge \varphi \mid \Diamond\varphi \mid then_k(\varphi)$, where $q \in PRED_n$, and $k \in \mathbb{N}$.

Optionally, $Ex$ and $x_i = x_j$ may be wffs.

⁹ That only two variables are needed for the standard translation of $ML$ was noted by [Gabbay, 1981]. The case of two-variable logics is special. See, e.g., [Grädel and Otto, 1999] on semantically two-variable logics and corresponding two-pebble games.

There are many design options when it comes to the semantics of quantified modal logic (see [Fitting and Mendelsohn, 1998], [Blackburn et al., 2007, Ch. 9]). For domain semantics, I choose varying domain semantics wherein there are no restrictions on the relations of individual domains at different points, which is the most general setting possible. For quantifiers, I use untensed quantifiers $\forall$ (also called possibilist), which range over all individuals in the model regardless of which points they exist at. In the special case when the language has the existence predicate $E$, we can also define tensed, or actualist, quantifiers $\forall_{tensed}$ using untensed $\forall$ and $E$: $\forall_{tensed}\, x \varphi := \forall x (Ex \to \varphi)$ (see [Cresswell, 1991]). Of course, tensed quantifiers can also be defined as primitive.¹⁰

A first-order Kripke model with varying domains for $Cr^{FO}$ is a structure $\langle W, R, \{\delta_w\}_{w \in W}, \{V_w\}_{w \in W} \rangle$, where each $\delta_w$ is the individual domain of the point $w \in W$, and $V_w$ is a valuation relative to $w$, that is, a function from predicate symbols in $PRED_n$ to $n$-ary relations over point $w$’s individual domain $\delta_w$. For convenience, we also define the domain $D$ of all individuals as $\bigcup_{w \in W} \delta_w$.
Formulas of $Cr^{FO}$ are evaluated in a first-order Kripke model $M$ at a pointed sequence $\langle\rho, i\rangle$ relative to an assignment $h$ of individuals to individual variables. For the interpretation of the modal component the presence of the assignment function $h$ does not make a difference: we just pass it down. We call two assignment functions $h$ and $h'$ $x$-variants, $h \sim_x h'$, iff they agree on all variables but $x$. Instead of $Cr$’s propositional variable clause $M, \langle\rho, i\rangle \models q$, we have two clauses for predicates and for the universal quantifier. The defined semantics is bivalent: any $q$ is false of a tuple if it contains individuals not existing at the current point. All omitted clauses are as for $Cr$.

Definition 4 (Varying domain semantics for $Cr^{FO}$)

$M, h, \langle\rho, i\rangle \models_{Cr^{FO}} q(\bar{x})$ iff $\langle h(x_1), \ldots, h(x_n) \rangle \in V_{\rho(i)}(q)$
$M, h, \langle\rho, i\rangle \models_{Cr^{FO}} \forall x \varphi$ iff $\forall h'$ s.t. $h' \sim_x h$, we have $M, h', \langle\rho, i\rangle \models \varphi$
$M, h, \langle\rho, i\rangle \models_{Cr^{FO}} E(x)$ iff $h(x) \in \delta_{\rho(i)}$
$M, h, \langle\rho, i\rangle \models_{Cr^{FO}} x_i = x_j$ iff $h(x_i) = h(x_j)$

As $Cr^{FO}$ has both explicit variables over individuals and implicit then-variables over points, we have two notions of sentencehood: a wff $\varphi$ of $Cr^{FO}$ is a then-sentence iff, evaluated at $\langle\rho, 0\rangle$, it only depends on the point $\rho(0)$. (The notion of then-sentence is thus parallel to the notion of sentence for $Cr$.) Furthermore, a $Cr^{FO}$ then-sentence is a $Cr^{FO}$ sentence iff it has no free individual variables. We say that a then-sentence is true at $\rho$ if $\langle\rho, 0\rangle$ makes it true. We may suppress $M$ and $h$ for brevity when that is safe to do. We will write $Cr^{FO}_{sent}$ for the then-sentence fragment of $Cr^{FO}$.

¹⁰ Another design choice is whether to add any sort of counterpart theory, cf. [Lewis, 1968]. Counterpart theory is often used to identify individuals at different points when point domains are disjoint, but can be added to any other kind of domain semantics as well. I will refrain from discussing counterpart theories altogether. See [Fara, 2008] and references therein for combining a counterpart theory with ‘now’ and ‘actually’.

Extending the standard translation of $Cr$ to a translation of $Cr^{FO}$ into a corresponding two-sortal first-order language is straightforward.

Now the stage is set. In the next section, we will show that the sentence fragment $Cr_{sent}$ is expressively equivalent to $ML$ by building an effective truth-preserving translation. Then in Section 5, we will define bisimulations for $ML^{FO}$, and on the basis of that show that $Cr^{FO}$ is genuinely more expressive than $ML^{FO}$. Finally, in Section 6, we will show which place $Cr^{FO}$ occupies in the expressive hierarchy of hybrid languages. Put together, we will have a theory of just how much expressive power adding “now” and “then” operators adds to a logic, and why that additional expressive power only arises in quantified modal logic, and crucially depends on models with infinite domains of individuals.

4 Eliminating then-operators in the propositional case

Special cases of eliminating backwards-looking operators in propositional modal systems have been discussed in the literature, cf. [Kamp, 1971], [Meyer, 2009]. In this section, we provide a truth-preserving translation from the sentence fragment $Cr_{sent}$ of $Cr$ into its underlying language $ML$ (or, actually, two such translations). The existence of such translations shows that when we add “now” and “then” to model natural language operators, in the propositional case it does not actually increase the expressive power of modal logic. As long as we do not have quantification over individuals, no increase in expressive power occurs, unlike in the case when instead of “now” and “then”, we introduce explicit quantification over worlds and times. (Of course, if we consider non-sentences of $Cr$, they are relative to more than a single point, and standard $ML$ cannot express such meanings. But such cases are not what is usually taken to justify the linguistic and philosophical practice of using explicit quantification over worlds and times.)

What “bound” $then_k$-operators in a Cr sentence do is shift the evaluation back to $\rho(k)$ introduced by some higher $\Diamond$. We will provide two translations allowing us to eliminate $then_k$ by bringing its argument into the immediate scope of the relevant $\Diamond$. One translation is local, and works by “floating” $then_k$ up one operator at a time until it reaches the level where we can eliminate $then_k$. The other translation is global, introducing at the level of the “binder” $\Diamond$ two disjoined cases, one for when $\phi$ is true at $\rho(k)$, another for when it is false. Both translations are costly: the first one in the worst case involves a length increase exponential in the length of the translated sentence; the second involves a length blow-up exponential in the number of $then_k\phi$ subformulas.

We start with the local translation as it allows us to better illustrate the working of the Cr system. The reader not interested in such illustrations may safely skip to Thm. 1 and its second proof (the global translation).

For the first, local translation, we want to “float” each $then_k\psi$ into the immediate scope of the $\Diamond$ that introduces the point $\rho(k)$ at which $then_k\psi$ is to be evaluated. Our first task then is to determine how we can transform Cr formulas while preserving their truth. Unlike in standard modal logic, with then-operators safety for substitution is determined relative to the evaluation sequence and the syntactic context in which substitution occurs. Thus $\phi$ may be safe to substitute for $\psi$ in wff $\xi_1$, but not in wff $\xi_2$. For a simple example, consider $\Diamond\, then_1\, p$ and $then_1\, p$. If both are evaluated at $\langle\rho, 0\rangle$, we can substitute $then_1\, p$ with just $p$ in the first formula, but not in the second. So we will need to get a handle on the cases where substitution is safe.

Many substitutions, however, are always safe. It is easy to check that all ML validities define valid substitutions in Cr: no matter the context, $(p \vee q) \wedge r$ is always equivalent to $(p \wedge r) \vee (q \wedge r)$ in Cr.
Similarly, the following equivalences hold regardless of the context, as can be easily checked from the truth clauses for Cr:

\[
\begin{array}{rcll}
\neg\, then_k(\phi) & \Leftrightarrow & then_k(\neg\phi) & (16) \\
then_k(\phi \wedge \psi) & \Leftrightarrow & (then_k\, \phi) \wedge (then_k\, \psi) & (17) \\
then_l(then_k(\phi)) & \Leftrightarrow & then_k(\phi) & (18)
\end{array}
\]

But none of those allows us to “float” a $then_k$ operator past a $\Diamond$. What we need is to determine when the following semi-equivalence ($\sim$) is valid:

\[ \Diamond(then_k(\phi) \wedge \psi) \sim then_k(\phi) \wedge \Diamond\psi \qquad (19) \]

In some cases, e.g. 20, a substitution that instantiates the schema in 19 results in an equivalent formula. But in other cases, e.g. 21, it does not. In fact, the left formula in 21 is a sentence, while the right one is not: it depends on $\rho(1)$, not only on $\rho(0)$.

\[
\begin{array}{rcll}
\Diamond\Diamond(then_0(p) \wedge q) & = & \Diamond(then_0(p) \wedge \Diamond q) & (20) \\
\Diamond(then_1(p) \wedge q) & \neq & then_1(p) \wedge \Diamond q & (21)
\end{array}
\]

For our purposes, the following simple case where the schema in 19 works will suffice:

Lemma 1 Let Cr sentence $\xi$ contain an occurrence of $\Diamond(then_k(\phi) \wedge \psi)$, where (1) $\phi$ contains no then operators, and (2) for index $i$ at which $\Diamond(then_k(\phi) \wedge \psi)$ would be interpreted in $\xi$, $k \neq (i + 1)$. Let $\xi'$ be the result of substituting that occurrence with $then_k(\phi) \wedge \Diamond\psi$. Then if $\xi'$ is a sentence, it is equivalent to $\xi$.

Proof Suppose that $\Diamond(then_k(\phi) \wedge \psi)$ is true at $\langle\rho, i\rangle$. Then first, there is a point $v$ s.t. $\rho(i)Rv$ and $\psi$ is true at $\rho' \sim_{i+1} \rho$ where $\rho'(i + 1) = v$. Second, $\phi$ is true at $\rho'(k)$, and as it contains no then operators, its truth does not depend on the rest of $\rho'$. As $k \neq (i + 1)$, we have $\rho'(k) = \rho(k)$, so $\phi$ is also true at $\rho(k)$, making the first conjunct of $then_k(\phi) \wedge \Diamond\psi$ true at $\langle\rho, i\rangle$. The existence of $v$ makes the second conjunct true as well. In the other direction, the equivalence is as easy. □

For the translation, we'll need one more simple fact: when $then_k\phi$ is in a sentence, and it would be evaluated at index $k$ when the sentence is evaluated at some $\langle\rho, 0\rangle$, then $then_k$ can be safely eliminated: it simply shifts the interpretation index from $k$ to itself.
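The equivalences 16 to 18 and the contrast between 20 and 21 can be checked by brute force over small models. The following is a minimal sketch in Python, assuming the truth clauses for $\Diamond$ and $then_k$ as they are used in the proof of Lemma 1 ($\Diamond$ overwrites position $i+1$ of the sequence and moves there; $then_k$ only shifts the index to $k$). The tuple encoding of formulas and the names `holds` and `equivalent` are illustrative assumptions, not part of the Cr system.

```python
from itertools import product

# Formulas as nested tuples (an illustrative encoding, not the paper's):
#   ("atom", name), ("not", f), ("and", f, g), ("dia", f), ("then", k, f)

def holds(R, V, f, rho, i):
    """Truth of f at the pointed sequence <rho, i>; R is a set of pairs, V maps atoms to sets."""
    tag = f[0]
    if tag == "atom":
        return rho[i] in V[f[1]]
    if tag == "not":
        return not holds(R, V, f[1], rho, i)
    if tag == "and":
        return holds(R, V, f[1], rho, i) and holds(R, V, f[2], rho, i)
    if tag == "dia":
        # Overwrite position i+1 with an R-successor of rho(i), then move to index i+1.
        return any(holds(R, V, f[1], rho[:i + 1] + (v,) + rho[i + 2:], i + 1)
                   for (u, v) in R if u == rho[i])
    if tag == "then":
        # then_k only shifts the evaluation index to k.
        return holds(R, V, f[2], rho, f[1])
    raise ValueError(tag)

def subsets(xs):
    """All subsets of the finite sequence xs, as sets."""
    return [{x for x, keep in zip(xs, bits) if keep}
            for bits in product((0, 1), repeat=len(xs))]

def equivalent(f, g, indices=(0,), points=(0, 1), length=4):
    """Compare f and g at every <rho, i>, i in indices, in all models on `points` (atoms p, q)."""
    for R in subsets(list(product(points, repeat=2))):
        for Vp, Vq in product(subsets(points), repeat=2):
            V = {"p": Vp, "q": Vq}
            for rho in product(points, repeat=length):
                for i in indices:
                    if holds(R, V, f, rho, i) != holds(R, V, g, rho, i):
                        return False
    return True

def then(k, f): return ("then", k, f)
def dia(f): return ("dia", f)
def conj(f, g): return ("and", f, g)
def neg(f): return ("not", f)

p, q = ("atom", "p"), ("atom", "q")

# (16)-(18): context-independent, so they are tested at several indices.
eq16 = equivalent(neg(then(1, p)), then(1, neg(p)), indices=(0, 1, 2))
eq17 = equivalent(then(1, conj(p, q)), conj(then(1, p), then(1, q)), indices=(0, 1, 2))
eq18 = equivalent(then(2, then(1, p)), then(1, p), indices=(0, 1, 2))
# (20) instantiates schema (19) safely; (21) does not.
eq20 = equivalent(dia(dia(conj(then(0, p), q))), dia(conj(then(0, p), dia(q))))
eq21 = equivalent(dia(conj(then(1, p), q)), conj(then(1, p), dia(q)))
```

The check is exhaustive only for two-point models, so it illustrates the equivalences rather than proving them; for 21, a two-point model already suffices as a counterexample.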
Now we are ready to prove the following by building a local translation that floats each $then_k$ up one step at a time until it is eliminated:

Theorem 1 (Translation from $Cr_{sent}$ into ML) For each sentence $\xi \in$ Cr, there is a $\xi' \in$ ML such that $M, \langle\rho, 0\rangle \models_{Cr} \xi$ iff $M, \rho(0) \models_{ML} \xi'$. Moreover, $\xi'$ can be effectively computed from an arbitrary $\xi \in Cr_{sent}$.

Proof of Thm. 1 (local translation) We define a (local, bottom-up) translation from an arbitrary $\xi$ to $\xi'$. When $\xi$ is a sentence, all $then_k$ operators return evaluation to a point introduced by a higher $\Diamond$. If we float each $then_k(\phi)$ up into the immediate scope of that $\Diamond$, we can then safely eliminate $then_k$. We start with an arbitrarily chosen $then_k$ that has no other $then_l$ in its scope, and float it one $\Diamond$ up. After each such step, we check if we can eliminate the moved $then_k$ because we reached the immediate scope of the $\Diamond$ that $then_k$ referred back to. If the check is positive, we eliminate that occurrence of $then_k$. (Before the first step, of course, we need to check if we can eliminate any $then_k$ right away.)

Fix such a $then_k\phi$ where $\phi$ does not contain then-operators. That $then_k\phi$ would be within a subformula $\Diamond(\ldots then_k\phi \ldots)$ where the scope of $\Diamond$ is a non-modal formula. (If there is no such $\Diamond$, then our $k = 0$, and we can eliminate $then_k$ right away.) If in $(\ldots then_k\phi \ldots)$, our $then_k\phi$ is embedded under another $then_l$, we apply the equivalences for negation and then-distribution, 16 and 17, to get to a configuration $then_l\, then_k\phi$. At this point, we apply 18 to eliminate $then_l$. If there are more higher then operators, we repeat the procedure until we transform $\Diamond(\ldots then_k\phi \ldots)$ so that there are no other $then_l$ between $then_k$ and $\Diamond$.

Then we normalize the resulting non-modal formula into disjunctive normal form, treating all then-formulas as propositional variables. After that, we apply the modal equivalence $\Diamond(\phi \vee \psi) \Leftrightarrow \Diamond\phi \vee \Diamond\psi$, and obtain a subformula where $then_k\phi$ may only be embedded under $\neg$ or $\wedge$. For $\neg$, we apply 16.
Embedding under $\wedge$ does not prevent us from applying Lemma 1; in fact, if $then_k\phi$ is not embedded under $\wedge$, we need to apply the equivalence $\psi \Leftrightarrow (\psi \wedge \top)$ to create a subformula that meets the conditions of Lemma 1. Finally, we move $then_k\phi$ over $\Diamond$ using that lemma. If $then_k$ may now be eliminated, we do that, and otherwise we repeat. Reapplying the same procedure, we can always select a then-formula to be moved one $\Diamond$ up, so eventually we will be able to eliminate all of them, obtaining an ML formula $\xi'$ as desired. □

Here is how the translation works in one particular case:

\[
\begin{array}{ll}
\Diamond\Diamond(p \wedge then_0(q \vee \neg\, then_1\, r)) & \\
\Diamond\Diamond(p \wedge (then_0\, q \vee then_0\, \neg\, then_1\, r)) & \text{(17) \& (16)} \\
\Diamond\Diamond(p \wedge (then_0\, q \vee then_0\, then_1\, \neg r)) & \text{(16)} \\
\Diamond\Diamond(p \wedge (then_0\, q \vee then_1\, \neg r)) & \text{(18)} \\
\Diamond\Diamond((p \wedge then_0\, q) \vee (p \wedge then_1\, \neg r)) & \to \text{DNF} \\
\Diamond(\Diamond(p \wedge then_0\, q) \vee \Diamond(p \wedge then_1\, \neg r)) & \Diamond(\phi \vee \psi) \Leftrightarrow \Diamond\phi \vee \Diamond\psi \\
\Diamond(\Diamond(p \wedge then_0\, q) \vee (then_1\, \neg r \wedge \Diamond p)) & \text{Lemma 1} \\
\Diamond(\Diamond(p \wedge then_0\, q) \vee (\neg r \wedge \Diamond p)) & then_1 \text{ elimination} \\
\Diamond((then_0\, q \wedge \Diamond p) \vee (\neg r \wedge \Diamond p)) & \text{Lemma 1} \\
\Diamond(then_0\, q \wedge \Diamond p) \vee \Diamond(\neg r \wedge \Diamond p) & \Diamond(\phi \vee \psi) \Leftrightarrow \Diamond\phi \vee \Diamond\psi \\
(then_0\, q \wedge \Diamond\Diamond p) \vee \Diamond(\neg r \wedge \Diamond p) & \text{Lemma 1} \\
(q \wedge \Diamond\Diamond p) \vee \Diamond(\neg r \wedge \Diamond p) & then_0 \text{ elimination}
\end{array}
\]

The complexity of the translation is high because of the normalization step, which in the worst case leads to an exponential blow-up of the formula length. If, on the other hand, we apply the global translation defined below instead of the local translation above, exponential formula growth is guaranteed, but only in the number of $then_k\phi$ subformulas:

Proof of Thm. 1 (global translation) Consider Cr sentence $\ldots\Diamond(\psi(then_k\phi))\ldots$, where (1) $\phi$ is a formula of ML; (2) $\psi(then_k\phi)$ is a formula of Cr (thus possibly containing more then operators); and (3) the shown $\Diamond$ is the one to which $then_k\phi$ refers back. When that sentence is evaluated, the $\Diamond$ introduces the point $\rho(k)$. As $\phi$ contains no then-operators, it is $\rho(k)$ alone that determines whether $then_k\phi$ will amount to $\top$ or $\bot$; the rest of the sequence is irrelevant.
We thus have two cases, one where $then_k\phi = \top$ and another where $then_k\phi = \bot$ relative to $\rho(k)$. In the first case, $\psi(then_k\phi)$ amounts to $\psi(\top)$, and in the second case, it amounts to $\psi(\bot)$. The original sentence is thus equivalent to the following: $\ldots\Diamond((\phi \wedge \psi(\top)) \vee (\neg\phi \wedge \psi(\bot)))\ldots$ As in the local translation case, for an arbitrary Cr sentence $\xi$, we can repeat this procedure, each time selecting $then_k\phi$ with $\phi$ not containing other then operators, and eventually we obtain $\xi' \in$ ML. □

Here is an example of the global translation applied to the same Cr sentence as above (with some simplification steps added for readability):

\[
\begin{array}{l}
\Diamond\Diamond(p \wedge then_0(q \vee \neg\, then_1\, r)) \\
\quad then_1 \text{ elimination} \\
\Diamond[(r \wedge \Diamond(p \wedge then_0(q \vee \neg\top))) \vee (\neg r \wedge \Diamond(p \wedge then_0(q \vee \neg\bot)))] \\
\quad \text{simplifying using propositional validities} \\
\Diamond[(r \wedge \Diamond(p \wedge then_0\, q)) \vee (\neg r \wedge \Diamond p)] \\
\quad then_0 \text{ elimination} \\
(q \wedge \Diamond[(r \wedge \Diamond(p \wedge \top)) \vee (\neg r \wedge \Diamond p)]) \vee (\neg q \wedge \Diamond[(r \wedge \Diamond(p \wedge \bot)) \vee (\neg r \wedge \Diamond p)]) \\
\quad \text{simplifying using propositional validities} \\
(q \wedge \Diamond\Diamond p) \vee (\neg q \wedge \Diamond[\neg r \wedge \Diamond p])
\end{array}
\]

Each elimination of $then_k\phi$ leads to a roughly 2-fold increase of the substituted $\psi(then_k\phi)$, and we need as many such operations as there are distinct $then_k\phi$ subformulas in $\xi$. The translation is thus exponential in the number of then-operators. It would be interesting to know whether a less-than-exponential translation can be given, but very simple translations are unlikely to exist.11

To sum up, as non-sentences of Cr contain implicit free variables over points, it is trivial to find a Cr formula that cannot be expressed in ML. For instance, $p \wedge then_1(\neg p)$ can distinguish a model that contains a $p$-point and a non-$p$ point not connected by the accessibility relation, while ML cannot do that, as a simple bisimulation argument can show. But if we only consider the fragment $Cr_{sent}$, where the $then_k$ only work as genuine backwards-looking operators, then Thm. 1 shows that the extra operators do not increase the range of meanings that the language can express.
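The global translation can be sketched as a small program. The following Python sketch is one possible reading of the procedure in the second proof of Thm. 1, not a definitive implementation: it repeatedly picks a $then_k\phi$ with then-free $\phi$, walks up to the nearest enclosing $\Diamond$ evaluated at index $k-1$ (or to the top of the sentence when $k = 0$), and replaces the scope $\psi$ there by $(\phi \wedge \psi(\top)) \vee (\neg\phi \wedge \psi(\bot))$. The tuple encoding and the names `holds`, `replace`, `shannon`, `step`, `to_ml` and `agree` are assumptions made for illustration, and no simplification of the output is attempted.

```python
from itertools import product

TOP, BOT = ("top",), ("bot",)

def holds(R, V, f, rho, i):
    """Truth of f at <rho, i>, with the clauses for dia and then_k as used in Lemma 1."""
    tag = f[0]
    if tag == "atom": return rho[i] in V[f[1]]
    if tag in ("top", "bot"): return tag == "top"
    if tag == "not": return not holds(R, V, f[1], rho, i)
    if tag == "and": return holds(R, V, f[1], rho, i) and holds(R, V, f[2], rho, i)
    if tag == "or": return holds(R, V, f[1], rho, i) or holds(R, V, f[2], rho, i)
    if tag == "dia":
        return any(holds(R, V, f[1], rho[:i + 1] + (v,) + rho[i + 2:], i + 1)
                   for (u, v) in R if u == rho[i])
    return holds(R, V, f[2], rho, f[1])  # then_k

def has_then(f):
    return f[0] == "then" or any(has_then(g) for g in f[1:] if isinstance(g, tuple))

def replace(f, i, k, phi, value):
    """In f (evaluated at index i), put `value` for each then_k(phi) that still sees rho(k)."""
    tag = f[0]
    if tag == "then":
        if f[1] == k and f[2] == phi:
            return value
        return ("then", f[1], replace(f[2], f[1], k, phi, value))
    if tag == "dia":  # a diamond evaluated at index k-1 overwrites rho(k): stop there
        return f if i + 1 == k else ("dia", replace(f[1], i + 1, k, phi, value))
    if tag == "not":
        return ("not", replace(f[1], i, k, phi, value))
    if tag in ("and", "or"):
        return (tag, replace(f[1], i, k, phi, value), replace(f[2], i, k, phi, value))
    return f

def shannon(body, k, phi):
    """(phi & body[top]) | (~phi & body[bot]), where body is evaluated at index k."""
    return ("or", ("and", phi, replace(body, k, k, phi, TOP)),
                  ("and", ("not", phi), replace(body, k, k, phi, BOT)))

def step(f, i):
    """Find one then-free then_k(phi) in f and expand it at its binder, if the binder is in f."""
    tag = f[0]
    if tag == "then":
        if not has_then(f[2]):
            return ("need", f[1], f[2])
        res, rebuild = step(f[2], f[1]), (lambda g: ("then", f[1], g))
    elif tag == "dia":
        res, rebuild = step(f[1], i + 1), (lambda g: ("dia", g))
        if res[0] == "need" and res[1] == i + 1:  # this diamond introduced rho(k)
            return ("done", ("dia", shannon(f[1], i + 1, res[2])))
    elif tag == "not":
        res, rebuild = step(f[1], i), (lambda g: ("not", g))
    elif tag in ("and", "or"):
        res, rebuild = step(f[1], i), (lambda g: (tag, g, f[2]))
        if res[0] == "none":
            res, rebuild = step(f[2], i), (lambda g: (tag, f[1], g))
    else:
        return ("none",)
    return ("done", rebuild(res[1])) if res[0] == "done" else res

def to_ml(f):
    """Eliminate all then-operators from a sentence f evaluated at index 0."""
    while True:
        res = step(f, 0)
        if res[0] == "none":
            return f
        if res[0] == "done":
            f = res[1]
        elif res[1] == 0:  # then_0 refers to the evaluation point itself
            f = shannon(f, 0, res[2])
        else:
            raise ValueError("then_%d has no binding diamond: not a sentence" % res[1])

def agree(f, g, points=(0, 1)):
    """Compare f and g at <rho, 0> in every model on `points` with atoms p, q, r."""
    subsets = lambda xs: [{x for x, b in zip(xs, bits) if b}
                          for bits in product((0, 1), repeat=len(xs))]
    for R in subsets(list(product(points, repeat=2))):
        for vals in product(subsets(points), repeat=3):
            V = dict(zip("pqr", vals))
            if any(holds(R, V, f, (w,) * 3, 0) != holds(R, V, g, (w,) * 3, 0) for w in points):
                return False
    return True

p, q, r = ("atom", "p"), ("atom", "q"), ("atom", "r")
# The sentence of the worked example: dia dia (p & then_0(q | ~then_1 r))
xi = ("dia", ("dia", ("and", p, ("then", 0, ("or", q, ("not", ("then", 1, r)))))))
ml = to_ml(xi)
```

On the worked example, the first expansion happens at the outer $\Diamond$ (for $then_1\, r$) and the remaining ones at the top level (for $then_0$), mirroring the derivation above; `agree(xi, ml)` then compares the input and the output on all two-point models.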
In the next section we will see that once we move from Cr to $Cr^{FO}$, that will change: adding backwards-looking operators to quantified (i.e. first-order) modal logic leads to a genuine increase in expressivity even within the sentence fragment.

11 See [ten Cate, 2005, Prop. 3.3.3], who shows that there is no polynomial normalization for hybrid @-operators, close cousins of our then-operators (cf. Sect. 6 on the relation between the two kinds).

5 Bisimulation for quantified modal logic

Since [Kamp, 1971], it is known that backwards-looking operators are not eliminable in quantified modal logic: Kamp presents a sentence of $ML^{FO}$ + now that has no equivalent now-less sentence. Our task in this section is not just to prove that $Cr^{FO}$ is more expressive than $ML^{FO}$ (Kamp's proof is sufficient for that), but rather to pin down the exact amount of new expressivity which $then_k$ operators bring in when added to quantified modal logic.

We will use a standard tool that modern modal logic uses for studying the expressivity of modal languages: bisimulations. A bisimulation corresponding to a particular modal language L is a relation between the domains of two L-models such that if two points are bisimilar, then they are indistinguishable by any formula of L. Bisimulation may be informally thought of as a relaxed version of isomorphism. Two isomorphic models cannot be distinguished no matter what. Two bisimilar models, under a fixed notion of bisimulation, cannot be distinguished by a particular logical language, though in a more expressive language we may be able to tell them apart.

Thus with a suitable notion of bisimulation in hand, it becomes easy to prove expressivity results. For instance, after we show that all bisimilar points, under a fixed notion of bisimulation, are indistinguishable by language A, it suffices to show that language B can distinguish some such points to prove that B is more expressive.
For a textbook-level review of bisimulations for propositional modal logic, see [Blackburn et al., 2001, Ch. 2]. Defining appropriate notions of bisimulation for richer propositional modal languages has become a routine step in modal-logical model-theoretic investigations (cf., e.g., [Areces et al., 2001], [ten Cate, 2005], [Areces et al., 2011]). But what we need to pin down the difference between $ML^{FO}$ and $Cr^{FO}$ is a notion of bisimulation for a first-order modal language, and to my knowledge, such a notion so far has not been introduced in the literature. It will thus be worth spending some time on how exactly we can arrive at the right notion. Consider standard propositional bisimulation first (the reader interested in the new results may skip directly to Def. 7 and 8, and Thm. 2):

Definition 5 (Bisimulation for ML) A bisimulation $E$ between two Kripke models $M$ and $N$ is a non-empty relation in $W^M \times W^N$ with the following properties:
Propositional Harmony: If $wEw'$, then for any propositional symbol $p$, $M, w \models p$ iff $N, w' \models p$.
Zig: If $wEw'$ and $wR^M v$, then there is $v'$ such that $w'R^N v'$ and $vEv'$.
Zag: If $wEw'$ and $w'R^N v'$, then there is $v$ such that $wR^M v$ and $vEv'$.

Points $w \in M$ and $v \in N$ are called bisimilar if there is a bisimulation $E$ such that $wEv$. Models $M$ and $N$ are called bisimilar if there exists a bisimulation between them.

It is not hard to see why any two bisimilar points must be indistinguishable in ML. Suppose that we need to find out whether we are at $w \in M$ or at $v \in N$, with $w$ bisimilar to $v$, and that our only way of getting information is by testing ML formulas for truth at our current point. If we check the truth of propositional formulas, by Propositional Harmony and easy induction the results will be the same at $w$ and $v$, so that doesn't help. Now suppose we are actually at $w$, and we check if $\Diamond\phi$ is true. If it is, there is some accessible $w'$ where $\phi$ is true. But then by Zig, in $N$ there is also an accessible $v'$ bisimilar to $w'$ where $\phi$ is true.
By induction on $\phi$, we will never find out whether we are at $w$ or at $v$. Thus ML is invariant under bisimulation. (Again, consider [Blackburn et al., 2001, Ch. 2] for formal proofs.)

Bisimulation is much more relaxed than isomorphism. E.g., the following models are bisimilar, though clearly not isomorphic:

Example 1 Bisimilar, but not isomorphic models

[Figure: model $M$ consists of a single point $w$; model $N$ consists of two points $v_1$ and $v_2$ with arrows between them.]

ML cannot distinguish $M$ and $N$ of Ex. 1, but first-order logic can: the formula $\exists u_2(u_1 R u_2 \wedge u_1 \neq u_2)$ is false at $w$ and is true at $v_1$ and $v_2$. So while ML is invariant under bisimulations, its corresponding FO language is not. The corresponding language is thus more expressive.

What should the notion of bisimulation appropriate for $ML^{FO}$ look like? It is clear that Zig and Zag from the propositional case should be preserved. It is also clear that instead of requiring Propositional Harmony, we need to at the very least require “FOL harmony”: any bisimilar points should have the same non-modal theories (that is, they should make true exactly the same sets of formulas without modal operators). This leads us to the notion of FOL bisimulation. As we will see shortly, this notion is not yet quite adequate, but nevertheless it is useful as a first approximation:

Definition 6 (FOL bisimulation) A FOL bisimulation $E$ between two first-order Kripke models $M$ and $N$ is a non-empty relation in $W^M \times W^N$ with the following properties:
FOL Harmony: If $wEw'$, then for any $\phi \in$ FOL, $M, w \models \phi$ iff $N, w' \models \phi$.
Zig: If $wEw'$ and $wR^M v$, then there is $v'$ such that $w'R^N v'$ and $vEv'$.
Zag: If $wEw'$ and $w'R^N v'$, then there is $v$ such that $wR^M v$ and $vEv'$.

Note that $\phi$ may contain free variables. Thus for any tuple $\bar{a}$ of individuals at point $w$, at any bisimilar $w'$ there should be a corresponding tuple $\bar{b}$ making precisely the same non-modal formulas true. However, FOL bisimulation does not ensure that such corresponding tuples would make the same sets of modal formulas true.
Example 2 Mismatch of individuals

[Figure: model $M$, with an arrow from $w$ to $v$; model $N$, with an arrow from $w'$ to $v'$. Extensions of $q$:
  $w$: $q$: $a$; $\neg q$: $b$
  $v$: $q$: $a$; $\neg q$: $b$
  $w'$: $q$: $c$; $\neg q$: $d$
  $v'$: $q$: $d$; $\neg q$: $c$]

Consider relation $E = \{\langle w, w'\rangle, \langle v, v'\rangle\}$ between $M$ and $N$ from Ex. 2. At all four points in the two models, the non-modal formulas $\exists x\, q(x)$ and $\exists x\, \neg q(x)$ are true, and it is easy to see that FOL harmony is satisfied. Furthermore, Zig and Zag are also satisfied. Relation $E$ is thus a FOL bisimulation. But the $ML^{FO}$ formula $\exists x(q(x) \wedge \Diamond q(x))$ is true at $w$ in $M$, but false at $w'$ in $N$.

FOL bisimulation ensures that the internal FOL-theories of bisimilar points are the same, but it does not require that “harmony between individuals” holds across points. That is why we could easily distinguish between $M$ and $N$ from Ex. 2: we used the fact that for $a$ at $w$, there is no corresponding $a'$ at $w'$ which would satisfy exactly the same modal formulas in one free individual variable.

To define a proper notion of bisimulation for $ML^{FO}$, we need to make sure that for each tuple of individuals at a point, at a bisimilar point there is a corresponding tuple which makes exactly the same $ML^{FO}$ formulas true. It suffices that the correspondent make exactly the same non-modal formulas true at each modal path:

Definition 7 (Modal paths) A modal path is a finite string of diamonds from the language. For $w_1$, $w_2$ points in model $M$, a non-empty path $\pi = \Diamond_{i_1} \ldots \Diamond_{i_n}$ leads from $w_1$ to $w_2$ (in symbols, $w_1 \pi w_2$) iff there exist points $v_{i_1}, \ldots, v_{i_{n-1}}$ s.t. $w_1 R_{i_1} v_{i_1} \wedge \ldots \wedge v_{i_{n-1}} R_{i_n} w_2$. For the empty modal path $\Lambda$, by definition, $\forall w : w \Lambda w$.
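The mismatch in Ex. 2 can be made concrete with a few lines of code. The following Python sketch encodes the two models, evaluates the formulas mentioned above, and computes for each individual a crude per-path “profile” in the spirit of Def. 7. It is a simplification under stated assumptions: there is a single accessibility relation, quantifiers are possibilist, the profile looks only at the atomic formula $q(x)$ rather than at all non-modal formulas, and the dictionary encoding and the names `holds`, `reachable` and `profile` are illustrative.

```python
# Ex. 2 as data: successors of each point, extension of q at each point, and the domain.
M = {"R": {"w": ["v"], "v": []}, "q": {"w": {"a"}, "v": {"a"}}, "D": {"a", "b"}}
N = {"R": {"w'": ["v'"], "v'": []}, "q": {"w'": {"c"}, "v'": {"d"}}, "D": {"c", "d"}}

def holds(model, f, point, h):
    """Truth of an ML^FO formula at a point under assignment h (possibilist quantifier)."""
    tag = f[0]
    if tag == "q":
        return h[f[1]] in model["q"][point]
    if tag == "not":
        return not holds(model, f[1], point, h)
    if tag == "and":
        return holds(model, f[1], point, h) and holds(model, f[2], point, h)
    if tag == "dia":
        return any(holds(model, f[1], v, h) for v in model["R"][point])
    if tag == "exists":
        return any(holds(model, f[2], point, {**h, f[1]: d}) for d in model["D"])
    raise ValueError(tag)

qx = ("q", "x")
some_q = ("exists", "x", qx)                              # non-modal witness formulas
some_not_q = ("exists", "x", ("not", qx))
separating = ("exists", "x", ("and", qx, ("dia", qx)))    # the ML^FO formula from the text

def reachable(model, point, n):
    """Points to which the path of n diamonds leads from `point`."""
    layer = {point}
    for _ in range(n):
        layer = {v for u in layer for v in model["R"][u]}
    return layer

def profile(model, point, d, depth=1):
    """For each path length up to `depth`: the truth values q(d) takes at the path's endpoints."""
    return tuple(frozenset(d in model["q"][v] for v in reachable(model, point, n))
                 for n in range(depth + 1))

fol_witnesses_agree = all(holds(m, f, pt, {})
                          for m in (M, N) for f in (some_q, some_not_q) for pt in m["R"])
a_has_match = profile(M, "w", "a") in {profile(N, "w'", d) for d in N["D"]}
```

Here `fol_witnesses_agree` records that the two witness formulas hold at all four points, while `a_has_match` records whether some individual at $w'$ has the same atomic profile as $a$ at $w$ along the empty path and the path $\Diamond$.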
Definition 8 (FOL path bisimulation) A FOL path bisimulation $E$ between two first-order Kripke models $M$ and $N$ is a non-empty relation in $W^M \times W^N$ with the following properties:
FOL path harmony: (i) If $wEw'$, then for any finite tuple $\bar{a}$ in $D^M$, there is $\bar{b}$ in $D^N$ such that for any modal path $\pi$, if $\exists v \in W^M$ to which path $\pi$ leads from $w$, then $\exists v' \in W^N$ such that $w' \pi v'$ and for any formula $\phi \in$ FOL, $M, v \models \phi[\bar{a}]$ iff $N, v' \models \phi[\bar{b}]$. Similarly for any $\bar{b}$ at $w'$ in $N$. We write $\bar{a} \leftrightsquigarrow \bar{b}$ for such correspondent tuples. (ii) When $\bar{a} \leftrightsquigarrow \bar{b}$ at $w$ and $w'$, it must be possible to extend those tuples to corresponding $(\bar{a}, a_1) \leftrightsquigarrow (\bar{b}, b_1)$.
Zig: If $wEw'$ and $wR^M v$, then there is $v'$ such that $w'R^N v'$ and $vEv'$.
Zag: If $wEw'$ and $w'R^N v'$, then there is $v$ such that $wR^M v$ and $vEv'$.

Returning to Ex. 2, we can see that $w$ and $w'$ are FOL-bisimilar, but not FOL-path-bisimilar. There is no individual at $w'$ that could be a FOL-path-harmony correspondent of $a$ from $w$: $c$ is no good because there is no point accessible by the path $\Diamond$ where $c$ satisfies $q(x)$, while $d$ does not satisfy $q(x)$ at the empty path $\Lambda$.

Theorem 2 If $E$ is a FOL path bisimulation between $M$ and $N$, and $wEw'$ for $w \in M$, $w' \in N$, then for any $\phi$ of $ML^{FO}$, there is a tuple $\bar{a}$ such that $M, w \models \phi[\bar{a}]$ iff there exists a tuple $\bar{b}$ such that $N, w' \models \phi[\bar{b}]$.

Proof Suppose towards contradiction that there exists $\phi$ such that there is $\bar{a}$ for which $M, w \models \phi[\bar{a}]$, but for all $\bar{b}$, $N, w' \not\models \phi[\bar{b}]$. We fix some FOL-path-correspondent $\bar{b}$ of $\bar{a}$, and thus have a pair of correspondents only one of which makes $\phi$ true. The proof goes by gradually disassembling $\phi$ so that we can finally derive a contradiction at the level of non-modal formulas. There are the following cases: $\phi = \neg\phi'$, $\phi = (\phi' \wedge \phi'')$, $\phi = \forall x\phi'$, and $\phi = \Diamond\phi'$.

If $\phi = \neg\phi'$, we have $\bar{a}$ for which $M, w \not\models \phi'[\bar{a}]$, but for its correspondent $\bar{b}$ that we fixed, $N, w' \models \phi'[\bar{b}]$. Exchanging the roles of $\bar{a}$ and $\bar{b}$, we can now consider $\phi'$.

For $\phi = \phi' \wedge \phi''$, we have $M, w \models \phi' \wedge \phi''[\bar{a}]$, but $N, w' \not\models \phi' \wedge \phi''[\bar{b}]$ where $\bar{a} \leftrightsquigarrow \bar{b}$.
That means that either $N, w' \not\models \phi'[\bar{b}']$ for a restriction $\bar{b}'$ of $\bar{b}$ to the individuals substituted into $\phi'$, or similarly $N, w' \not\models \phi''[\bar{b}'']$.

We can show that the restrictions $\bar{a}'$ and $\bar{b}'$ must be correspondents just as $\bar{a}$ and $\bar{b}$. Suppose that is not so, and $\bar{a}' \not\leftrightsquigarrow \bar{b}'$. Then by definition of FOL path harmony, there are some $\pi$ and (non-modal) $\psi$ such that $M, w \models \pi\psi[\bar{a}']$, but $N, w' \not\models \pi\psi[\bar{b}']$. Without loss of generality, let $\bar{a}'$ be the initial segment of $\bar{a}$, and let there be $n$ elements in the non-$\bar{a}'$ part of $\bar{a}$. Then we can build formula $\xi := \psi \wedge (p(x_1) \vee \neg p(x_1)) \wedge \ldots \wedge (p(x_n) \vee \neg p(x_n))$. As we only added tautologies to $\psi$, we have that $M, w \models \pi\xi[\bar{a}]$, but $N, w' \not\models \pi\xi[\bar{b}]$. But that is contrary to the assumption that $\bar{a} \leftrightsquigarrow \bar{b}$. Thus all restrictions of corresponding tuples are also FOL-path-correspondents.

Returning to $\phi' \wedge \phi''$, we note that either there are correspondent restrictions $\bar{a}'$ and $\bar{b}'$ of $\bar{a}$ and $\bar{b}$ which disagree on $\phi'$, or similarly for $\phi''$. We then consider $\phi'$ and $\phi''$.

If $\phi = \forall x\phi'$, we have $M, w \models \forall x\phi'[\bar{a}]$, but $N, w' \not\models \forall x\phi'[\bar{b}]$. We pick some extension $(\bar{b}, b_1)$ such that $N, w' \not\models \phi'[(\bar{b}, b_1)]$. By clause (ii) of FOL path harmony, we should be able to extend $\bar{a}$ to some $(\bar{a}, a_1)$ that is correspondent to $(\bar{b}, b_1)$. As we have $M, w \models \phi'[(\bar{a}, a_1)]$ for any $a_1$, we now consider $\phi'$, $(\bar{a}, a_1)$, and $(\bar{b}, b_1)$.

When $\phi = \Diamond\phi'$, we move to $\phi'$ and $R$-accessible $v \in M$ and $v' \in N$ thanks to the Zig and Zag conditions. And finally, when $\phi$ is non-modal, and we have $M, w \models \phi[\bar{a}]$, but $N, w' \not\models \phi[\bar{b}]$, that directly contradicts FOL path harmony given that $\bar{a} \leftrightsquigarrow \bar{b}$. □

It follows from Thm. 2 that when two points are FOL-path-bisimilar, then they are indistinguishable in $ML^{FO}$. The converse of Thm. 2 does not hold in general: as is well known, the converse fails for propositional ML, and that result carries over to $ML^{FO}$.
Thus there can be $ML^{FO}$-models that are indistinguishable in the language, but nevertheless not FOL-path-bisimilar.12

Note that whether two models are bisimilar depends on the particular language used. E.g., $M$ and $N$ from Ex. 3 are FOL-path-bisimilar if identity is not in the language, and are not FOL-path-bisimilar if identity is included.

12 However, we can provide an analogue of the Hennessy-Milner theorem that states that the converse holds for a particular class of models. For ML, that is the class of image-finite models: those where every point has only a finite number of $R$-successors for each $R$. The propositional proof shows that the relation of modal equivalence is itself a bisimulation in this case. The condition of image-finiteness allows the following argument to go through: suppose that $w$ and $w'$ are modally equivalent, but for some $v : wRv$, there is no modally equivalent $v' : w'Rv'$. Then for each $u'_i : w'Ru'_i$, there is $\phi_i$ s.t. $v \models \phi_i$, but $u'_i \not\models \phi_i$. As the set of all $u'_i$ is finite, we can build formula $\Diamond\bigwedge_i \phi_i$, which is true at $w$ thanks to the existence of $v$, but is false at $w'$. This is contrary to assumption. In the case of $ML^{FO}$, we need not only the assumption of finiteness for successor points, but also for domains of individuals. The argument for individuals would be along the following lines. Suppose that there are indistinguishable $w$ and $w'$ where $\bar{a}$ at $w$ has no correspondent $\bar{b}$ at $w'$. Then we collect all pairs of $\pi$ and $\phi$ that witness that a particular $\bar{b}$ does not correspond to $\bar{a}$, and as there is only a finite number of distinct $\bar{b}$s, we can collect them into one large formula $\exists\bar{x} \bigwedge_i \pi_i\phi_i(\bar{x})$. At $w$, tuple $\bar{a}$ ensures that this formula is true, but at $w'$ by construction there is no $\bar{b}$ that would witness that. But then $w$ and $w'$ have different $ML^{FO}$ theories, contrary to assumption. When the number of distinct tuples is not finite, we cannot gather all $\pi$ and $\phi$ into a single formula, hence the converse to Thm. 2 would not hold in such a case.
Example 3 FOL-path-bisimilar models distinguishable by $Cr^{FO}$

[Figure: model $M$, with arrows from $w$ to $u$ and from $w$ to $v$; model $N$, with arrows from $w'$ to $u'$ and from $w'$ to $v'$. Extensions of $q$:
  $w$: $q$: $a, b, c$; $\neg q$: $d$
  $u$: $q$: $a, b, c$; $\neg q$: $d$
  $v$: $q$: $a, b$; $\neg q$: $c, d$
  $w'$: $q$: $a', b', c'$; $\neg q$: $d'$
  $u'$: $q$: $a', c'$; $\neg q$: $b', d'$
  $v'$: $q$: $a', b'$; $\neg q$: $c', d'$]

When there is no identity in the language, we have individuals of just three kinds at both $w$ and $w'$, and it's easy to check that $M, w$ and $N, w'$ are FOL-path-bisimilar:

  $a, b$; $a'$:       $q(x) \wedge \Diamond q(x) \wedge \neg\Diamond\neg q(x)$
  $c$; $b', c'$:      $q(x) \wedge \Diamond q(x) \wedge \Diamond\neg q(x)$
  $d$; $d'$:          $\neg q(x) \wedge \Diamond\neg q(x) \wedge \neg\Diamond q(x)$

But if identity were in the language, then, e.g., $a$ and $a'$ would not have been FOL-path-harmony correspondents: formula $\exists y\exists z(x \neq y \wedge x \neq z \wedge y \neq z \wedge \Diamond(q(x) \wedge q(y) \wedge q(z)))$ is made true by $a$, but not by $a'$.

Assuming a language without identity, FOL-path-bisimilar $w$ and $w'$ from Ex. 3 cannot be distinguished by $ML^{FO}$ by Thm. 2. From the fact that $Cr^{FO}$ sentence 22 is true at $w$ and false at $w'$, we immediately derive the expressivity result in Prop. 2.

\[ \Diamond\forall x(now(q(x)) \to q(x)) \qquad (22) \]

Proposition 2 $ML^{FO} \subsetneq Cr^{FO}_{sent}$.

Proof For $ML^{FO} \subseteq Cr^{FO}_{sent}$, every $ML^{FO}$ formula is in $Cr^{FO}_{sent}$. As $w$ and $w'$ in Ex. 3 are FOL-path-bisimilar, and 22 is true at $w$, but not at $w'$, by Thm. 2 we have $ML^{FO} \neq Cr^{FO}_{sent}$. □

Thus propositional $Cr_{sent}$ is as expressive as ML (Thm. 1), but quantified $Cr^{FO}_{sent}$ is strictly more expressive than $ML^{FO}$ (Prop. 2). We will now connect those two results, showing why the addition of backwards-looking operators leads to greater expressivity over points only through greater expressivity over individuals.

We say that tuple $\bar{a}$ at $w$ in $M$ has property $\phi \in Cr^{FO}$ iff for some assignment $h$, $M, h, w \models \phi[\bar{a}]$. Formula $\phi$ defines the set of tuples of individuals which have property $\phi$.
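The claims about Ex. 3 and sentence 22 can be checked mechanically. The following Python sketch evaluates $Cr^{FO}$ formulas at pointed sequences, reading now as $then_0$; it assumes a single accessibility relation, possibilist quantification over the whole domain, and no identity in the language. The dictionary encoding and the names `holds`, `implies`, `kinds` and `s22` are illustrative assumptions.

```python
# Ex. 3 as data: successors of each point, extension of q at each point, and the domain.
M = {"R": {"w": ["u", "v"], "u": [], "v": []},
     "q": {"w": {"a", "b", "c"}, "u": {"a", "b", "c"}, "v": {"a", "b"}},
     "D": {"a", "b", "c", "d"}}
N = {"R": {"w'": ["u'", "v'"], "u'": [], "v'": []},
     "q": {"w'": {"a'", "b'", "c'"}, "u'": {"a'", "c'"}, "v'": {"a'", "b'"}},
     "D": {"a'", "b'", "c'", "d'"}}

def holds(model, f, h, rho, i):
    """Truth of a Cr^FO formula at <rho, i> under assignment h."""
    tag = f[0]
    if tag == "q":
        return h[f[1]] in model["q"][rho[i]]
    if tag == "not":
        return not holds(model, f[1], h, rho, i)
    if tag == "and":
        return holds(model, f[1], h, rho, i) and holds(model, f[2], h, rho, i)
    if tag == "dia":
        # Overwrite position i+1 with a successor of rho(i) and move there.
        return any(holds(model, f[1], h, rho[:i + 1] + (v,) + rho[i + 2:], i + 1)
                   for v in model["R"][rho[i]])
    if tag == "then":
        return holds(model, f[2], h, rho, f[1])
    if tag == "forall":
        return all(holds(model, f[2], {**h, f[1]: d}, rho, i) for d in model["D"])
    raise ValueError(tag)

def implies(f, g):
    return ("not", ("and", f, ("not", g)))

qx = ("q", "x")
# Sentence 22, with now read as then_0: dia forall x (now q(x) -> q(x))
s22 = ("dia", ("forall", "x", implies(("then", 0, qx), qx)))

def kinds(model, point):
    """Truth values of q(x), dia q(x), dia not-q(x) at `point`, for each individual."""
    tests = (qx, ("dia", qx), ("dia", ("not", qx)))
    return {tuple(holds(model, t, {"x": d}, (point, point), 0) for t in tests)
            for d in model["D"]}

true_at_w = holds(M, s22, {}, ("w", "w"), 0)
true_at_w_prime = holds(N, s22, {}, ("w'", "w'"), 0)
```

The function `kinds` reproduces the three-way classification of individuals given above for both models, while `true_at_w` and `true_at_w_prime` record the truth value of 22 at $w$ and at $w'$.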
Similarly we can talk about properties of points, tuples of points, and of ordered pairs of a tuple of points and a tuple of individuals.

As all formulas of $ML^{FO}$ are also formulas of $Cr^{FO}$, all properties expressible in $ML^{FO}$ are trivially expressible in $Cr^{FO}$. But in addition to those, $Cr^{FO}$ may also express properties of tuples of individuals relative to more than one point. For instance, the subformula $now(q(x)) \to q(x)$ of 22 defines a two-point property of individuals that are either $\neg q(x)$ at the now-point, or $q(x)$ at the current point. Using that property, we can distinguish $w$ and $w'$ from Ex. 3. If we start with $w$ as the now-point, it is possible to choose a $\Diamond$-accessible point so that the set of individuals defined by $now(q(x)) \to q(x)$ is equal to the whole domain of individuals. But if we start with $w'$ in $N$ as the now-point, there is no choice of a $\Diamond$-successor which would allow that. That is precisely why $Cr^{FO}$ may distinguish between $w$ and $w'$ with the formula 22.13

In fact, we can show that only such $then_k\phi$ may increase expressivity where $\phi$ contains free individual variables. That is demonstrated by Prop. 3, an easy generalization of Thm. 1.

Proposition 3 For $\xi \in Cr^{FO}_{sent}$, if every $then_k(\phi)$ in $\xi$ has no free individual variables, then $\xi$ has an equivalent $ML^{FO}$ formula.

Proof We adapt the global translation eliminating then-operators used in the second proof of Thm. 1. Recall that in the global then-eliminating translation, we exploited the fact that when the $\Diamond$ to which $then_k\phi$ with $\phi \in$ ML refers back introduces point $\rho(k)$, the truth of $then_k\phi$ depends only on that $\rho(k)$. Given that fact, we were able to translate $\ldots\Diamond(\psi(then_k\phi))\ldots$ into $\ldots\Diamond[(\phi \wedge \psi(\top)) \vee (\neg\phi \wedge \psi(\bot))]\ldots$ To adapt that translation to our case here, we only note that when $\phi \in ML^{FO}$ is a closed formula, the truth of $then_k\phi$ only depends on $\rho(k)$, just as in the propositional case. We can therefore apply the same procedure.
□

It is thus no coincidence that sentence 22, which we used to prove that $Cr^{FO}_{sent}$ is more expressive than $ML^{FO}$, included subformula $now(q(x))$ with a free individual variable. By Prop. 3, no sentence of $Cr^{FO}$ without such a subformula can express a meaning not expressible by $ML^{FO}$.

Though $Cr^{FO}_{sent}$ is more expressive than $ML^{FO}$, it is easy to see that for many pairs of models adding then-operators to the language does not allow us to distinguish models indistinguishable in $ML^{FO}$. For instance, in Ex. 4 $K$ is not distinguishable from $L$ in either $ML^{FO}$ or $Cr^{FO}_{sent}$ without identity, and $K$ and $M$ are not distinguishable by either even if identity is in the language. (Of course, if we consider non-sentences of Cr, there would be formulas satisfiable in $M$, but not in $K$.)

Example 4

[Figure: model $K$ with a single point $w$ ($q$: $a$); model $L$ with a single point $u$ ($q$: $a_1, a_2$); model $M$ with two points $v$ ($q$: $a_3$) and $v'$ ($q$: $a_4$).]

13 This informal characterization of the difference which introducing then-operators makes is similar to the one given by [Meyer, 2009].

I finish this section with a result underscoring the fact that the extra expressive power of $Cr^{FO}$ is a relatively mild addition to $ML^{FO}$: it turns out that for languages with identity, then can make a difference only in models with infinite domains.

Proposition 4 If the basic language has identity, and $ML^{FO}$ model $M$ has a finite individual domain, then for any $\xi \in Cr^{FO}_{sent}$ there exists $\xi' \in ML^{FO}$ such that for any $\rho$ and $h$, $M, h, \langle\rho, 0\rangle \models_{Cr^{FO}} \xi$ iff $M, h, \rho(0) \models_{ML^{FO}} \xi'$. (Note that $\xi'$ is relative to a specific $M$.)

Proof The proof is a modification of the global translation of Thm. 1 and Prop. 3. Suppose we fixed $then_k\phi(x)$ to be eliminated, where $\phi$ does not contain other then operators, and there is one free individual variable $x$. (We consider cases with more variables below.) When we consider $\ldots\Diamond\psi(then_k\phi(x))\ldots$ where $\Diamond$ is the one which $then_k$ refers back to, there are two possibilities: either $x$ within $\phi$ remains free in $\psi(then_k\phi(x))$, or it is bound within $\psi$. If $x$ remains free, $h(x)$ is not altered as we go down the formula from $\psi$ to $\phi$, and therefore we simply apply the global translation step as in Prop. 3.

The interesting case is when $x$ gets bound within $\psi$, namely when we are dealing with $\psi = (\ldots\forall x(\ldots then_k\phi(x))\ldots)$. At the $\langle\rho, k\rangle$ at which $\psi$ gets evaluated, the formula $\phi(x)$ will be true for some $x$ and false for others. When we use $then_k$, we can “refer back” to those truth values for $\phi(x)$ at $\langle\rho, k\rangle$. To get rid of $then_k$, we “record” the individuals that make $\phi$ true at $\langle\rho, k\rangle$ at the top level of $\psi$, and “export” variables referring to them for further use down in $\psi$. Then for the quantifier $\forall x$, instead of quantifying directly into $\phi(x)$, we provide two cases: one where the relevant individual is one of those that made $\phi$ true at $\langle\rho, k\rangle$, and the other where it is not.

Here is how we do this. Let the number of individuals in model $M$ be $n$. We define an abbreviation $\exists_{\phi,m,x}$ as standing for the following:

\[ \exists x_0 \ldots \exists x_{m-1}\big((x_0 \neq x_1 \wedge \ldots \wedge x_{m-2} \neq x_{m-1}) \wedge (\phi(x_0) \wedge \ldots \wedge \phi(x_{m-1})) \wedge \neg\exists x_m(x_m \neq x_0 \wedge \ldots \wedge x_m \neq x_{m-1} \wedge \phi(x_m))\big). \]

In words, $\exists_{\phi,m,x}$ records that there are exactly $m$ distinct individuals that make $\phi$ true at the current index, and exports those individuals for future use in variables $x_0, \ldots, x_{m-1}$. Now we translate our $\ldots\Diamond(\ldots\forall x : \ldots then_k\phi(x)\ldots)\ldots$ as follows:

\[ \ldots\Diamond\Big(\bigvee_{0 \le i \le n} \exists_{\phi,i,x}\Big[\ldots\forall x : \Big(\big(\bigvee_{0 \le j < i} x = x_j\big) \to \ldots\top\ldots\Big) \wedge \Big(\big(\bigwedge_{0 \le j < i} x \neq x_j\big) \to \ldots\bot\ldots\Big)\Big]\Big)\ldots \]

The quantifier $\forall x$ in our formula still checks the truth of its scope for each $x$. If $x$ is equal to one of the $x_j$s which made $\phi$ true at $\langle\rho, k\rangle$, we substitute $\top$ instead of $\phi(x)$. If $x$ does not belong to that group, we substitute $\bot$. In each case, we get exactly what we would have got if we evaluated $then_k\phi(x)$ in its original place, but without $then_k$.
However, we can only do that if we have a finite, and known, number of individuals in the model: otherwise the huge disjunction that we need to build would not be finite either.

For then_k φ with multiple free variables bound within ψ, the disjunctions get even more complex, as we need to “store” in new variables not just single individuals, but tuples that make φ true. I illustrate for the 2-variable case. Let our ψ be (...∀x...∀y... then_k φ(x, y)...). We define an abbreviation ∃_{φ,⟨m,l_0,...,l_{m−1}⟩,x,y}. Number m records the number of x for which there is some y with which they make φ true. Numbers l_0, ..., l_{m−1} record for each such x the exact number of ys that make φ true in a pair with that x. The abbreviated operator then is defined as follows:

\begin{align*}
&\exists x_0\, \exists y_{0,0} \ldots \exists y_{0,l_0} : \ldots \exists x_{m-1}\, \exists y_{(m-1),0} \ldots \exists y_{(m-1),l_{(m-1)}} : \\
&\quad (x_0 \ne x_1 \ldots) \wedge (y_{0,0} \ne y_{0,1} \ldots y_{0,(l_0-2)} \ne y_{0,(l_0-1)}) \wedge \ldots \\
&\quad \wedge (y_{(m-1),0} \ne y_{(m-1),1} \ldots y_{(m-1),(l_{(m-1)}-2)} \ne y_{(m-1),(l_{(m-1)}-1)}) \\
&\quad \wedge [\varphi(x_0, y_{0,0}) \wedge \ldots \wedge \varphi(x_0, y_{0,(l_0-1)}) \wedge \neg\exists y_{0,l_0} : y_{0,l_0} \ne y_{0,0} \wedge \ldots \wedge y_{0,l_0} \ne y_{0,(l_0-1)} \wedge \varphi(x_0, y_{0,l_0})] \wedge \ldots \\
&\quad \wedge \neg\exists x_m : (x_m \ne x_0 \ldots) \wedge \exists z : \varphi(x_m, z).
\end{align*}

What this operator does is record all and only pairs that make φ true as ⟨x_0, y_{0,0}⟩ ... ⟨x_0, y_{0,(l_0−1)}⟩ and so forth. We translate ...♦(...∀x : ...∀y... then_k φ(x, y)...)... as follows:

\[
\ldots\Diamond\Bigl(\bigvee_{0\le i,\,i_0,\ldots,i_{(i-1)}\le n} \exists_{\varphi,\langle i,i_0,\ldots,i_{(i-1)}\rangle,x,y} : \Bigl(\ldots\forall x : \bigwedge_{0\le j\le i}\Bigl(x = x_j \to \bigl(\ldots\forall y : \bigl((\bigvee_{0\le k\le i_j} y = y_{j,k}) \to \ldots\top\ldots\bigr) \wedge \bigl((\bigwedge_{0\le k\le i_j} y \ne y_{j,k}) \to \ldots\bot\ldots\bigr)\bigr)\Bigr) \wedge \Bigl((\bigwedge_{0\le j\le i} x \ne x_j) \to \ldots\forall y : (\ldots\bot\ldots)\Bigr)\Bigr)\Bigr)\ldots
\]

It should be clear how to modify the translation step for any particular number of variables in φ bound from within ψ. The resulting formula will be a daunting but truth-preserving substitute for the original ψ(... then_k φ...). ⊣

From Prop.
4 we derive a simple corollary that shows that then operators only change the expressivity (of the sentence fragment of the language) for models with infinite domains:

Corollary 1 If the language has identity, then finite FOL-path-bisimilar models M and N cannot be distinguished by any ξ ∈ Cr^FO_sent.

Proof Suppose there is such ξ which is true at M, w, but false at N, w′, with w bisimilar to w′. Let n be the cardinality of the greater of D_M and D_N. By Prop. 4, we can build ξ′ ∈ ML^FO equivalent to ξ in M and N. But from Thm. 2, there can be no such ξ′. ⊣

What about infinite models and Cr^FO with identity? Ex. 5 shows that with an infinite number of individuals, Cr^FO_sent is more expressive than ML^FO.

Example 5 FOL-path-bisimilar models for ML^FO with identity that can be distinguished by Cr^FO_sent
[Figure: two models. M: point w with an arrow to point u; at both w and u, q: a0, a1, ... and ¬q: b0, b1, .... N: point w′ with arrows to points u′0 and u′1; at w′, q: c0, ..., d0, ... and ¬q: e0, ...; at u′0, q: c0, ... and ¬q: d0, ..., e0, ...; at u′1, q: d0, ... and ¬q: c0, ..., e0, ....]

At both w and u in M, there is an infinite number of a-s being q, and of b-s being ¬q. Thus all individuals that are q at w are also q at u. At w′, there are three infinite sets of individuals: c-s, d-s and e-s. All c-s and d-s make ♦q(x) true, but there is no point accessible from w′ where both c-s and d-s are q at the same time. But as the extensions of properties q and ¬q at u′0 and u′1 are infinite, we cannot register the difference between u and u′0 or u′1 using ML^FO: those points are FOL-path-bisimilar. At the same time, the familiar Cr^FO formula 22, namely ♦∀x(now(q(x)) → q(x)), is true at w, but false at FOL-path-bisimilar w′.
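How formula 22 separates the two models of Ex. 5 can be sketched on a finite truncation of those models. The truncation is an assumption made only so that the check is executable: the models of Ex. 5 have infinite domains, and by Cor. 1 finite models like the ones below cannot be FOL-path-bisimilar once identity is available, so the sketch illustrates how now works in the formula, not the inexpressibility claim itself. The encoding of models as Python dictionaries and the names `formula_22`, `u0` and `u1` are hypothetical.

```python
# Finite truncation of Ex. 5 (an assumption; the paper's models are infinite).
# A model is a pair (R, q): R maps each point to the set of points accessible
# from it, and q maps each point to the set of individuals that are q there.

def formula_22(model, domain, start):
    """Evaluate <>forall x (now(q(x)) -> q(x)) at the point `start`.

    `now` returns evaluation to `start`, the point where evaluation began,
    so the formula asks for an accessible point u at which every individual
    that is q at `start` is also q at u.
    """
    R, q = model
    return any(
        all((x not in q[start]) or (x in q[u]) for x in domain)
        for u in R[start]
    )

# M: w sees u, and the same individuals are q at both points.
dom_M = {"a0", "a1", "b0", "b1"}
M = ({"w": {"u"}, "u": set()},
     {"w": {"a0", "a1"}, "u": {"a0", "a1"}})

# N: w' sees u0 and u1; c-s and d-s are q at w', but never together
# at a single accessible point.
dom_N = {"c0", "d0", "e0"}
N = ({"w'": {"u0", "u1"}, "u0": set(), "u1": set()},
     {"w'": {"c0", "d0"}, "u0": {"c0"}, "u1": {"d0"}})

result_M = formula_22(M, dom_M, "w")
result_N = formula_22(N, dom_N, "w'")
print(result_M, result_N)  # prints: True False
```

The inner `all` plays the role of ∀x, and the lookup `q[start]` is where now reaches back to the initial point while the rest of the scope is evaluated at the accessible point u.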
6 Cr languages and hybrid languages

The Cr languages with then operators that we introduced are close cousins to hybrid languages that have received considerable attention in the literature since the early 1990s. Basic hybrid language HL is the basic modal language enriched with nominals: propositional variables of a special sort that may only be true at a single point in any model. Nominals i, j, ... can be used as terms, just as propositional variables are, or they can be bound by different hybrid operators. We provide below the syntax and semantics for language HL(@, ↓). Languages HL(@) and HL(↓) feature only one of the hybrid operators defined below. In addition, we will use ML + @ + ↓ to refer to the language that is like HL(@, ↓) except that nominals do not occur as atoms in its formulas. For an introduction to these and other hybrid languages, see [Blackburn and Seligman, 1995], a.o.

Definition 9 (The syntax of HL(@, ↓)) For PROP a set of propositional variables, and NOM a set of nominal variables, and i ∈ NOM, the wffs of HL(@, ↓) are:

φ := PROP | NOM | ⊤ | ¬φ | φ ∧ ψ | ♦φ | @i.φ | ↓i.φ

Formulas of HL(@, ↓) are evaluated in a Kripke model M at a point w relative to an assignment g of points to nominal variables. The nominal assignment function g may be viewed as a storage device for references to points: ↓i stores the current point as the value of variable i; @i retrieves the value recorded in i from the storage to evaluate the argument formula at that value. An atomic occurrence of i tests whether the current point is the one stored in i. We say that ↓i binds the occurrences of @i and i in its scope, and that non-bound occurrences are free. An HL(@, ↓) formula is a sentence iff its truth does not depend on g, which happens exactly when there are no free occurrences of @i or i.

Definition 10 (The semantics of HL(@, ↓)) As usual, g ∼_i g′ iff for any j ≠ i we have g(j) = g′(j).
M, g, w |= q iff w ∈ V(q)
M, g, w |= i iff w = g(i)
M, g, w |= ⊤ always
M, g, w |= ¬φ iff it is not the case that M, g, w |= φ
M, g, w |= φ ∧ ψ iff M, g, w |= φ and M, g, w |= ψ
M, g, w |= ♦φ iff there is w′ s.t. wRw′ and M, g, w′ |= φ
M, g, w |= ↓i.φ iff for g′ s.t. g′ ∼_i g and g′(i) = w, M, g′, w |= φ
M, g, w |= @i.φ iff M, g, g(i) |= φ

It is easy to define polynomial truth-preserving translations between Cr and ML + @ + ↓. From Cr to ML + @ + ↓, we only need to add ↓i_k in the scope of each ♦, and replace then_k with @i_k referring back to the appropriate ↓. For example:

♦♦(p ∧ then_1 q) ⟼ ↓i_0.♦↓i_1.♦(p ∧ @i_1.q)

In the other direction, we can replace every @i with then_i referring back to the ♦ in whose immediate scope the binder ↓i occurred. For example:

♦↓i.(♦p) ∨ ♦(q ∧ @i.q) ⟼ ♦((♦p) ∨ ♦(q ∧ then_1 q))

Thus Cr and ML + @ + ↓ are essentially notational variants. What distinguishes all full-fledged HL languages from Cr are atomic occurrences of nominals. We will now briefly discuss how the expressive power of HL(@), HL(↓) and HL(@, ↓) relates to that of Cr.

First, Cr is clearly not more expressive than any HL language: Cr has no formulas that have to be true at exactly one point in any model. So Cr may be either strictly weaker than or expressively incomparable with any of the HL languages. Moreover, as the sentence fragment of Cr is equivalent to ML, which is the underlying language for all HL languages, Cr_sent is strictly less expressive than any full HL language. At the same time, the sentence fragments of HL(@) and HL(↓) also collapse to ML.

HL(↓) and Cr = ML + @ + ↓ are mutually expressively incomparable. HL(↓) can store points; it can also test, with nominals used as atoms, whether we are at a point that we previously stored. But HL(↓) cannot return the evaluation to a previously stored point. As a result, it cannot express a formula like p ∧ then_1 q.

As for HL(@), it is strictly more expressive than Cr.
All free then_k operators may be replaced by corresponding @i_k operators, but in addition to those, HL(@) has atomic nominals.

HL(@, ↓) is more expressive than HL(@), and therefore than Cr. Moreover, even the sentence fragment of HL(@, ↓) is more expressive than the sentence fragment Cr_sent: sentence ↓i.¬♦i of HL(@, ↓) defines R-irreflexivity of the current point, and as any sentence of Cr_sent is equivalent to a formula of ML, and no ML formula can define R-irreflexivity, the sentence fragment of HL(@, ↓) is strictly more expressive than Cr_sent. (Cf. bisimilar, and therefore ML-indistinguishable, models from Ex. 1, only one of which has irreflexive points.)

So what is the place of Cr in the family of modal/hybrid languages? As [Areces et al., 2001] show, HL(@, ↓) is an important system, being the logic for meanings invariant under taking generated submodels: any FOL formula in that class is equivalent to a formula of HL(@, ↓). The FOL fragment corresponding to HL(@, ↓) is the bounded fragment, where the domain of quantification over points is restricted to R-accessible points. Our simple analysis above shows that Cr is one of the less expressive inhabitants of the group of modal/hybrid languages invariant over generated submodels. Figure 1 is the “expressivity map” of that group. In the diagram, A → B means that A is strictly less expressive than B.

[Figure: diagram with nodes ML, Cr = ML + @ + ↓, HL(@), HL(↓) and HL(@, ↓), connected by arrows.]
Fig. 1 Cr and its hybrid cousins.

And if we look only at the sentence fragments of the relevant languages (i.e. without unbound then, @i and i), Cr_sent, HL(@)_sent and HL(↓)_sent all collapse to the regular ML, while HL(@, ↓)_sent remains more expressive.
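The storage-and-retrieval reading of ↓ and @ in Def. 10 can be made concrete with a small evaluator. This is a minimal sketch under stated assumptions, not part of the formal development: the tuple encoding of formulas, the function name `holds`, and the three toy models are all hypothetical. The sentence checked for irreflexivity is ↓i.¬♦i, the standard hybrid way of saying that the current point does not see itself.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("prop", "p"), ("nom", "i"), ("top",), ("not", f), ("and", f, g),
#   ("dia", f), ("down", "i", f), ("at", "i", f)
# A model is a pair (R, V): R maps points to sets of accessible points,
# V maps propositional variables to the sets of points where they hold.

def holds(model, g, w, f):
    """Return True iff M, g, w |= f, following the clauses of Def. 10.

    Free nominals must already have a value in the assignment g.
    """
    R, V = model
    tag = f[0]
    if tag == "prop":
        return w in V.get(f[1], set())
    if tag == "nom":
        return g[f[1]] == w
    if tag == "top":
        return True
    if tag == "not":
        return not holds(model, g, w, f[1])
    if tag == "and":
        return holds(model, g, w, f[1]) and holds(model, g, w, f[2])
    if tag == "dia":
        return any(holds(model, g, v, f[1]) for v in R.get(w, set()))
    if tag == "down":  # store the current point in the nominal
        return holds(model, {**g, f[1]: w}, w, f[2])
    if tag == "at":    # jump back to the stored point
        return holds(model, g, g[f[1]], f[2])
    raise ValueError(f"unknown connective: {tag}")

# Two bisimilar models without propositional variables: one reflexive
# point versus a two-point cycle without reflexive points.
reflexive_model = ({"w": {"w"}}, {})
cycle_model = ({"a": {"b"}, "b": {"a"}}, {})
irreflexive = ("down", "i", ("not", ("dia", ("nom", "i"))))

# The ML + @ + down-arrow translation of <><>(p and then_1 q) from the text.
translated = ("down", "i0", ("dia", ("down", "i1",
              ("dia", ("and", ("prop", "p"), ("at", "i1", ("prop", "q")))))))
chain_model = ({"w0": {"w1"}, "w1": {"w2"}}, {"p": {"w2"}, "q": {"w1"}})

print(holds(reflexive_model, {}, "w", irreflexive))  # False
print(holds(cycle_model, {}, "a", irreflexive))      # True
print(holds(chain_model, {}, "w0", translated))      # True
```

The first two checks need the atomic nominal inside ♦, which is exactly what Cr lacks; the third uses only ↓ and @ and therefore stays within ML + @ + ↓.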
7 Philosophical and linguistic consequences

Let me sum up the main formal results we derived above:

– When added to propositional modal logic, now and then operators that are genuinely backwards-looking do not add any extra expressive power.14
– When added to first-order modal logic, now and then do bring in extra expressivity.
– However, if we have identity of individuals in the logical language, the extra expressivity only manifests itself in infinite models.
– Finally, the amount of extra expressivity is tiny: basic modal logic with backwards-looking operators is a particularly mild case of a hybrid logic invariant over generated submodels. Importantly, even the most expressive such logic, namely HL(@, ↓), is vastly less expressive than full many-sortal FOL with explicit quantification over worlds and times.

14 When not genuinely backwards-looking, those operators essentially act as unbound hybrid @i operators.

If the actual amount of expressivity added by now and then is so tiny, then why do so many linguists and philosophers assume that such operators amount to explicit quantification over worlds? This widespread erroneous assumption is due to a misreading of [Cresswell, 1990], which we will now discuss.

Here is how the system of [Cresswell, 1990] is set up. Its formulas are evaluated at sequences of points. The initial member of the sequence is always used as the current point, and operators Ref_k and then_k respectively store the current point in, and retrieve it from, the k-th member of the sequence. Cresswell’s Ref_k is thus essentially hybrid ↓i_k (which we introduced in the previous section), and Cresswell’s then_k is @i_k. So if we add Cresswell’s operators to the basic modal language, we get ML + ↓ + @, which, as we have shown above, is a notational variant of our Cr.

Cresswell famously argues that when we add “now”, “then” and “actually” operators to the language, that amounts to introducing “explicit quantification” over worlds and times.
By that, he means that his system with Ref_k and then_k is as expressive as the corresponding many-sortal FOL language. But we have just seen that ML + ↓ + @ and Cr are even less expressive than HL(@, ↓), which itself is equivalent to the bounded fragment of the correspondence FOL language, not the whole thing! How come? Is our expressivity analysis wrong, or is Cresswell’s?

In fact, both analyses are correct. They are not contradictory. The secret is in the underlying language. We added then operators to the very basic underlying language, namely ML. In contrast to that, Cresswell uses a richer basic language: he includes in it universal modality A, which he writes as □. He defines □φ to be true when φ is true at every point in the model, without any regard for R-accessibility (see his (15) on p. 8). In addition to universal modality, Cresswell also uses operator L for the familiar, R-restricted □ of ML.

But universal modality is a very powerful operator in the hybrid family, as discussed, a.o., by [Goranko and Passy, 1992], [Blackburn and Seligman, 1995], [Blackburn and Seligman, 1998]. Having universal modality A together with ↓ and @ (in other words, Cresswell’s Ref and then), we can easily define unrestricted quantification over points, which indeed brings us all the way up to the full expressivity of FOL: ∀wφ can be defined as ↓j.A(↓i.@j.φ). But importantly, “now” and “then” operators alone are not capable of that. It is the universal modality A that is essential for making the language not invariant under generated submodels.

Cresswell himself does not stress that his claim is only valid for a language with A. Unfortunately, that led to serious misinterpretation of his result (possibly because later researchers did not realize that Cresswell’s □ stands for A, and not for the usual modal □).15 To give a typical example from the philosophical literature, [Recanati, 2007, pp.
61-62] writes: “It has been established that the full expressive power of first-order logic is needed to deal with natural-language tenses. (See e.g. van Benthem 1977; Cresswell 1990: chs. 2-4.)”16 It is striking that Recanati’s remark is made within a discussion of whether to use an ML-based or FOL-based system for translating natural language. Misinterpreting Cresswell’s claim leads to the adoption of a wrong assumption crucial for Recanati’s argument. That argument undoubtedly would have looked different had it been clear that simply adding “now” and “then” does not automatically result in full FOL expressivity.

15 [Meyer, 2009] is another philosophical take on the problem of expressivity of “now” and “then” operators. Meyer (p. 229) aims to show “that now and then are eliminable in quantified tense logic, provided we endow it with enough quantificational structure.” What he means by “enough quantificational structure” is including in the language a set membership predicate and existential quantification over sets. With that much, we can easily express Russell’s paradox: ∃s(s ∉ s). Even if one does not worry about paradoxes, by adding quantification over sets we become able to express second-order properties (e.g., distinguish the standard model of Peano arithmetic from non-standard models), thus going a long way beyond the expressive resources of FOL. Using set theory, Meyer can indeed eliminate “now” and “then” operators. One wonders, however, if such a solution “deals with the problem in the most natural way”, as Meyer puts it (p. 242). As we have seen, “now” and “then” operators actually make the basic modal language only slightly more expressive, so it is hardly surprising that by going as far as second-order expressivity we become able to eliminate them. Both Cresswell and Meyer aim to trivialize the contribution of “now” and “then” operators: Cresswell argues that such operators increase the expressive power of the language up to that of FOL, and Meyer argues that in a language that can express second-order properties, “now” and “then” are redundant anyway. The framework we developed in this paper allows us to understand the actual contribution of such operators without trying to fully reduce them to some familiar, and vastly more expressive, system. Our model-theoretic analysis allows us to distinguish cases where the addition of then makes a difference from cases where it doesn’t.

16 The actual position of [van Benthem, 1977] is in fact nothing of the sort. As van Benthem himself puts it (p. 436): “From a technical point of view, tense logics could be considered to be sublogics of predicate logic. <...> But, as tense logics become stronger and stronger (containing ever more exotic operators), predicate logic itself becomes a serious rival as regards elegance and simplicity” (emphasis mine).

In the linguistic literature, the misinterpretation of Cresswell’s claim also led to widespread adoption of the belief that the full power of FOL is absolutely required to account for natural language modal operators and tenses. This is all the more striking given that linguists rarely assume that natural language has true unrestricted quantification, without which Cresswell’s result becomes inapplicable. For instance, [Schlenker, 2003] writes (p. 99) that “the full power of quantification over times and worlds is needed to analyze temporal and modal talk in English”, citing [Cresswell, 1990]. At the same time, on the very next pages Schlenker notes (pp. 100-101) that it is hardly possible to find natural language expressions that would express unrestricted quantification.

From this brief discussion, it should be clear how much damage the misinterpretation of Cresswell’s claim has done within both philosophy of language and formal semantics.
It should be stressed that the particular examples of [Recanati, 2007] and [Schlenker, 2003] have been chosen not as the worst, but as some of the best and most explicit attempts to reach a better understanding of the modal and temporal expressivity of natural language. The misconstrued version of Cresswell’s claim is a part of the two fields’ folklore by now, and should not be attributed to any particular author personally. I hope that this paper will make it easier to finally correct that mistake, and replace folklore with solid logical arguments.

To sum up, though Cresswell’s claim is valid for the particular very expressive language he considers, it is invalid in the general case. Until one proves that natural language requires unrestricted quantification over times or worlds, one should not appeal to Cresswell’s claim. Moreover, if we consider more restricted underlying languages than Cresswell’s, the amount of expressive power added by “now”, “then” and “actually” turns out to be pretty mild, as we saw in Sections 5 and 6.

What are then the practical consequences of correcting this misreading of Cresswell for a linguist or philosopher of language? Contrary to widespread beliefs, systems with backwards-looking operators turn out to be genuinely less expressive, and therefore more restrictive and predictive, than systems with explicit quantification over worlds and times. That in turn means that unless we find new arguments for adopting explicit-quantification systems, we might want to try to live within the means of operator-based ones. That, of course, is not to say that operator-based systems are inherently better: after all, their expressive power ultimately depends on the kind of operators they feature, and there are indeed sets of modal operators that make the system as expressive as many-sortal FOL (Cresswell’s system with universal modality and what we would now call hybrid ↓ and @ is one such example).
But systems which only feature backwards-looking operators are in fact very mild expressively. Thus when we choose between analyses of natural language sentences as in 23 on the one hand and 24 on the other (repeated here from 8 and 12), expressive power should be taken into account.

(23) [[everyone now alive will be dead]] = F(∀x : now(alive(x)) → dead(x))
(24) [[everyone now alive will be dead]] = ∃t1 t0 (∀x : (alive(x)(t0) → dead(x)(t1)))

Similarly, when we are discussing whether there exist syntactically represented covert variables over times and worlds, we should consider not just the fact that there is currently no syntactic evidence for their existence, but also that on top of that the same interpretational work may be performed by backwards-looking operators. And unlike the explicit-quantification systems, operator-based ones may be defined very restrictively from the start, closely matching the actual expressivity of natural language that has been observed.

Thus we should pay attention to expressive power not as a matter of some axiom, but simply because when a formal language is less expressive, it is more predictive. And as good practice within linguistics and philosophy dictates, other things being equal, more predictive and restrictive systems are to be preferred. The contribution of the current paper to the debate is then that we have been able to pin down exactly how much more restrictive and predictive a system with now and then is than systems with explicit quantification over worlds and times.

References

[Areces et al., 2001] Areces, C., Blackburn, P., and Marx, M. (2001). Hybrid logics: Characterization, interpolation and complexity. The Journal of Symbolic Logic, 66(3):977–1010.
[Areces et al., 2011] Areces, C., Figueira, D., Figueira, S., and Mera, S. (2011). The expressive power of memory logics. Review of Symbolic Logic, 4(2):290–318.
[Blackburn et al., 2001] Blackburn, P., de Rijke, M., and Venema, Y. (2001). Modal Logic, volume 53 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press.
[Blackburn and Seligman, 1995] Blackburn, P. and Seligman, J. (1995). Hybrid languages. Journal of Logic, Language and Information, 4:251–272.
[Blackburn and Seligman, 1998] Blackburn, P. and Seligman, J. (1998). What are hybrid languages? In Kracht, M., de Rijke, M., Wansing, H., and Zakharyaschev, M., editors, Advances in Modal Logic, volume 1, pages 41–62. CSLI Publications, Stanford.
[Blackburn and van Benthem, 2007] Blackburn, P. and van Benthem, J. (2007). Modal logic: a semantic perspective. In [Blackburn et al., 2007], chapter 1. Elsevier.
[Blackburn et al., 2007] Blackburn, P., van Benthem, J. F., and Wolter, F., editors (2007). Handbook of modal logic, volume 3 of Studies in logic and practical reasoning. Elsevier.
[Cresswell, 1990] Cresswell, M. (1990). Entities and Indices. Kluwer, Dordrecht.
[Cresswell, 1991] Cresswell, M. (1991). In defense of the Barcan formula. Logique et Analyse, 135-136:271–282.
[Fara, 2008] Fara, D. G. (2008). Relative-sameness counterpart theory. The Review of Symbolic Logic, 1(2):167–189.
[Fitting and Mendelsohn, 1998] Fitting, M. and Mendelsohn, R. L. (1998). First-order modal logic, volume 277 of Synthese library. Kluwer, Dordrecht.
[Gabbay, 1981] Gabbay, D. M. (1981). An irreflexivity lemma with applications to axiomatizations of conditions on linear frames. In Mönnich, U., editor, Aspects of Philosophical Logic, pages 67–89. Reidel, Dordrecht.
[Goranko and Passy, 1992] Goranko, V. and Passy, S. (1992). Using the universal modality: Gains and questions. Journal of Logic and Computation, 2(1):5–30.
[Grädel and Otto, 1999] Grädel, E. and Otto, M. (1999). On logics with two variables. Theoretical Computer Science, 224:73–113.
[Kamp, 1971] Kamp, H. (1971). Formal properties of “now”. Theoria, 37:227–273.
[Lewis, 1968] Lewis, D. K. (1968). Counterpart theory and quantified modal logic. Journal of Philosophy, 65(5):113–126.
[Meyer, 2009] Meyer, U. (2009). ‘now’ and ‘then’ in tense logic. Journal of Philosophical Logic, 38(2):229–247.
[Percus, 2000] Percus, O. (2000). Constraints on some other variables in syntax. Natural Language Semantics, 8:173–229.
[Recanati, 2007] Recanati, F. (2007). Perspectival Thought: A Plea for (Moderate) Relativism. Oxford University Press.
[Saarinen, 1978] Saarinen, E. (1978). Backward-looking operators in tense logic and in natural language. In Hintikka, J., Niiniluoto, I., and Saarinen, E., editors, Essays on Mathematical and Philosophical Logic, pages 341–367. Reidel, Dordrecht.
[Schlenker, 2003] Schlenker, P. (2003). A plea for monsters. Linguistics and Philosophy, 26:29–120.
[ten Cate, 2005] ten Cate, B. D. (2005). Model theory for extended modal languages. PhD thesis, ILLC, University of Amsterdam.
[van Benthem, 1977] van Benthem, J. F. (1977). Tense logic and standard logic. Logique et Analyse, 20:41–83.
[Verkuyl, 2008] Verkuyl, H. (2008). Binary tense, volume 187 of CSLI lecture notes. CSLI Publications.