First-Order Logic - Carnegie Mellon School of Computer Science

Outline
2
First-Order Logic
Klaus Sutner
1
First-Order Logic
2
Tarski and Truth
3
Model Checking
Carnegie Mellon University
Fall 2014
Battleplan
3
A Suitable Logic
4
So how do we construct a language suitable for (some part of) mathematics
and theoretical computer science? Let’s start with a small fragment, say,
arithmetic. A reasonable approach would be to use only
Our next goal is to use finite state machines to solve certain decision problems.
More precisely, we want to describe structures by finite state machines and
then find algorithms that answer queries about these structures.
basic arithmetic concepts (addition, multiplication, order) and
purely logical constructs such as “and”, “not”, “there is” etc.
To this end we need to define a logic that can be used to express the queries.
So we need a syntax for the formulae and a clear definition of the semantics.
The logical constructs will include all of propositional logic so we can still
combine assertions by connectives “and”, “or” and the like. But there will also
be quantifiers that allow one to make statements about the existence of objects
with certain properties and about properties of all objects.
We are not much interested in proof theory at this point, our algorithms have
nothing to do with theorem proofing.
This type of system is called first-order logic (FOL), also called predicate logic
(but note that we allow for functions).
Expressing Things
As an example, take the following well-known result in number theory.
Theorem (Bertrand-Chebyshev)
For all n ≥ 2, there is a prime between n and 2n.
Incidentally, this result is of importance for algorithms that need to select
primes of some suitable size: there is a prime between 2k and 2k+1 , so we can
select a prime with any given number of bits (this is just the tip of an iceberg).
How could we state this assertion precisely, as a formal expression that might
be computer-understandable?
5
Pseudo-Formal
6
As a warm-up, here is a pseudo-formal version of the theorem.
for all x
- universal quantifier
2≤x
- binary relation
there exists a y
- existential quantifier
x<y
- binary relation
and
- logical connective
y < 2x
- binary relation
and
- logical connective
y is prime
- unary relation
The assertion “is prime” in the last line needs to be unraveled in a similar way.
Formal
7
Using the standard syntax that you are already familiar with,
Bertrand-Chebyshev can now be written like so:
Near the Abyss
8
Note that it is easy to modify this kind of formula to express much more
problematic assertions:
∀ x ≥ 2 ∃ y (x < y ∧ y < 2 ∗ x ∧ prime(y))
∀ x ∃ y (x < y ∧ prime(y) ∧ prime(y + 2))
prime(y) ≡ y > 1 ∧ ∀ u, v (u ∗ v = y ⇒ u = 1 ∨ v = 1)
This means that there are infinitely many prime-twins, a famous and wide open
problem in number theory.
Clearly machine-readable and not too bad for humans, either.
Designing a Language
9
This and similar examples suggest a language comprised of the following pieces:
constants
that denote individual objects,
variables
that range over individual objects,
relation symbols
that denote relations,
function symbols
that denote functions,
logical connectives
“and,” “or,” “not,” “implies,” . . .
existential quantifiers
that express “there exists,”
universal quantifiers
that express “for all.”
Notation
a, b, c, . . . for constants,
x, y, z, . . . for variables,
∨, ∧, ¬, ⇒ for the logical connectives,
∃ for the existential quantifier,
The Range of Variables
Note that our number theory examples from above tacitly assume that we
interpret the variables as ranging over the integers.
If we were dealing with the rationals or reals, the definition of primality would,
of course, no longer make much sense.
A formula alone is not enought, we also need to determine a domain of
discourse that pins down the range of quantifiers. We’ll see in a moment
precisely how to do this when we address the question of semantics: what
exactly is the meaning of a formula in FOL?
11
Signatures and Arities
One should distinguish more carefully between function and relation symbols of
different arity: nullary, unary, binary and so on.
In most concrete structures only a finite number of function and relation
symbols are needed.
Hence, we can convey the structure of the non-logical part of the language in a
signature or similarity type: a list that indicates the arities of all the function
and relation symbols.
∀ for the universal quantifier,
f , g, h, . . . for function symbols,
R, P , Q, . . . for relation symbols.
Always allow = for equality.
These are just conventions, there are no sacred cows here. E.g., we might also
use subscripted symbols and additional connectives like ⇔ .
10
Example
For group theory one often uses a signature (2, 1, 0): there is one binary
function symbol for the group multiplication, one unary function symbol for the
inverse operation and a nullary function symbol (i.e. a constant) for the neutral
element.
We could also use a language of signature (2) but at the cost of slightly more
complicated axioms.
12
Example: Fields
13
Some Formulae
14
Now consider formulae such as
For the classical algebraic theory of fields one minimally uses a language
∀ x, y (x + y = y + x)
L(+, ·, 0, 1) of signature (2, 2, 0, 0)
∀ x (x · 0 = 0)
∀ x ∃ z ((¬x = 0) ⇒ z · x = 1)
binary function symbols + and · for addition and multiplication,
1+1=0
constants 0 and 1 for the respective neutral elements.
It is intuitively clear what these formulae mean.
Of course, more functions could be added: additive inverse, subtraction,
multiplicative inverse, division. Introducing only addition and multiplication is
the minimalist approach here.
Except for the last one, they are all true in any field. In fact, the first two hold
in any ring.
The third one is true in a field, but is false in an arbitrary ring.
And the last only holds in rings of characteristic 2. In fact, it defines rings of
characteristic 2.
Example: Boolean Algebra
15
Example: Arithmetic
16
For arithmetic it is convenient to have a relation symbol for order:
In Boolean algebra one uses the language
L(+, ·, 0, 1; <) of signature (2, 2, 0, 0; 2)
L(⊔, ⊓, , 0, 1) of signature (2, 2, 1, 0, 0)
So we have
binary function symbols + and · (for integer addition and multiplication)
binary function symbols ⊔ and ⊓ (for join and meet)
constants 0 and 1 (for integers 0 and 1)
unary function symbol (for complement)
a binary relation symbol < (for the less-than relation)
constants 0 and 1
Assuming we are quantifying over the natural numbers we have assertions like
Some formulae:
∀ x ∃ y (x < y)
∀ x, y ∃ z (x ⊔ y = z)
∃ x ∀ y (x = y ∨ x < y)
∃ x ∀ y (x ⊓ y = 0)
∀ x (x + 0 = x)
∀ x ∃ y (x ⊔ y = 1 ∧ x ⊓ y = 0)
More Syntax
While we are not interested in writing a parser or theorem prover, we still need
to be a bit more careful about the syntax of our language of FOL. The
components of a formula can be organized into a taxonomy like so:
variables, constants and terms,
equations,
atomic formulae,
propositional connectives, and
quantifiers.
Every programming language has a defining report (which no one ever reads,
other than perhaps compiler writers, it’s the document that uses the imperative
“shall” a lot), so think of this as the defining report for first-order logic.
Intuitively, these are all true.
17
Graded Alphabet
18
We need a supply of variables Var as well as function symbols and predicate
symbols. These form a graded alphabet Σ = Σ0 ∪ Σ1 where every function
symbol and relation symbol has a fixed number of arguments, determined by a
arity map:
ar : Σ → N
We write L(Σ) for the language constructed from signature Σ.
Function symbols of arity 0 are constants and relation symbols of arity 0 are
Boolean values (true or false) and will always be written ⊤ and ⊥.
In any concrete application, Σ will be finite. However, in the literature you will
often find a big system approach that introduces a countable supply of function
and relation symbols for each arity. For our purposes a signature
custom-designed for a particular application is more helpful.
Syntax: Terms
19
Definition
Syntax: Atomic Formulae
20
Given a few terms, we can apply a predicate to get a basic assertion (like
x + 4 < y).
The set T = T (Σ) = T (Var, Σ) of all terms is defined by
Every variable is a term.
Definition
If f is an n-ary function symbol, and t1 , . . . , tn are terms, n ≥ 0, then
f (t1 , . . . , tn ) is also a term.
An atomic formula is an expression of the form
A ground term is a term that contains no variables.
R(t1 , . . . , tn )
where R is an n-ary relation symbol, and the t1 , . . . , tn are terms.
Note that f () is a term for each constant (0-ary function symbol) f . For
clarity, we write a, b, c and the like for constants.
These are basically the atomic assertions in propositional logic: once we have
values for the variables that might appear in the terms, an atomic formula can
be evaluated to true or false.
The idea is that every ground term corresponds to a specific element in the
underlying structure. For arbitrary terms we first have to replace all variables
by constants. For example, in arithmetic the term (1 + 1 + 1) · (1 + 1)
corresponds to the natural number 6. We could introduce constants for all the
natural numbers, but there is no need to do so: we can build a corresponding
term from the constant 1 and the binary operation +.
Equality
But note that we do need bindings for the variables, R(x, y) per se has no
truth value.
21
Formulae
22
Definition
Equality plays a special role in our setup. We will assume that the language has
a special binary relation symbol = that is used to denote equality:
The set of formulae of FOL is defined by
Every atomic formula is a is a formula.
s=t
If ϕ and ψ are formulae, so are (¬ϕ), (ϕ ∧ ψ), (ϕ ∨ ψ), and (ϕ ⇒ ψ).
where s and t are arbitrary terms has the intended meaning: the two terms
denote the same object.
If ϕ is a formula and x a variable, then (∃ x ϕ) and (∀ x ϕ) are also
formulae.
Unlike with all the other relation symbols, the meaning of = is fixed once and
for all: it is always interpreted as equality.
So these are compound formulae versus the atomic formulae from above.
We will always assume that equality is part of the language, thought FOL
without equality is also of interest. For example, a concrete application may
not use any function symbols whatsoever and only deal with relations.
The ϕ in (∃ x ϕ) and (∀ x ϕ) is called the matrix of the formula.
Note that a term by itself is not a formula: it denotes an element (if it has no
free variables) rather than a truth value.
Preserving Sanity
23
Since we are not compilers, we omit unnecessary parentheses and write
formulae such as
∀ x (R(x) ⇒ ∃ y S(x, y))
The intended meaning is: “For any x, if x has property R, then there exists a y
such that x and y are related by S.”
Binary relation and function symbols are often written in infix notation, using
standard mathematical symbols: As usual, one often uses infix notation for
terms and atomic formulae: x + y < z is easier to read than <(+(x, y), z).
One often contracts quantifiers of the same kind into one block.
∀ x ∀ y ∃ z (= (+(x, z), y))
Sloppy, but eminently readable, style
∀ x, y ∃ z (x + z = y)
Sentences
24
Definition
A variable that is not in the range of a quantifier is free or unbound (as
opposed to bound). A formula without free variables is closed, or a sentence.
One often indicates free variables like so:
ϕ(x, y)
∃ x ϕ(x, y)
∀ y ∃ x ϕ(x, y)
x and y are free
only y is free
closed.
Substitutions of terms for free variables are indicated like so:
ϕ(s, t)
replace x by s, y by t
ϕ[s/x, t/y]
replace x by s, y by t
We will explain shortly how to compute the truth value of a sentence.
Interpreting Free Variables
25
However, taking inspiration from equational logic, it is convenient to define the
truth value a formula with free variables to be the same as its universal closure:
put universal quantifiers in front, one for each free variable.
is somewhat less elegant and harder to read than
Better: rename the x inside
∀ x R(x) ∧ ∃ z ∀ y S(z, y)
x ∗ (y ∗ z) = (x ∗ y) ∗ z
Getting Serious
26
Note that according to our definition it is perfectly agreeable to quantify over
an already bound variable. To make the formula legible one then has to rename
variables. For practical reasons it is best to simply disallow clashes between free
and bound variables.
∀ x R(x) ∧ ∃ x ∀ y S(x, y)
A priori, a formula ϕ(x) with a free variable x has no truth value associated
with it: we need to replace x by a ground term to be able to evaluate.
∀ x, y, z x ∗ (y ∗ z) = (x ∗ y) ∗ z
Clashing Variables
These issues are very similar to problems that arise in programming languages
(global and local variables, scoping issues). They need to be addressed but are
not of central importance to us.
27
Our definition of a formula of FOL is sufficiently precise to reason about the
logic, but it is still a bit vague if we try to actually implement an algorithm that
operates on these formulae.
Exercise
First-Order Logic
Explain how to implement formulae in FOL. Describe the data structure and
make sure that it can be manipulated in a reasonable way.
2
Exercise
Tarski and Truth
Give a precise definition of free and bound variables by induction on the
buildup of the formula.
Exercise
Model Checking
Implement a renaming algorithm that removes potential scoping clashes in the
variables of a formula.
Exercise
Implement a substituting algorithm that replaces a free variable in a formula by
a term.
What is Truth?
How can we explain precisely what it means for an assertion to be true, to be
valid? We would like some sweeping, global definition of truth that handles all
areas of discourse, at least in mathematics and computer science. While this
broad approach corresponds nicely to one’s philosophical assumptions about the
world (at least for an unreconstructed Platonist like myself) it leads to some
rather dicey problems: for example, what should we do with assertions like
“This sentence is false.”
We certainly would want to have the ability to declare certain assertions such
as “100 is a prime number” as false.
How can we then avoid paradoxes as in the self-referential statement above?
29
Tarski’s Theory of Truth
In order to succeed one needs to distinguish carefully between an object
language and a meta-language. It is also helpful to restrict one’s attention to
truth in a particular domain such as, say, group theory or number theory. In the
1930’s Alfred Tarski was the first to seriously tackle to problem of explaining
truth for a sentence in a formal language.
We already have a nice formal language, though so far a formula is just a
syntactic object, think of it as a string or a parse tree. We can now use Tarski’s
ideas to attach meaning to a formula, to establish a relationship with the world
of actual objects such as numbers, functions, hash tables, algorithms, and so
on.
The key is to define the truth value of a formula relative to a given structure, a
mini-universe that allows us to make sense out of the components of the
formula.
30
Semantics, Informally
31
32
The key problem is that the formula ∀ x ∃ y (x < y) itself does not express any
of the auxiliary information:
Here is a simple example from basic arithmetic (over the natural numbers).
Consider the formula
∀ x ∃ y (x < y)
There is no indication that the variables range over natural numbers, and
Its components are
x, y
<
Context Dependence
there is no indication that the binary relation symbol < corresponds to the
standard order relation on the naturals.
variables, range over natural numbers
comparison, the less-than relation
This would become more clear if we had written ∀ x ∃ y R(x, y) instead, the use
of a well-known symbol such as < is just syntactic sugar.
The intended meaning of the formula, expressed in classical math-speak, is then
Moreover, if we were to interpret the variables as ranging over the integers
instead (a relatively minor change) the formula would be false, invalid.
For any natural number x there exists a natural number y such
that x is less than y.
To settle all these issues we have to consider so-called structures over which
the formulae of FOL can be evaluated.
So this formula is clearly true, valid.
Semantics: Structures
33
Abstract Data Types
34
Note that a first order structure is not all that different from a data type. To
wit, we are dealing with a
So what exactly are these structures that we need to interpret a formula in
FOL? Fix some language L = L(Σ) where Σ is a graded alphabet as above.
Definition
collection of objects (values), and
A (first order) structure is a set together with a collection of functions and
relations on that set. The signature of a first order structure is the list of arities
of its functions and relations.
operations on these objects.
The relations can safely be interpreted as operations: a k-ary relation is just a
map R : Ak → B .
In order to interpret formulae in L(Σ) the signatures have to match, which we
will tacitly assume from now on. So a structure in general looks like so:
In the case where the carrier set is finite (actually, finite and small) we can in
fact represent the whole structure by a suitable data structure. For infinite
carrier sets things are a bit more complicated.
A = h A; f1 , f2 , . . . , R1 , R2 , . . . i
The set A is the carrier set of the structure. Unary and binary functions and
relations are by far the most important in applications, but higher arities may
occur.
Interpretations
Data types (or rather, their values) are manipulated in programs, we are here
interested in describing properties of structures using the machinery of FOL.
35
Given a formula and a structure of the same signature we can associate the
function and relation symbols in the formula with real functions and relations in
a structure (of the same arity).
f function symbol
❀
R function symbol
❀
f A a function in A
RA a relation in A
Given this interpretation of function and relation symbols over A we can
determine whether a formula holds true over A.
Example: Arithmetic
36
Consider arithmetic: The language is L(+, ·, 0, 1; <) and has type (2, 2, 0, 0; 2).
We can interpret a formula in this language over any structure of the same
type. Of course, the most important structure for arithmetic is
N = h N; +, ·, 0, 1, < i
the set of natural numbers together with the standard operations but we will
see that there are others.
This is very different from trying to establish universal truth. All we are doing
here is to confirm that, in the context of a particular structure, a certain
formula is valid. As it turns out, this is all that is really needed in the real world.
Note the slight abuse of notation here (as is standard practice). More precise
would be to write: The function symbol + is interpreted by +N , the standard
operation of addition of natural numbers.
Of course, when the structure changes the formula may well become false.
For our purposes, there is no gain in being quite so careful.
Some Arithmetic Formulae
37
The Reals
38
With this interpretation over N, the formula
Suppose we interpret our formulae over the structure of the real numbers
instead. This is possible since it has same signature (2, 2, 0, 0; 2):
0 < 1 is true
∀ x ∃ y (x < y) is true
R = h R; +, ·, 0, 1, < i
∀ x, y (x < y ⇒ ∃ z (x < z ∧ z < y)) is false
Now + refers to addition of reals, 0 is the real number zero, and so on. These
operations are much more complicated, but as far as our formula is concerned
all that matters is that they have the right arity.
How about the primality formula
ϕ(x) = 1 < x ∧ ∀ y, z (x = y · z ⇒ y = 1 ∨ z = 1)
Over the reals, the density formula
This formula has a free variable x, so we need to bind x (replace it by a term)
before we can determine truth.
∀ x, y (x < y ⇒ ∃ z (x < z ∧ z < y)
If we write n = 1 + 1 + . . . + 1 for the term representing n, the set of primes is
|
{z
}
just
holds.
ntimes
On the other hand, the primality statement ϕ(x) is not interesting over R:
there is no binding for x that makes it true.
{ n | ϕ(n) holds in N }.
Validity, Informally
39
We can now give a first informal definition of truth or validity.
Some Valid Formulae
40
It is clear that a formula like ϕ ⇒ ϕ is valid. In fact, if we replace the
propositional variables in any tautology by arbitrary sentences, we obtain a
valid sentence of FOL. More precisely, let
Definition
A formula of first-order logic is valid if it holds over any structure of the
appropriate signature.
ϕ(p1 , p2 , . . . , pn )
For free variables this definition is motivated by formulae such as x ∗ y = y ∗ x:
we want this to mean that the operation ∗ is commutative. Note, though, that
we apply the same approach to function and relation symbols (other than
equality): any possible interpretation has to be taken into account
be a tautology with propositional variables p1 , p2 , . . . , pn . Let ψ1 , . . . , ψn be
arbitrary sentences of FOL. Then
ϕ(ψ1 , ψ2 , . . . , ψn )
Of course, we can pin down some of the properties of the corresponding
functions and relations in the formula itself, but that’s all we can do. There is
no magic that says that the symbol + must always be interpreted as some sort
of addition over some algebraic structure.
Satisfiability, Informally
is a valid sentence of FOL. True, but not too interesting.
41
Again in analogy to propositional logic we can define satisfiability.
Quantifier Manipulations
Slightly more complicated examples for valid formulae are
Definition
∀ x ∀ y ϕ(x, y) ⇒ ∀ y ∀ x ϕ(x, y)
A formula of FOL is satisfiable if it is true for some interpretation of the
variables, functions and relations. It is a contradiction if it is true for no
interpretation of the variables, functions and relations.
∃ x ∃ y ϕ(x, y) ⇒ ∃ y ∃ x ϕ(x, y)
∃ x ∀ y ϕ(x, y) ⇒ ∀ y ∃ x ϕ(x, y)
Note, though, that the following is not valid:
∀ x ∃ y ϕ(x, y) ⇒ ∃ y ∀ x ϕ(x, y)
For example, in the language of binary relations the formula
∀ x, y, z x R x ∧ (x R y ∧ y R z ⇒ x R z)
is satisfied by any structure A that carries a reflexive transitive relation RA .
On the other hand,
∀ x (x 6= c)
where c is a constant, is a contradiction: we can interpret x as the element in
the structure denoted by c in which case equality holds.
Exercise
Verify that the first three formulae are true and come up with an example that
shows that the last one is not.
42
Counting
43
Infinity
44
How about a formula that states that there are infinitely many elements?
Another good source of valid formulae are assertions about the number of
elements in the underlying structure. How do we say “there are exactly n
elements in the ground set” in FOL? First, a formula which states that there
are at most n elements.
EX≥1 ∧ EX≥2 ∧ . . . ∧ EX≥n ∧ . . .
EX≤n = ∃ x1 , . . . , xn ∀ y (y = x1 ∨ . . . ∨ y = xn )
does not work since it is not a finite formula. The entirely plausible attempt
Second, a formula which states that there are at least n elements.
∀ n EX≥n
EX≥n = ∃ x1 , . . . , xn (x1 6= x2 ∧ x1 6= x3 ∧ . . . ∧ xn−1 6= xn )
All these formulae are clearly satisfiable. The conjunction EX≤n ∧ EX≥n pins
down the cardinality to exactly n. Also, a formula
also fails; we cannot quantify over formulae in our logic. But note that there
are more powerful logics where this kind of thing is within bounds.
EX≥n ⇒ EX≥m
Exercise
is valid whenever m ≤ n.
A Trick
Explain precisely why these “formulae” are not admissible in FOL.
45
And the Entscheidungsproblem
As always, we are interested in computational aspects, in particular natural
decision problems associated with this logic.
After some more fruitless attempts, one might suspect that the statement
“there are infinitely many thingies” cannot be expressed in FOL.
Wrong! Let f be a unary function symbol and c a constant. Following
Dedekind, consider
ϕ ≡ ∀ x (f (x) 6= c) ∧ ∀ x, y (f (x) = f (y) ⇒ x = y).
So ϕ states that f is not surjective but injective. Hence in any interpretation
that makes ϕ true the carrier set must be infinite.
Use a total order to produce another formulae in FOL that forces the ground
set to be infinite.
Time to give a precise definition of validity and satisfiability. We begin by
defining assignments in the context of FOL.
Problem:
Instance:
Question:
Validity
A FOL formula ϕ.
Is ϕ valid?
Problem:
Instance:
Question:
Satisfiability
A FOL formula ϕ.
Is ϕ satisfiable?
There is also a search version of Satisfiability: construct a satisfying
interpretation, if one exists. Note that this may well entail building an infinite
structure.
Exercise
Defining Validity
As one might suspect from the few examples, these problems are much harder
in FOL than in propositional logic and will turn out to be highly undecidable in
general.
47
Atomic Formulae
48
Once we have an assignment for all the free variables in an atomic formula, we
can determine a truth value for it.
Definition
Definition
An assignment or valuation (over a structure A) associates variables of the
language with elements in the ground set A.
Given an assignment σ : Var → A , we can associate an element σ(t) in A with
each term t.
Let σ be an assignment over a structure A and ϕ = R(t1 , . . . , tn ) an atomic
formula. Define the truth value of ϕ (under σ over A) to be
(
tt if RA (σ(t1 ), . . . , σ(tn )) holds,
Aσ (ϕ) =
ff
otherwise.
t = x: then σ(t) = σ(x)
Example
t = f (r1 , . . . , rn ): then σ(t) = f A (σ(r1 ), . . . , σ(rn ))
Over the natural numbers N suppose σ(x) = 0 and σ(y) = 1. Then
Example
Over N let σ(x) = 3. Then σ(x · (1 + 1))) = 6 whereas σ(x) = 0 produces
σ(x · (1 + 1))) = 0.
46
Nσ (x + y < 1 + 1) = Nσ (0 + 1 < 1 + 1) = tt
but for σ(x) = σ(y) = 1 we get
Nσ (x + y < 1 + 1) = Nσ (1 + 1 < 1 + 1) = ff
Logical Connectives
49
Pedantic Example
50
Consider the language of arithmetic.
Once we have a truth value for atomic formulae, we can extend this evaluation
to compound formulae without quantifiers.
Suppose we have a valuation σ(x) = 0 and σ(y) = 1.
Then
Definition (Connectives)
Nσ (x < y ∨ y < x) = Hor (Nσ (x < y), Nσ (y < x))
ϕ = ψ ∧ χ: then Aσ (ϕ) = Hand (Aσ (ψ), Aσ (χ))
= Hor (N(0 < 1), N(1 < 0))
ϕ = ¬ψ: then Aσ (ϕ) = Hnot (Aσ (ψ))
= tt
= Hor (tt, ff)
ϕ = ψ ∨ χ: then Aσ (ϕ) = Hor (Aσ (ψ), Aσ (χ))
Given a valuation for the free variables and a database for N it is
straightforward to compute the truth value of quantifier-free formulae in
arithmetic.
Here Hand denotes the Boolean function 2 × 2 → 2 corresponding to logical
“and,” and likewise for the other H functions.
These are just finite lookup tables.
Of course, quantifiers cause a few problems here and there . . .
Quantifiers
51
Validity and Models
52
For an assignment σ, let us write σ[a/x] for the assignment that is the same as
σ everywhere, except that σ[a/x](x) = a. Think of this as substituting “a for
x”.
Definition (Validity)
Definition (Quantifiers)
A formula ϕ is valid in A if it is valid in A for all assignments σ. The structure
A is then said to be a model for ϕ or to satisfy ϕ.
A formula ϕ is valid in A under assignment σ if Aσ (ϕ) = 1.
A sentence is valid (or true) if it is valid over any structure (of the appropriate
signature).
ϕ = ∃ x ψ:
Then Aσ (ϕ) = tt if there is an a in A such that Aσ[a/x] (ϕ) = tt.
ϕ = ∀ x ψ:
Then Aσ (ϕ) = tt if for all a in A Aσ[a/x] (ϕ) = tt.
Notation:
A |=σ ϕ,
Note the condition for validity: the formula has to hold in all structures.
53
Definition (Satisfiability)
A formula is satisfiable if there is some structure A and some assignment σ for
all the free variables in ϕ such that A |=σ ϕ.
|= ϕ
One also says that A is a model for ϕ. The same notation is used for sets of
formulae Γ. So A |= Γ (A is a model of Γ) means that A |= ϕ for all ϕ ∈ Γ.
Note that σ only needs to be defined on the free variables of ϕ to produce a
truth value for ϕ, the values anywhere else do not matter. If ϕ is a sentence, σ
can be totally undefined.
Satisfiability
A |= ϕ,
Isn’t this all circular?
The definitions of truth given here is due to A. Tarski (two seminal papers, one
in 1933 and a second one in 1956, with R. Vaught).
A frequent objection to this approach is that we are using “for all” to define
what a universal quantifier means.
In other words, the existentially quantified formula
∃ x1 , . . . , xn ϕ(x1 , . . . , xn )
has a model, where x1 , . . . , xn are all the free variables of ϕ.
In shorthand: ∃ x ϕ(x).
This is analogous to validity where we insist that ∀ x1 , . . . , xn ϕ(x1 , . . . , xn )
holds, or ∀ x ϕ(x), in compact notation.
True, but the formulae in our logic are syntactic objects, and define their
meaning in terms of structures, which are not syntactic, they are real (in the
world of mathematics and TCS).
Think of this as a program: you can “compute” the truth value of a formula,
as long as you can perform certain operations in the structure (evaluate f A ,
loop over all elements, search over all elements, . . . ). This is a bit problematic
over infinite structures, but for finite ones we can actually perform the
computation (at least if we ignore efficiency).
54
Computing Truth
55
More precisely, suppose we wanted to construct an algorithm
56
Theorem (A. Church 1936, A. Turing 1937)
Validity of first-order logic is undecidable.
ValidQ(A, ϕ)
that checks if formula ϕ is valid over structure A.
Strictly speaking this depends on the underlying language but a single binary
predicate symbol plus equality is enough to force undecidability.
What would be the appropriate input for such an algorithm? The easy part is
the formula: any standard representation will be fine.
Certain weak fragments of FOL are decidable (e.g., only unary predicate
symbols and no function symbols; restrictions on quantifiers also work).
But the structure A causes problems, at least in the important case where A is
infinite. We need some reasonable representation.
Semantic Consequence
Undecidability of FOL
But in general, for sufficiently rich languages and complicated formulae, validity
is simply undecidable.
57
Definition
Proofs
58
Gödel showed that there is a proof system for FOL: one can set up a class of
logical axioms and certain rules of inference that make it possible to perform
deductions. One writes
Given a set Γ of sentences and a sentence ϕ we say that ϕ is a semantic
consequence of Γ if any structure A that satisfies all formulae in Γ also satisfies
ϕ.
Γ ⊢ ϕ
Notation:
Γ |= ϕ
for syntactic consequence: there is a proof of ϕ with assumptions Γ (some set
of formulae).
In particular for Γ = ∅ we get validity: the formula holds in all structures of the
right type.
The details are not interesting to us, but note one simple fact: every proof in
the sense of Gödel is a simple, finite, combinatorial object (just a sequence of
formulae built according to simple rules, can be coded as an integer).
This is written |= ϕ.
Proofs versus Truth
Here are the core requirements for any reasonable proof system: anything that’s
provable is true, anything that’s true is provable and proofs are checkable.
59
Gödel’s Completeness Theorem
Theorem (Completeness Theorem, Gödel 1929)
First-order logic is sound, complete and effective.
Soundness: Γ ⊢ ϕ implies Γ |= ϕ
Completeness: Γ |= ϕ implies Γ ⊢ ϕ
Effectiveness: The set of all proofs is decidable.
Unsound systems or systems with undecidable proofs are essentially useless.
Incomplete systems might still be interesting provided they capture an
interesting and large part of all theorems in Γ. Completeness comes at a cost:
one needs to incorporate enough proof power, which may slow down algorithms.
Or, more succinctly:
Γ |= ϕ ⇐⇒ Γ ⊢ ϕ
At first glance, this may seem nearly impossible: the left hand side is ostensibly
much more complicated than the right; it involves the scary set-theoretic world
of all first-order structures.
60
Proofs and Semidecidability
61
Gödel’s Incompleteness Theorem
62
Completeness shows that proofs in FOL are very powerful, they do capture
truth in a sense.
As we mentioned, proofs can be enumerated systematically. Without going into
details, Gödel-style proofs are just sequences of formulae and they require no
insight into the deeper “meaning of the argument.”
Alas, only in a sense.
The details are somewhat technical, but it turns out that every step in the
proof system is fairly trivial from a computational point of view (easily primitive
recursive). So we can discover proofs by a brute-force, exhaustive search.
For Gödel also showed that there are no “complete” proof systems even for
relatively weak fragments of mathematics: there are always true assertions in
the system that cannot be proven in the system.
It follows that provability in FOL is semidecidable: we can perform an
(unbounded) search for a proof. Alas, there is no computable bound on the
length of a proof.
For example, a standard system of arithmetic like Peano arithmetic (PA) cannot
prove a statement that says, in essence, that the system itself is consistent.
With completeness this also shows that validity is semidecidable.
This is not a problem that can be fixed by changing the system; e.g., if we add
a formerly unprovable statement as a new axiom other gaps open up.
Compactness
63
A Weird Model for PA
64
If we think of PA as pinning down axiomatically the properties of the natural
numbers, here is some very bad news: N is not the only model of PA. To see
this, introduce a new constant c and add the following new axioms to PA:
The key to understanding how completeness and incompleteness can coexist
peacefully is yet another fundamental result that is referred to as compactness.
It follows quite easily from completeness.
0 < c, S(0) < c, S 2 (0) < c, . . . , S n (0) < c, . . .
Theorem (Compactness Theorem, Gödel 1930)
Call the new set of axioms PAI, as in pain, without the n.
A set Γ of sentences has a model if, and only if, every finite subset of Γ has a
model.
Claim
PAI has a model.
This may sound utterly unimpressive, but it opens the floodgate to the
construction of very weird structures: we can build entirely unintended models
of various axiom systems.
Proof. Exploit compactness. Any finite subset Γ of PAI contains only finitely
many of the new axioms. So there is a largest n such that S n (0) < c ∈ Γ.
But then we can simply interpret c as n + 1 in the standard model N, done.
Non-Standard Models
65
So let A be a model of PAI. Since A is in particular a model of PA we have a
full copy of N inside A: there are distinct elements for 0, S(0), S(S(0)) and so
forth.
But, this structure also contains an “infinitely large” element cA : since c is a
constant in the language it is interpreted by some element of A, and it follows
from the extra axioms that
A |= S n (0) < c
for all n ≥ 0.
Aside: Independence in Mathematics
The careful explanation of truth in terms of structures and formulae also helps
to explain the underlying foundational problems that make certain questions
very hard.
Take Cantor’s attempt to pin down the cardinality of the reals, and in
particular to show the Continuum Hypothesis (|R| = ℵ1 ).
As we now know, there are two models A and B of set theory such that
A |= (CH) is false
B |= (CH) is true
A
Of course, c is not infinitely large from the perspective of A, it’s just a
“natural number” there. As are cA + 1, cA − 1, cA + cA , (cA )2 and so on.
✷
So it makes little sense to talk about the truth of (CH); we have to bake the
answer into our axioms.
66
Deciding Truth
First-Order Logic
Tarski and Truth
3
68
Validity is a (the?) key problem for first-order logic. Here is another decision
problem that is hugely important:
Problem:
Instance:
Question:
Model Checking
Model Checking
A FO structure C and a FO sentence ϕ.
Is ϕ valid over C?
Note that if we are trying to find algorithms for this problem we will need to be
able to specify the structure C as part of the input. Obviously this is going to
be a bit tricky. At least for finite structures there is no problem, though.
Theories
69
Variant: Fixed Structure
70
It can be very interesting to solve the model checking problem even when the
structure C is fixed. For fixed structures one often speaks of the expression
complexity of the general problem.
Model checking is CS terminology, in standard mathematical logic one usually
speaks about the first-order theory of a structure.
Th(C) = { ϕ | C |= ϕ, ϕ sentence }
Problem:
Instance:
Question:
Model checking then comes down to determining whether ϕ ∈ Th(C).
Since formulae are just syntactic objects that can easily be coded as natural
numbers Th(C) might well be, say, decidable. This requires that the underlying
language is countable; an entirely reasonable assumption.
Expression Model Checking
A FO sentence ϕ.
Is ϕ valid over C?
E.g., we could try (and fail miserably) to decide arithmetic, i.e. which formulae
are true over the fixed structure N. This would provide answers to questions
like the infinitude of prime twins or the Goldbach conjecture. Even the
Riemann conjecture could be handled this way.
In other words, it’s utterly hopeless.
A Horror
Just to be clear, not only is Th(N) undecidable, we need an oracle of type ∅(ω)
to deal with it.
More precisely, the compute power needed to determine the truth of an
arithmetical formula depends on the complexity of the quantifiers. In particular
to deal with Σk formulae, one needs an oracle like ∅(k) .
And, just a single existential quantifier is already enough to produce the halting
problem.
71
The Reals
Somewhat counterintuitively, replacing the rationals by the reals (an
uncountable structure, no less) makes things easier: now the theory is
decidable by a famous result by Tarski.
Theorem (A. Tarski, 1948)
The theory of the reals with addition and multiplication is decidable.
As a consequence, basic geometry is decidable.
Interestingly, replacing integers by rationals does not help:
Theorem (J. Robinson, 1948)
The theory of the rationals with addition and multiplication is undecidable.
The theorem is proved by a very interesting technique that provides a direct
decision algorithm: quantifier elimination.
Tarski’s original method was highly inefficient, though (not bounded by a stack
of exponentials). There are better methods now, but the complexity is provably
doubly exponential.
72
Variant: Fixed Formula
73
Example: Reversibility
74
Cellular automata are a now classical example of discrete dynamical systems.
One can think of them as structures of the form
Here is another variant of model checking that may seem slightly less natural:
the formula is fixed and the structure varies.
C=
Data Model Checking
A FO structure C.
Is ϕ valid over C?
Problem:
Instance:
Question:
D
2Z , f
E
where f : 2Z → 2Z operates on bi-infinite sequences of bits, but is described by
a finite lookup table 2w → 2 (a Boolean function).
The following formula naturally expresses the fact that such a system in
reversible:
Usually the class of structures under consideration is fairly narrow. For
example, one might want to know whether a cellular automaton is reversible.
∀ x, y (f (x) = f (y) ⇒ x = y)
In CS one often speaks of data complexity in connection with this variant.
From a computational perspective, the underlying space is a horror. But we
still can ask simple questions about it and compute the correct answer.
Specifying Structures
Data model checking here comes down to trying to understand (some part of)
the structure of a particular cellular automaton. This can be interesting even
when w = 3, so the whole system is described by a single ternary Boolean
function. E.g., there is a width 3, binary cellular automaton that is
computationally universal.
For the general model checking problem we need, as part of the input,
structures
A = h A; f1 , f2 , . . . , R1 , R2 , . . . i
So we have to specify a set as well as a collection of functions and relations on
this set.
Expression model checking, on the other hand, is used to classify cellular
automata according to simple criteria. For example, we could determine which
systems are reversible.
This is quite tricky; obviously we have to move away from considering purely
set-theoretic objects, at the very least we should have some nice finitary
description of the structure.
Incidentally, this problem can be solved in quadratic time (in the size of the
table for f ).
Finite Structures
For example, the structure suitable for (elementary) arithmetic, N, certainly
has a nice description.
77
Complexity
Here is a sledge-hammer constraint: consider only finite structures.
In this case we can write down all the functions and relations over A as explicit
tables. For example, here is a multiplication table for a finite group over
A = {a, b, c, d, e, f, g, h}
e
a
b
c
d
f
g
h
e
e
a
b
c
d
f
g
h
a
a
c
e
b
f
h
d
g
b
b
e
c
a
g
d
h
f
c
c
b
a
e
h
g
f
d
Exercise
Determine the properties of this group.
d
d
f
g
h
e
a
b
c
f
f
h
d
g
a
c
e
b
g
g
d
h
f
b
e
c
a
h
h
g
f
d
c
b
a
e
76
In principle, it is not difficult to determine the validity of a sentence ϕ over a
finite structure; essentially one can just implement Tarski’s definition of truth.
Testing truth of the matrix of a formula, given bindings for all the quantified
variables, really comes down to repeated table lookup. To deal with quantifiers
we use loops ranging over the finite universe. So the algorithm is certainly
primitive recursive.
In fact, a closer look shows that PSPACE is already enough. Alas, that is
where things end: even when A = {0, 1} and we only have the standard
Boolean operations validity checking for FOL formula is PSPACE-complete
(this problem is often called QBF: quantified Boolean formulae).
78
Computable Structures
A much less radical idea is to consider structures where all components are
computable. This leads to computable model theory (or “recursive model
theory” in classical parlance). Note that any computable structure is by
necessity countable, but this is not a major concern for us. Alas, as N shows, a
(trivially) computable structure can still have a hugely complicated theory.
Presburger arithmetic uses the language L(+, −, 0, 1; <) of signature
(2, 2, 0, 0; 2), informally this is arithmetic without multiplication.
79
And Finite State Machines?
Here is a Crazy IdeaTM : how about structures that can be described by finite
state machines?
The carrier set would be a regular set of words, but that is not a particularly
drastic constraint (in particular since one can consider infinite words).
We need to do a bit of groundwork first, though. In particular, we have to
explain how a finite state machine can deal with functions and relations.
Theorem (M. Presburger, 1929)
Presburger arithmetic is decidable.
We’ll see a proof next week. Note that the algorithm is triple exponential. In
fact, even for quantifier-free formulae the problem is NP-hard.
Actually, we will get around functions by simply assuming there are no function
symbols in our language. This is not a big restriction, we can always translate a
function into its graph, a relation.
80

Download Report

First-Order Logic - Carnegie Mellon School of Computer Science

Paperzz.com

Your Paperzz