Plausibility Measures: A General Approach for Representing

Plausibility Measures: A General
Approach for Representing
Uncertainty
Joseph Halpern
Cornell University
Some of this work joint with Nir Friedman;
some joint with Francis Chu
Plausibility Measures: A General Approach for Representing Uncertainty – p. 1/33
Representing Uncertainty
How should we represent uncertainty?
The standard approach is probability:
Pros: precise, refined tradeoffs, well-understood foundations,
applications in decision theory.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 2/33
Representing Uncertainty
How should we represent uncertainty?
The standard approach is probability:
Pros: precise, refined tradeoffs, well-understood foundations,
applications in decision theory.
Cons: too precise, forces linear order, can be computationally
expensive
Plausibility Measures: A General Approach for Representing Uncertainty – p. 2/33
The Alternatives . . .
Many alternatives to probability have been suggested:
qualitative probability
non-additive probability
lexicographic probability system (lps)
sets of probability measures
(Dempster-Shafer) belief functions
possibility theory and fuzzy logics
κ-rankings
...
Plausibility Measures: A General Approach for Representing Uncertainty – p. 3/33
A Unified Framework
Plausibility measures provide a single unified framework for studying
representations of uncertainty. They allow us to
capture all previously studied approaches
prove general results about representations of uncertainty
get a deeper understanding of what makes a representation “good”
provide semantics for
belief and belief change
counterfactual reasoning
preferential (default) reasoning
provide a general approach to decision theory
Plausibility Measures: A General Approach for Representing Uncertainty – p. 4/33
Plausibility Measures
A probability space is a tuple (W, F , Pr), where
W is a set of worlds,
F is an algebra of measurable subsets of W (a set of subsets
closed under finite union and complementation), and
Pr is a probability measure, mapping sets in F to [0, 1].
A plausibility space is a direct generalization:
Replace Pr by a plausibility measure Pl, which maps sets in F to
elements of some arbitrary set D partially ordered by ≤.
Assume D contains special elements ⊥ and ⊤ such that
⊥ ≤ d ≤ ⊤, Pl(∅) = ⊥, and Pl(W ) = ⊤.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 5/33
Conditions on Plausibility
Pl1.
Pl(W ) = ⊤.
Pl2.
Pl(∅) = ⊥.
Pl3. If U
⊆ V , then Pl(U ) ≤ Pl(V ).
These are minimal requirements for a representation of likelihood.
Lack of structure makes everything a plausibility measure.
Having few requirements allows us to impose additional
requirements as needed.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 6/33
Default Reasoning
Defaults are statements of the form ϕ
→ψ
read “if ϕ then typically ψ ”.
Defaults appear often in natural language.
An (in)famous example:
Bird → Fly
Penguin →
¬Fly
Penguin → Bird
Default reasoning is not monotonic:
If A
→ C , then we may not have A ∧ B → C
Nevertheless, it seems that it should obey certain properties.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 7/33
The KLM Properties
While there is no consensus regarding the properties of defaults, , there
is an agreement on a common “core” of default reasoning.
Kraus, Lehmann and Magidor suggest a set of properties (the
“KLM properties”) that characterize this core.
≡ ϕ′ , then from ϕ → ψ infer ϕ′ → ψ .
Always infer ϕ → ϕ.
If ψ ⇒ ψ ′ , then from ϕ → ψ infer ϕ → ψ ′ .
From ϕ → ψ1 and ϕ → ψ2 infer ϕ → ψ1 ∧ ψ2 .
If ϕ1 → ψ and ϕ2 → ψ infer ϕ1 ∨ ϕ2 → ψ .
From ϕ → ψ1 and ϕ → ψ2 infer ϕ ∧ ψ1 → ψ2 .
(LLE) If ϕ
(REF)
(RW)
(AND)
(OR)
(CM)
Plausibility Measures: A General Approach for Representing Uncertainty – p. 8/33
Giving Semantics to Defaults
Many (quite different) semantics for default reasoning are characterized
by the KLM properties.
Plausibility structures help explain why.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 9/33
Preferential Structures
A preferential structure [KLM] is a tuple
P = (W, ≺, π), where π maps each world in W to a truth assignment,
and ≺ is a partial order on W .
x ≺ y denotes that x is more preferred than y .
For now, we assume that ≺ is well-founded.
May have two different worlds associated with the same truth
assignment (π(w)
= π(w ′ ) for w 6= w ′ )
A preferential structure provides semantics for defaults:
P |= ϕ → ψ if the most plausible worlds (i.e., minimal according
to ≺) that satisfy ϕ also satisfy ψ .
Plausibility Measures: A General Approach for Representing Uncertainty – p. 10/33
Preferential Entailment
Lehmann and Magidor define entailment w.r.t. to a class C of
preferential structures:.
∆ entails ϕ → ψ w.r.t. C if every P ∈ C that satisfies all the
defaults in ∆ also satisfies ϕ → ψ .
Theorem: [KLM] The KLM properties provide a sound and complete
axiomatization of entailment w.r.t. preferential structures.
∆ entails ϕ → ψ w.r.t. all preferential structures iff ϕ → ψ can be
derived from ∆ using the KLM properties.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 11/33
Defaults as Extreme Probability
An alternative semantics of defaults is in terms of extreme probabilities
[Adams; Pearl].
Trying to capture the intuition that ϕ
→ ψ means Pr(ψ|ϕ) ≈ 1.
But how do we capture “≈ 1”?
A parametrized probability distribution (PPD) is a sequence (Prn ) of
probability distributions on some space W .
(W, (Prn ), π) |= ϕ → ψ if lim Prn (ψ | ϕ) = 1
n→∞
Intuitively, this means that (according to (Prn )) the probability of the
default is arbitrarily close to 1.
Note:
Pr(ϕ) = Pr({w ∈ W : w |= ϕ}).
Plausibility Measures: A General Approach for Representing Uncertainty – p. 12/33
ǫ-Entailment
∆ ǫ-entails a default ϕ → ψ if
for every family (Prn ), if (Prn )
|= ∆ then (Prn ) |= ϕ → ψ .
as the confidence in ∆ goes to 1, so does the confidence in
ϕ → ψ.
Theorem: [Geffner] The KLM properties provide a sound and complete
axiomatization of entailment w.r.t. ǫ-entailment.
It is remarkable that two totally different interpretations of
defaults yield identical sets of conclusions and identical sets
of reasoning machinery.
– J. Pearl, 1989
Plausibility Measures: A General Approach for Representing Uncertainty – p. 13/33
Other Semantics for Defaults
Possibility measures [Zadeh; Dubois/Prade]. These are functions
that assign to each event a degree of possibility between 0 and 1.
P(∅) = 0, P(W ) = 1, P(A ∪ B) = max(P(A), P(B)).
(W, P, π) |= ϕ → ψ if P(ϕ) = 0 or P(ϕ ∧ ψ) > P(ϕ ∧ ¬ψ).
κ-rankings [Spohn, Goldszmidt/Pearl]. These are functions that
assign to each event an ordinal that measures degree of surprise.
κ(W ) = 0, κ(∅) = ∞, κ(A ∪ B) = min(κ(A), κ(B)).
(W, κ, π) |= ϕ → ψ if κ(ϕ) = ∞ or κ(ϕ ∧ ψ) < κ(ϕ ∧ ¬ψ).
The KLM properties again characterize entailment for possibility
measures and κ-rankings.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 14/33
Defaults in Plausibility Structures
Why do we always get the KLM properties?
A plausibility structure is a tuple P L
= (W, Pl, π), where (W, Pl) is a
plausibility space.
P L |= ϕ → ψ if Pl(ϕ) = ⊥ or Pl(ϕ ∧ ψ) > Pl(ϕ ∧ ¬ψ).
This is just the obvious generalization of the definition used for
possibility and κ-ranking.
if Pl is a probability function Pr, then
(W, Pr, π) |= ϕ → ψ iff Pr(ϕ) = 0 or Pr(ψ|ϕ) > 1/2.
Key observation: We can map preferential structures and PPDs into
plausibility structures while still preserving the semantics of defaults.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 15/33
From Preference to Plausibility
Given a partial order ≺ on W , we define a plausibility measure Pl≺ on
W with the following properties:
1. If x
≺ y , then Pl≺ ({y}) < Pl≺ ({x}).
2. If Pl≺ ({a})
3.
≤ Pl≺ (B) for all a ∈ A, then Pl≺ (A) ≤ Pl≺ (B).
Pl≺ is the least plausibility measure satisfying 1 and 2.
We can think of Pl≺ as extending ≺ so that it is an ordering on sets,
not just on elements.
Theorem:
(W, ≺, π) |= ϕ → ψ iff (W, Pl≺ , π) |= ϕ → ψ .
Plausibility Measures: A General Approach for Representing Uncertainty – p. 16/33
From ǫ-Semantics to Plausibility
Given a PPD P R
= (Pr1 , Pr2 , . . .) defined on W , we can define a
plausibility measure PlP R on W such that
PlP R (A) ≤ PlP R (B) iff limn→∞ Prn (B|A ∪ B) = 1.
Intuitively, PlP R (A)
Theorem:
≤ PlP R (B) iff B is “almost all” of A ∪ B .
(W, P R, π) |= ϕ → ψ iff (W, PlP R , π) |= ϕ → ψ .
Plausibility Measures: A General Approach for Representing Uncertainty – p. 17/33
KLM and Plausibility Structures
It is easy to see that every plausibility structure satisfies LLE, RW, and
REF. In general, they do not satisfy AND, OR, or CM.
Example: It is easy to construct a probability function such that
Pr(q1 |p) > 1/2, Pr(q2 |p) > 1/2, and Pr(q1 ∧ q2 |p) < 1/2.
What does it take to make a plausibility measure satisfy the KLM
properties?
Plausibility Measures: A General Approach for Representing Uncertainty – p. 18/33
Getting the AND Rule
To get the AND rule, reverse engineering shows we need
A2′ . For all sets A, B1 , and B2 , if
Pl(A ∩ B1 ) > Pl(A ∩ B1 ) and
Pl(A ∩ B2 ) > Pl(A ∩ B2 )
then Pl(A ∩ B1
Proposition:
∩ B2 ) > Pl(A ∩ (B1 ∩ B2 )).
Pl satisfies A2′ iff Pl satisfies
A2. For all pairwise disjoint sets A, B , and C , if
Pl(A ∪ B) > Pl(C) and
Pl(A ∪ C) > Pl(B),
then Pl(A) > Pl(B ∪ C).
Plausibility Measures: A General Approach for Representing Uncertainty – p. 19/33
Qualitative Plausibility Structures
We could reverse engineer rules for CM and OR, but we don’t need to.
CM follows from A1 and A2
OR almost does too
|= ϕ1 → ψ , Pl |= ϕ2 , and either
Pl(ϕ1 ) 6= ⊥ or Pl(ϕ2 ) 6= ⊥, then Pl |= (ϕ1 ∨ ϕ2 ) → ψ .
If Pl satisfies A1 and A2, Pl
To get the full OR rule, we need
A3. If Pl(A)
= Pl(B) = ⊥, then Pl(A ∪ B) = ⊥.
A plausibility structure (W, Pl, π) is qualitative if Pl satisfies A2 and
A3. Let P QPL consist of all qualitative plausibility structures.
Theorem: If P
⊆ P QPL , then the KLM properties are sound in P .
Plausibility Measures: A General Approach for Representing Uncertainty – p. 20/33
Why we always get KLM
Theorem: The following sets of plausibility are all qualitative:
P κ : κ rankings
P P oss : possibility measures
P P : preferential structures (using the mapping)
P r : modular preferential structures
P ǫ : PPDs (using the mapping)
Proof: E.g., to see that possibility measures satisfy A2, suppose
A, B , C disjoint; P(A ∪ B) > P(C); P(A ∪ C) > P(B).
Then elements of maximum possibility in A ∪ B
Conclusion:
∪ C must all be in A.
P(A) > P(B ∪ C).
Plausibility Measures: A General Approach for Representing Uncertainty – p. 21/33
Why we don’t get more than KLM
⊆ P QP L , the KLM properties are sound in P . If we take
“too small” a subset of P QP L , we may get extra properties.
As long as P
Def: A set P is rich if for every sequence of mutually exclusive formulas
ϕ1 , . . . , ϕn , there is a plausibility structure (W, Pl, π) ∈ P such that
Pl(ϕ1 ) > · · · > Pl(ϕn ) = Pl(∅).
Theorem: A set P of qualitative plausibility structures is rich iff the KLM
properties are sound and complete w.r.t. P .
Theorem: Each of P P oss , P κ , P ǫ , P p , and P r is rich.
Bottom line: A2, A3 + richness is what we need for the KLM
properties to characterize default reasoning.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 22/33
Counterfactual Reasoning
Counterfactual reasoning is used all the time in day-to-day reasoning:
If he had only stopped at the stop sign, he would not have had the
accident
If Oswald had not shot Kennedy, Kennedy would still be alive today.
It is also critical in analyzing causality
Even if it weren’t raining and I wasn’t drunk, I wouldn’t have had the
accident if the brakes weren’t faulty.
Therefore the faulty brakes caused the accident.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 23/33
Semantics for Counterfactuals
Read ϕ
≻ ψ as “if ϕ were true, then ψ would be true”.
ϕ ≻ ψ is true at a world w if, at the
closest worlds to w where ϕ is true, ψ is also true.
Idea (Lewis, Stalnaker):
ϕ ≻ ψ is true at w if ϕ is highly plausible given ψ ,
according to the plausibility measure used at w .
Alternate idea:
Now need a possibly different plausibility measure Plw for each
world w .
But what properties should Plw have?
That depends on the desired properties of counterfactuals.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 24/33
Properties of Counterfactuals
C0.
(ϕ ∧ ϕ ≻ ψ) ⇒ ψ .
C1.
ϕ ≻ ϕ.
C2.
((ϕ ≻ ψ1 ) ∧ (ϕ ≻ ψ2 )) ⇒ (ϕ ≻ (ψ1 ∧ ψ2 )).
C3.
((ϕ1 ≻ ψ) ∧ (ϕ2 ≻ ψ)) ⇒ ((ϕ1 ∨ ϕ2 ) ≻ ψ).
C4.
((ϕ1 ≻ ϕ2 ) ∧ (ϕ1 ≻ ψ)) ⇒ ((ϕ1 ∧ ϕ2 ) ≻ ψ).
C5. If ψ1
⇒ ψ2 , then (ϕ ≻ ψ1 ) ⇒ (ϕ ≻ ψ2 )
C6. If ϕ1
⇔ ϕ2 then ϕ1 ≻ ψ ⇔ ϕ2 ≻ ψ .
C1–6 correspond to REF, AND, OR, and CM.
To capture them, we need to assume that Plw is qualitative
(satisfies A2 and A3).
Plausibility Measures: A General Approach for Representing Uncertainty – p. 25/33
Capturing C0
C0.
(ϕ ∧ ϕ ≻ ψ) ⇒ ψ .
C0 essentially says that the current world is the most plausible
according to Plw .
w is the “closest” world to itself
This is easy to formalize:
If w
∈ A and w ∈
/ B , then Plw (A) > Plw (B).
Bottom line: We can capture counterfactual reasoning using
plausibility measures:
This leads to different intuitions regarding counterfactuals.
It also leads to simpler proofs of completeness.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 26/33
Belief Change
How should we revise our beliefs in the light of unexpected information.
Probabiliistically, this amounts to conditioning on an event of
measure 0
Alchourrón, Gärdenfors, and Makinson initiated a study of this topic,
proposing various properties that revision should satisfy.
Since then, many approaches to belief revision have been
proposed
Again, all can be understood in terms of conditioning plausibility
measures.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 27/33
Decision Theory
The standard rule for decision making is maximizing expected utility
(meu). But there are many others.
minimax
Choose the action that gives the best worst-case payoff
regret minimization
...
Plausibility provides a general framework for defining decision rules.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 28/33
Standard Notation
There is a set A of possible actions
For each action a
∈ A, there is an associated utility:
ua (w): the utility of performing action a in world w
Plausibility Measures: A General Approach for Representing Uncertainty – p. 29/33
Plausibilistic Expectation
Assume that there are two partially ordered sets
D – the set of plausibility values
D′ – the set of utility values
⊕ : D′ × D′ → D′
⊗ : D′ × D → D′
Given plausibility measure Pl with range D , define expectation the
obvious way:
EPl (X) = ⊕x∈X x ⊗ Pl(X = x)
Using this definition, can define expected utility maximization with
respect to a plausibility measure in the obvious way.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 30/33
Generalized expected utility
Every decision rule can be viewed as an instance of maximing expected
utility (with respect to an appropriate plausibility measure and utility):
Theorem: Given an arbitrary preference order > on actions, there is a
plausibility measure Pl, utility u, ⊕, and ⊗ such that a
> a′ iff
EPl (ua ) > EPl (ua′ ).
Bottom line: whatever the decision rule, we can always think in terms
of maximizing expected (plausibilistic) utility
Plausibility Measures: A General Approach for Representing Uncertainty – p. 31/33
Example: capturing minimax
Take D
= {0, 1} and D′ = IR.
Let Plmm (U )
= 1 if U 6= ∅; Plmm (∅) = 0.
Let ⊕ be min and let ⊗
= ×.
EPlmm (ua ) is the utility of worst-case outcome when performing a
minimax = choosing the action a that maximizes EPlmm (ua )
a has higher worst-case utility than a′ iff
EPlmm (ua ) > EPlmm (ua′ ).
Plausibility Measures: A General Approach for Representing Uncertainty – p. 32/33
Summary
Plausibility provides a general framework for developing a theory of
representing uncertainty:
Allows us to understand what is needed to capture (conditional)
belief, counterfactual reasoning, default reasoning, belief change,
and decision making.
Can also examine conditioning and independence in a very
general setting.
Key step to providing compact representations of plausibility
measures using Bayesian networks
By using plausibility, we can “engineer” an appropriate
representation of uncertainty.
Plausibility Measures: A General Approach for Representing Uncertainty – p. 33/33