TELEOLOGICAL ARGUMENTS AND THEORY

DRAFT
Giovanni Sartor
CIRSFID, University of Bologna
[email protected]
TELEOLOGICAL ARGUMENTS AND THEORY-BASED DIALECTICS
ABSTRACT
This paper proposes to model legal reasoning as dialectical theory-construction directed by teleology.
Precedents are viewed as evidence to be explained through theories. So, given a background of factors and
values, the parties in a case can build their theories by using a set of operators, which are called theory
constructors. The objective of each party is to provide theories that both explain the evidence (the
precedents) and support the decision wished by that party. This leads to theory-based argumentation, i.e., a
dialectical exchange of competing theories, which support opposed outcomes by explaining the same
evidence and appealing to the same values. The winner is the party that can reply with a more coherent
theory to all theories of its adversary.
1. INTRODUCTION
The contribution to teleological argumentation of Berman & Hafner (1993) provides a major insight into
legal argument: rules and cases, abstracted from the purposes that they serve, cannot provide us with an
adequate (computational) model of legal reasoning (cf. Bench-Capon 2000).¹
We will here try to develop this insight in the framework of a model of legal argumentation that differs
from most analyses so far proposed within AI & law. In those analyses (cf. for all, Gordon 1995), the
argumentation process is viewed as consisting in the exchange of arguments, i.e. of inferences supporting or
attacking contested propositions. In such a process, victory goes to the party proposing the strongest
argument, possibly within certain procedural constraints. Here, on the contrary, we view argumentation as
being the process through which parties exchange theories, i.e. alternative comprehensive accounts of a
controversial domain. Victory goes to the party that succeeds in providing the most coherent theory.
Arguments (inferences) still figure in this account, since the implications of each theory will be established
according to an argument logic. This logic, however, only provides semantics for the theories put forward
by the parties in the dispute; it is not a model for the interaction of the parties.
2. THE EXAMPLE
The benchmark for our approach will be represented by the cases discussed in Berman & Hafner (1993),
which are synthesised as follows by Bench-Capon (2000):
In the first, Pierson v Post, the plaintiff was hunting a fox in the traditional manner using horse and hound
when the defendant killed and carried off the fox. The plaintiff was held to have no right to the fox because he
had gained no possession of it. In the second case, Keeble v Hickeringill, the plaintiff owned a pond and made
his living by luring wild ducks there with decoys and shooting them. Out of malice the defendant used guns to
scare the ducks away from the pond. Here the plaintiff won. In a third case, Young v Hitchens, both parties
were commercial fisherman. While the plaintiff was closing his nets, the defendant sped into the gap, spread
his own net and caught the fish. In this case the defendant won.
1 Footnote: this contribution originated as a comment on Bench-Capon (2000), which the author sent me before its submission. I
thank Trevor Bench-Capon, Carole Hafner, Henry Prakken, and Andrew Stranieri for many helpful remarks on earlier drafts of the
present paper. This paper has been followed by two contributions co-authored with Trevor Bench-Capon (Bench-Capon & Sartor
2001a, Bench-Capon & Sartor 2001b) to which I refer the reader for developments and refinements of some ideas here presented and
for references to related work, in particular, for a discussion of the connections with the HYPO and CATO projects.
In all those cases, the plaintiff π was chasing an animal. The defendant δ intervened stopping the chase, so
defeating the objective of π. π is arguing for the conclusion that he has a legal remedy against δ, while δ is
arguing that no such remedy exists. Berman & Hafner (1993) consider how the decision in Young v
Hitchens can be justified on the basis of the previous decisions in Pierson v Post and in Keeble v
Hickeringill. They agree with Ashley (1990) in focusing on factors, i.e. those (abstract) features of a case
that may possibly influence its outcome. However, they argue that understanding a case-law domain requires
going beyond factors, and looking at the underlying values. Correspondingly, our formalisation will be
based upon the identification of both factors and values. Here are abbreviations for the factors we will
consider:
πLiv = π was pursuing his livelihood
πLand = π was on his own land
πNposs = π was not in possession of the animal
δLiv = δ was pursuing his livelihood.
The values are the following:
LLit = Less Litigation
MProd = More productivity
MSec = More security of possession.
3. THE EVIDENCE
We will now introduce the formal framework we will use to deal with the example above. Let us first
characterise the starting point of the theory-construction exercise of the debating parties. This is the so-called explanandum (Hempel 1966, 51), i.e. the evidence that the competing theories are trying to explain.
Here we assume that evidence is constituted by a set of precedents, where each precedent is characterised by
a set of factors (the possibly relevant features of the case), and by an outcome (the judicial decision in that
case):
In the example above, we have two alternative outcomes, Π for the plaintiff and ∆ for the defendant,
where Π means “π has a legal remedy against δ”, and ∆ means “π has no legal remedy against δ”. As we
have seen above, in Pierson, where π had no possession of the animal, the outcome was ∆; in Keeble, where
π was pursuing his livelihood on his own land, and had no possession of the animal, the outcome was Π.
Therefore the explanandum is represented as follows:
Pierson = {Factors: πNposs. Outcome: ∆}
Keeble = {Factors: πLiv, πLand, πNposs. Outcome: Π}.
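As an illustration, the explanandum could be held in a small data structure. The sketch below is only one possible encoding: the class name `Case`, its field names and the constants `PI` and `DELTA` are assumptions introduced here, not part of the formal framework.

```python
from dataclasses import dataclass
from typing import FrozenSet

PI, DELTA = "Π", "∆"  # the two alternative outcomes of the example


@dataclass(frozen=True)
class Case:
    """A precedent: a set of factors plus the outcome decided by the court."""
    name: str
    factors: FrozenSet[str]
    outcome: str


PIERSON = Case("Pierson", frozenset({"πNposs"}), DELTA)
KEEBLE = Case("Keeble", frozenset({"πLiv", "πLand", "πNposs"}), PI)

# The explanandum is simply the collection of precedents to be explained.
EXPLANANDUM = (PIERSON, KEEBLE)
```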
4. THE BACKGROUND KNOWLEDGE
Besides possessing certain evidence, the parties are here assumed to share a certain background knowledge,
which includes two components.
The first is what we call a “factor-background”. A factor-background specifies what outcomes are
supported by what factor. This, as in Prakken & Sartor (1998), will be represented through a set of rules (no
special connotation is linked here to the term “rule”: other words expressing a conditional connection, such
as “link” or “warrant”, could be used synonymously). Each rule links a (possibly conjunctive) factor α to the
outcome γ supported by that factor. Such a rule may be understood as a defeasible conditional α ⇒ γ, which
we read as “α is a reason for the outcome γ”. For example, πLiv ⇒ Π means “that π was pursuing his
livelihood is a reason why π should have a legal remedy against δ”, or more simply, “if π was pursuing his
livelihood, then π has a legal remedy against δ”.
The consequent of a rule may not be the final result that π or δ are aiming to establish, but it may also
be an intermediate outcome that contributes to establishing the final result (this is considered in Bench-Capon & Sartor 2001b). Such a consequent may also consist in affirming the inapplicability of another rule
(undercutting it), i.e. in affirming that under certain conditions a certain factor does not support its
conclusion. However, here we assume for simplicity that the factor-background only consists of rules
establishing one of the two ultimate outcomes Π or ∆:
πLiv ⇒ Π
πLand ⇒ Π
πNposs ⇒ ∆
δLiv ⇒ ∆.
The second element of the background knowledge is what we call a “value-background”. Here we will
consider a value-background that only consists of what we call “teleological links”. A teleological link
involves two elements, a rule α ⇒ γ and a value V, and can be understood as the conjunction of two assertions:
• the goal V is a (legal) value, i.e. an objective which is pursued by the legal system,
• the general adoption of the rule α ⇒ γ (its being used by legal agents as a standard for their reasoning
and practice) would advance the achievement of V.
Let us simplify this double assertion as “α ⇒ γ promotes V”. A value-background may include many other
components, such as a specification of the relative importance of the values, of their relations (achieving
some values may impact positively or negatively on others), etc. Here, however, the value-background will
be limited to elementary teleological links (for simplicity’s sake, we assume that single values do not
interfere with each other, and that all rules promoting the same value do that to the same degree):
πLiv ⇒ Π promotes MProd
πland ⇒ Π promotes MSec
πNposs ⇒ ∆ promotes LLit
δLiv ⇒ ∆ promotes MProd.
Teleological links do not need to concern just one value: their general form is “R promotes {V1, …, Vn}”,
where {V1, …, Vn} is the set of the values advanced by the rule R. However, we take the liberty of omitting
brackets, when single values are involved, as in the examples above.
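The two components of the background knowledge can be written down as plain data. The following is a minimal sketch, assuming that a rule is encoded as a pair (antecedent factors, outcome) and that a teleological link maps a rule to the set of values it promotes; the names `make_rule`, `FACTOR_BACKGROUND`, `VALUE_BACKGROUND` and `supporting_rules` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed encoding: a rule is (antecedent factors, outcome).
Rule = Tuple[FrozenSet[str], str]

PI, DELTA = "Π", "∆"


def make_rule(outcome: str, *factors: str) -> Rule:
    """Build the rule 'factors ⇒ outcome'."""
    return (frozenset(factors), outcome)


# Factor-background: which factor supports which outcome.
FACTOR_BACKGROUND: List[Rule] = [
    make_rule(PI, "πLiv"),
    make_rule(PI, "πLand"),
    make_rule(DELTA, "πNposs"),
    make_rule(DELTA, "δLiv"),
]

# Value-background: teleological links "R promotes {V1, ..., Vn}".
VALUE_BACKGROUND: Dict[Rule, FrozenSet[str]] = {
    make_rule(PI, "πLiv"): frozenset({"MProd"}),
    make_rule(PI, "πLand"): frozenset({"MSec"}),
    make_rule(DELTA, "πNposs"): frozenset({"LLit"}),
    make_rule(DELTA, "δLiv"): frozenset({"MProd"}),
}


def supporting_rules(outcome: str) -> List[Rule]:
    """Return the background rules whose consequent is the given outcome."""
    return [r for r in FACTOR_BACKGROUND if r[1] == outcome]
```

Using frozen sets for antecedents makes rules hashable, so they can serve directly as dictionary keys for the teleological links.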
5. THE TASK OF THE PARTIES
The two parties, π and δ are meeting in the framework of a new case; let us call it Current Situation (CS).
CS also is characterised by a set of factors, but its outcome is not determined (or it is determined, but we
consider it as undetermined, since we want to use CS as a test for our theory, i.e. to see if our theory can foresee
its outcome). In the example, the new situation is represented by:
Young = {Factors: πLiv, πNposs, δLiv. Outcome: ?}.
π will try to provide a theoretical hypothesis, i.e. a set of sentences (an explanans, in the terminology of
Hempel 1966, 51) that both explains all precedents (the explanandum) and gives Young the outcome that is
desired by π (i.e., the decision Π). Let us call such theoretical hypotheses π-theories. The replies of δ will
consist in alternative theoretical hypotheses, the δ-theories, which still explain all precedents, but imply in
Young the outcome desired by δ (the decision ∆).
6. THEORY CONSTRUCTORS
Besides sharing background knowledge, parties will also share some basic strategies or heuristics for theory
construction. One may view those heuristics also as patterns for analogical inference, rather than as
ways of providing new content to a theory (cf. Prakken 2000). However, it is useful to distinguish inferences
made in the theory construction phase, from those less problematic inferences performed on the basis of the
constructed theory. To mark this difference we shall call the first inferences “theory constructors”.
The first theory constructor, which we call factor-merging, consists in building more complex rules on
the basis of simpler ones. The idea is that by joining factors supporting the same conclusion, we obtain a
stronger factor pointing to that same conclusion. This can be viewed as a rudimentary formalisation of the
so-called “a fortiori” argument. In other words, from any two rules
1. α ⇒ γ
2. β ⇒ γ
one can construct the following:
1. α & β ⇒ γ
2. α & β ⇒ γ > α ⇒ γ
3. α & β ⇒ γ > β ⇒ γ.
The second theory constructor, which we call value-merging, consists in building more complex teleological
links from simpler ones: if two rules (having the same consequent) promote different values, then the new
rule obtained by merging those rules promotes all those values (the union of those sets). In other words,
from any two teleological links:
1. α ⇒ γ promotes V1, and
2. β ⇒ γ promotes V2
one can construct the following:
α & β ⇒ γ promotes V1 ∪ V2.
Value-merging is complemented by an ordering over sets of values. Here we adopt a minimal approach to
ordering, which we call value-ordering: any set of values is more important than any of its proper subsets.
According to value-ordering, given any sets of values V1 and V2, we can add to any theory the statement
(V1 ∪ V2) > V1 or the statement (V1 ∪ V2) > V2.
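Value-merging and value-ordering can be illustrated in the same style. This is a sketch under assumptions: a teleological link is encoded as (rule, set of promoted values), and `value_ordering` only returns statements in which the smaller set is a proper subset of the union, following the proper-subset condition stated above. Both function names are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]        # (antecedent factors, outcome)
Link = Tuple[Rule, FrozenSet[str]]       # (rule, promoted values)
ValuePref = Tuple[FrozenSet[str], FrozenSet[str]]  # (more important, less)


def value_merge(link1: Link, link2: Link) -> Link:
    """From 'α ⇒ γ promotes V1' and 'β ⇒ γ promotes V2' build
    'α & β ⇒ γ promotes V1 ∪ V2'."""
    (alpha, gamma1), v1 = link1
    (beta, gamma2), v2 = link2
    if gamma1 != gamma2:
        raise ValueError("value-merging needs rules with the same outcome")
    return ((alpha | beta, gamma1), v1 | v2)


def value_ordering(v1: FrozenSet[str], v2: FrozenSet[str]) -> List[ValuePref]:
    """Return the statements (V1 ∪ V2) > V for each V that is a proper
    subset of the union."""
    union = v1 | v2
    return [(union, v) for v in (v1, v2) if v < union]
```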
The third theory constructor, which we call rule-preference-from-value-preference, consists in
introducing preferences between rules on the basis of preferences between values. The assumption is that
rules promoting more important values are stronger than those promoting less important values. More
precisely, given that a theory contains
1. V1 > V2,
2. R1 promotes V1, and
3. R2 promotes V2,
where V1 and V2 are the sets of all values respectively promoted by R1 and R2, one can expand the theory
with the new preference:
R1 > R2.
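The third constructor amounts to a simple lookup. In the sketch below, rules are opaque labels, `promotes` maps each rule to the set of all values it promotes, and `value_prefs` is the set of value preferences already in the theory; all three names, and the function name, are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Hashable, Optional, Set, Tuple

Values = FrozenSet[str]


def rule_preference_from_value_preference(
    r1: Hashable,
    r2: Hashable,
    promotes: Dict[Hashable, Values],
    value_prefs: Set[Tuple[Values, Values]],
) -> Optional[Tuple[Hashable, Hashable]]:
    """Return the preference R1 > R2, encoded as (r1, r2), if the theory
    contains V1 > V2 for the value sets promoted by R1 and R2; else None."""
    v1, v2 = promotes[r1], promotes[r2]
    return (r1, r2) if (v1, v2) in value_prefs else None
```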
The fourth theory constructor, which we call rule-broadening, consists in introducing a more general rule on
the basis of a more specific one, already contained in the theory. So, if the theory contains α & β ⇒ γ, one
may expand it with α ⇒ γ or β ⇒ γ. This aspect is emphasised by Bench-Capon (1999), who represents
cases as graphs, where each rule is linked to the broadenings derivable from it.
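Rule-broadening can be sketched as the enumeration of weaker antecedents. The text only mentions dropping one conjunct from a two-factor rule; generalising to every non-empty proper subset of the antecedent is an assumption of this sketch, as is the function name `broadenings`.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent factors, outcome)


def broadenings(rule: Rule) -> List[Rule]:
    """Return every rule obtained by keeping a non-empty proper subset
    of the antecedent and the same outcome."""
    antecedent, outcome = rule
    factors = sorted(antecedent)
    return [
        (frozenset(subset), outcome)
        for size in range(1, len(factors))
        for subset in combinations(factors, size)
    ]
```

Applied to πLiv & πLand ⇒ Π, this yields the two broadenings πLiv ⇒ Π and πLand ⇒ Π; a single-factor rule has no broadenings.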
The fifth theory constructor, which we call rule-preference-from-case, consists in introducing
preferences between rules when these preferences contribute to explaining the precedents. More precisely,
given that
1. a theory T does not explain precedents C1, …, Cn and
2. {R1 > R2, …, Rj > Rk} is a minimal set of preferences such that T ∪ {R1 > R2, …, Rj > Rk} explains
C1, …, Cn,
then we can add to the theory the new preferences
R1 > R2, …, Rj > Rk.
This reasoning move can be viewed as a form of abduction, i.e. as the introduction of a hypothesis that is
justified by its ability to explain the evidence (within the available theoretical framework).
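This abductive step can be sketched as a search for a smallest set of preferences that makes the theory explain all precedents. The sketch is hedged in several ways: it uses the simplified one-step justification test given in the section on logic, it takes "minimal" to mean smallest cardinality, and the names `implies` and `preferences_from_cases` are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple

Rule = Tuple[FrozenSet[str], str]   # (antecedent factors, outcome)
Preference = Tuple[Rule, Rule]      # (stronger rule, weaker rule)
Case = Tuple[FrozenSet[str], str]   # (factors, outcome)


def implies(rules: List[Rule], prefs: Set[Preference],
            factors: FrozenSet[str], outcome: str) -> bool:
    """Simplified one-step test: some applicable rule for the outcome is
    preferred to every applicable opposing rule, and not vice versa."""
    applicable = [r for r in rules if r[0] <= factors]
    pro = [r for r in applicable if r[1] == outcome]
    con = [r for r in applicable if r[1] != outcome]
    return any(
        all((p, c) in prefs and (c, p) not in prefs for c in con)
        for p in pro
    )


def preferences_from_cases(rules: List[Rule], prefs: Set[Preference],
                           cases: Iterable[Case]) -> Optional[Set[Preference]]:
    """Return a smallest set of new preferences between opposing rules
    under which all cases are explained, or None if there is none."""
    cases = list(cases)
    candidates = [(a, b) for a in rules for b in rules if a[1] != b[1]]
    for size in range(len(candidates) + 1):
        for extra in combinations(candidates, size):
            extended = set(prefs) | set(extra)
            if all(implies(rules, extended, f, o) for f, o in cases):
                return set(extra)
    return None
```

The exhaustive search is exponential in the number of opposing rule pairs, which is acceptable only for toy theories such as those in the example.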
Finally, the last theory constructor, which we call arbitrary-rule-preference, consists in introducing a
new preference between rules, which is not necessary for explaining precedents, nor obtained from value-preferences, nor supported by further information in the background knowledge, but is required for
justifying a certain result in CS.
Other similar constructors could also be added to the model, to allow, for example, the introduction of
value preferences on the basis of rule preferences, or the introduction of values as explanatory hypotheses,
but we will not consider them here (see Bench-Capon & Sartor 2001b).
7. LOGIC
The logic determining the semantics (the meaning, or the implications) of the theories put forward by
the parties of the dispute will be a dialectical logic. This is because each party must include in his theory,
besides the reasons (the factors) supporting the conclusion he is aiming at, also the reasons favouring his
adversary. If the party just considered his own reasons, he would be accused of being biased and one-sided,
and would lose to a competitor showing a more bipartisan understanding. On the contrary, each party's task
is that of showing that a decision for his side is implied by the (allegedly) most balanced account of the
controversial domain, i.e. by the account that gives the most thorough and impartial consideration to the
circumstances favouring his adversary. Therefore, each party’s theory will licence inferences for the
adversary (inferences which, on the basis of factors favouring the adversary, conclude that the latter should
win), though the party will claim that these inferences are defeated by prevailing inferences favouring his
side.
From our viewpoint, the theory of one party and its logic will be dialectical in the sense respectively of
including reasons, counter-reasons and meta-reasons (reasons for preferring certain reasons to certain others)
and of licensing an architecture of corresponding inferences, rather than in the sense of modelling or
constraining a real dialogue. In particular, though we shall call “arguments” the inferences available within
one theory (as usual in argumentation logics), we do not view these inferences as explicit statements of the
parties of the dispute and, in particular, we do not assume that each party states all and only the inferences
favouring his or her side. Similarly, the mechanism adjudicating the conflicts between such arguments (the
so-called “argumentation framework”) is no protocol for a dispute, but only a way of specifying what
conclusions are justified (or implied) by a single theory.
Consequently, we will distinguish on the one hand the dialectical exchange between the two parties, and
on the other hand the dialectical semantics of their theories. The dialectical exchange concerns the
articulation and the refinement of alternative competing theories, while the dialectical semantics, based
upon an argumentation logic, concerns establishing the defeasible implications of each theory. In our
approach, therefore, there is no opposition but rather complementariness between theory construction and
dialectical logic (for a contrary view, cf. McCarty 1997).
Here we use the argumentation logic of Prakken & Sartor (1996), where the reader can find a formal
definition. In the following, we will give a very simplified idea of this logic, which is sufficient for
making sense of our example. For simplicity, we say that factors and preferences (in addition to
rules linking factors to outcomes) are also rules: in general, we view an unconditioned statement as a rule with an
empty antecedent (a factor or preference ϕ can be viewed as the abbreviation for the rule ⇒ ϕ). We also
assume that our theories only contain ground formulas (formulas containing variables are replaced by all
their ground instances).
The first notion is that of an argument. We say that a finite set of rules A is an argument if any rule φ1 &
… & φn ⇒ ϕ in A is preceded by rules with consequents φ1, …, φn. All consequents of rules in A are
conclusions of A (each of them is derivable from rules in A, by repeatedly applying modus ponens). For
example, in a premises set S1 = {α, β, α ⇒ γ , β ⇒ ¬γ}, B1 = {α, α ⇒ γ} is an argument for γ (and α),
while B2 = {β, β ⇒ ¬ γ} is an argument for ¬ γ (and β). Arguments including rules with conflicting
consequents (α ⇒ γ, β ⇒ ¬ γ ), are said to be each other’s counterarguments (we do not consider here the
possibility of undercutting, on which cf. Prakken & Sartor 1996).
The second notion is that of defeat, which provides a way of adjudicating conflicts between arguments.
Of two counterarguments A1 and A2, including conflicting rules r1 and r2 respectively, we say that A1 defeats
A2 iff it is not the case that r1 < r2, according to A2. To assess the strength of the conflicting rules, we rely on
what the competing arguments say: to avoid being defeated by A1, A2 must conclude for a preference r1 < r2.
When argument A1 defeats argument A2, but A2 does not defeat A1, we say that A1 strongly defeats A2. So,
B1 and B2 above defeat each other, since none of them says anything on the relative strength of the
competing rules α ⇒ γ and β ⇒ ¬γ. On the contrary, B3 = {α, α ⇒ γ, “α ⇒ γ”>“β ⇒ ¬γ”} is not defeated
by B2 (though defeating it), since B3 includes, and therefore supports, preference “α ⇒ γ”>“β ⇒ ¬γ”.
Therefore B3 strongly defeats B2.
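The defeat relation on the arguments B1, B2 and B3 can be checked mechanically. The sketch assumes one-step arguments, each carrying a single conflicting rule, and encodes a preference as a pair (stronger rule, weaker rule); the class `Argument` and the function names are hypothetical, and both functions presuppose that their inputs are counterarguments of each other.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Rule = Tuple[FrozenSet[str], str]   # (antecedent, consequent)
Preference = Tuple[Rule, Rule]      # (stronger rule, weaker rule)


@dataclass(frozen=True)
class Argument:
    facts: FrozenSet[str]
    rule: Rule                                   # the conflicting rule
    preferences: FrozenSet[Preference] = frozenset()


def defeats(a1: Argument, a2: Argument) -> bool:
    """A1 defeats A2 unless A2 itself states that r1 < r2."""
    return (a2.rule, a1.rule) not in a2.preferences


def strongly_defeats(a1: Argument, a2: Argument) -> bool:
    """A1 defeats A2 and A2 does not defeat A1."""
    return defeats(a1, a2) and not defeats(a2, a1)


R_GAMMA: Rule = (frozenset({"α"}), "γ")
R_NOT_GAMMA: Rule = (frozenset({"β"}), "¬γ")

B1 = Argument(frozenset({"α"}), R_GAMMA)
B2 = Argument(frozenset({"β"}), R_NOT_GAMMA)
B3 = Argument(frozenset({"α"}), R_GAMMA, frozenset({(R_GAMMA, R_NOT_GAMMA)}))
```

With these definitions, B1 and B2 defeat each other, while B3 defeats B2 without being defeated by it.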
Finally, the logic of Prakken & Sartor (1996) provides a division of all arguments (available in a certain
premises set) into three categories: justified, defensible and overruled ones. Only justified arguments have
the capacity of establishing justified conclusions, on the basis of that premises set, i.e. conclusions that are
supported or implied by the information contained in that set. Defensible arguments are the uncertain ones,
which cannot be relied upon, but which still can effectively defeat other arguments, so preventing them from
being justified. Overruled arguments, finally, are useless, being defeated by stronger arguments, which are
justified. We need not consider here how to evaluate arguments (see the definition in Prakken & Sartor 1996,
which addresses multi-step arguments and reinstatement), since in our paper we will only consider one-step
arguments, and our background knowledge does not allow for conflicting preferences. Under these
conditions, we may simply say that an argument A1 is justified when, for each counterargument A2, A1
contains a preference according to which one of its rules is stronger than a rule in A2, and A2 contains no
preference according to which one of its rules is stronger than a rule in A1. Obviously this also holds when no
counterarguments are available.
For example, premises set S2 = {α, β, α ⇒ γ , β ⇒ ¬γ, α ⇒ γ > β ⇒ ¬γ}, contains argument B1 =
{α, α ⇒ γ, α ⇒ γ > β ⇒ ¬γ} and argument B2 = {β, β ⇒ ¬γ}. B1 includes the preference α ⇒ γ >
β ⇒ ¬γ, stating that the rule α ⇒ γ of B1, is stronger that than the opposed rule β ⇒ ¬ γ of B2.
Therefore, B1 defeats B2, (without being defeated by it), and emerges as being justified, within S2.
Consequently, γ is a justified consequence of S2, i.e. γ is implied by S2. Note that γ was not implied by S1
above, since according to S1, A1 is no justified argument, being defeated by A2.
The adoption of this dialectical logic allows us to clarify in what sense a theory explains a case. A
theory T explains a case C, with factors α1, …, α n and outcome γ, if the premises set T ∪ { α1, …, α n}
implies γ, i.e., if T ∪ { α1, …, α n} contains a justified argument with conclusion γ. Similarly a theory T
supports a certain outcome γ in the current situation CS (where the current situation is a set of factors) if T
∪ CS implies γ. For example, the theory
T1Π = { 1. πLiv ⇒ Π promotes MProd [from background knowledge (BGK)];
2. πNposs ⇒ ∆ [from BGK];
3. πLiv ⇒ Π > πNposs ⇒ ∆ [explanation-thorough-preferences, in regard to Keeble] }
explains both Pierson (with factors πNposs and outcome ∆), where only the antecedent of the rule
πNposs ⇒ ∆ is satisfied, and Keeble (with factors πLiv, πLand, πNposs and conclusion ∆), where both
rules πLiv ⇒ Π and πNposs ⇒ ∆ are satisfied, but the Π-rule prevails, being stronger, according to T1Π (the
argument is { πLiv, πLiv ⇒ Π, πLiv ⇒ Π > πNposs ⇒ ∆}). It also supports (justifies) Π in Young,
where the same arguments can be built as in Keeble. Note that for a theory to explain a case it is not
necessary that the theory considers all factors in the case. For example, T1Π succeeds in explaining Keeble,
though its rules do not refer to pLand .
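The claims about T1Π can be verified with a small sketch of the "explains" relation. It relies on the simplified one-step condition for justified arguments stated above, and it reads item 1 of T1Π as contributing the rule πLiv ⇒ Π; the names `Theory`, `make_rule` and `implies` are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Rule = Tuple[FrozenSet[str], str]   # (antecedent factors, outcome)
Preference = Tuple[Rule, Rule]      # (stronger rule, weaker rule)
PI, DELTA = "Π", "∆"


def make_rule(outcome: str, *factors: str) -> Rule:
    return (frozenset(factors), outcome)


@dataclass(frozen=True)
class Theory:
    rules: FrozenSet[Rule]
    prefs: FrozenSet[Preference]


def implies(theory: Theory, factors: FrozenSet[str], outcome: str) -> bool:
    """True if T ∪ factors contains a justified one-step argument for the
    outcome: an applicable rule preferred to every applicable opponent."""
    applicable = [r for r in theory.rules if r[0] <= factors]
    pro = [r for r in applicable if r[1] == outcome]
    con = [r for r in applicable if r[1] != outcome]
    return any(
        all((p, c) in theory.prefs and (c, p) not in theory.prefs for c in con)
        for p in pro
    )


LIV = make_rule(PI, "πLiv")
NPOSS = make_rule(DELTA, "πNposs")
T1_PI = Theory(frozenset({LIV, NPOSS}), frozenset({(LIV, NPOSS)}))

PIERSON = frozenset({"πNposs"})
KEEBLE = frozenset({"πLiv", "πLand", "πNposs"})
YOUNG = frozenset({"πLiv", "πNposs", "δLiv"})
```

Under this encoding, `implies(T1_PI, PIERSON, DELTA)` and `implies(T1_PI, KEEBLE, PI)` both hold, so T1Π explains the two precedents, and `implies(T1_PI, YOUNG, PI)` holds as well.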
8. COHERENCE
As we said above, a dispute consists in a dialectical exchange of theories. Victory goes to the party
providing a theory that is better than any theory of the adversary. The criterion to measure the comparative
strength of competing theories will be the idea of coherence. We will not provide here a precise notion of
coherence, nor an exhaustive one (on coherence, cf. Thagard 2001; on coherence in the law, cf. among
others, Alexy & Peczenik 1990). We will just consider some properties of theories that seem relevant to this
idea in the present domain. A theory is coherent, in regard to a certain set of cases (the evidence) and certain
background knowledge, constituted by rules and teleological links, to the extent that it satisfies the
following criteria:
• Case-coverage. This consists in the ability of explaining cases. A more coherent theory succeeds in
explaining a larger set of cases: the theory includes justified arguments connecting the factors in the
precedents and the corresponding outcome. In this paper we use set-inclusion as the metric for
comparing explained cases, i.e. A is larger than B iff A ⊃ B. However, weaker criteria are also
compatible with the approach here developed, such as focusing on the cardinality of sets of
explained cases, or on the importance of the cases they contain.
• Factor-coverage. This consists in the ability of taking into account factors in the explained cases. A
more coherent theory explains cases by combinations of (justified) arguments and (overruled)
counterarguments that refer to a larger set of factors.
• Analogical-connectivity. This consists in the fact that premises in a theory are obtainable through
analogies from other premises in the theory. By analogies we mean here those theory-construction
operators that extract new premises from premises already in the theory, and therefore reflect a
content connection between their result and their preconditions. Here the analogies are provided by
the operators factor-merging, value-merging, rule-preference-from-value-preference, and rule-broadening.
• Non-arbitrariness. This consists in the fact that a theory does not contain unsupported premises. A
premise is unsupported if it is not required for explaining the precedents, nor is included in the
background knowledge, nor is obtainable through analogies from supported premises.
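The first two criteria lend themselves to a simple set-based comparison. The sketch below is a rough proxy only: it measures factor-coverage by the factors mentioned in a theory's rules, which simplifies the definition above, and the names `covers_more` and `factor_coverage` are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent factors, outcome)


def covers_more(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    """Set-inclusion metric: A is larger than B iff A ⊃ B."""
    return a > b


def factor_coverage(rules: Iterable[Rule]) -> FrozenSet[str]:
    """Proxy for factor-coverage: all factors mentioned in the rules."""
    covered = set()
    for antecedent, _outcome in rules:
        covered |= antecedent
    return frozenset(covered)
```

For case-coverage, `covers_more` is applied to the sets of explained cases; for factor-coverage, to the results of `factor_coverage` on the two competing theories.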
9. THEORY-BASED DIALECTICS. FACTOR-BASED REASONING
Let us now consider how the parties may proceed in constructing their theories. First we will consider how
they reason with rules, and then how they use teleological links.
Let π propose theory T1Π above, which as we have seen, explains both Keeble and Pierson, and
justifies Π in Young. T1Π has a weakness (incoherence) in so far as it does not consider factor πLand in
Keeble and factor δLiv in Young (it is incoherent under the criterion of factor-coverage). One reply by δ
may consist in theory T1∆.
T1∆: {
1. πLiv & πLand ⇒ Π [from BGK + factor-merging];
2. πNposs ⇒ ∆ [from BGK];
3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆ [from preference-from-case in regard to Keeble] }.
Theory T1∆ provides a distinction in regard to T1Π. In fact, it substitutes T1Π’s rule πLiv ⇒ Π with the
more specific rule πLiv & πLand ⇒ Π, which is not satisfied in Young. T1∆ still explains Π in Keeble,
but supports ∆ in Young (instead of Π). T1∆ is better than T1Π under factor-coverage, since it considers,
besides factors πLiv and πNposs, also factor πLand , and in particular it provides a more thorough
explanation of Keeble. T1∆ can be countered with the following Π-theory, based upon broadening πLiv &
πLand ⇒ Π into πLiv ⇒ Π:
T2Π: { 1. πLiv & πLand ⇒ Π; 2. πNposs ⇒ ∆; 3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π [from BGK, and also by rule-broadening from 1];
5. πLiv ⇒ Π > πNposs ⇒ ∆ [arbitrary-rule-preference] }.
T2Π, which justifies Π in Young, is as coherent as T1∆, as far as factor-coverage is concerned. It is
better under the criterion of analogical connectivity, since it includes both the broadened rule πLiv &
πLand ⇒ Π and its broadening πLiv ⇒ Π. However, it is defective under the criterion of non-arbitrariness, since the preference πLiv ⇒ Π > πNposs ⇒ ∆ is unnecessary for explaining the
precedents: its only use is that of justifying the outcome wished by π in CS. Another possible δ-theory is the
following.
T2∆: { 1. πLiv & πLand ⇒ Π; 2. πNposs ⇒ ∆; 3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π;
5. πNposs & δLiv ⇒ ∆ [from BGK + factor-merging];
6. πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π [arbitrary-rule-preference] }.
T2∆ succeeds in explaining both Pierson and Keeble and considers more factors than T2Π does, providing a
more thorough explanation of Young (since it includes also factor δLiv). However, also T2∆ is incoherent for
its arbitrariness, since it includes the preference πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π, which is
unsupported by the evidence (it is unnecessary for explaining the precedents). In fact T2∆ can be countered
by the following T3Π theory, which scores equally well under all coherence criteria (it just includes a
different arbitrary preference):
T3Π: { 1. πLiv & πLand ⇒ Π; 2. πNposs ⇒ ∆; 3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆;
4. πLiv ⇒ Π;
5. πNposs & δLiv ⇒ ∆;
6. πNposs & δLiv ⇒ ∆ < πLiv ⇒ Π [arbitrary-rule-preference]}.
10. THEORY-BASED DIALECTICS. VALUE-BASED REASONING
As we have just seen, at the level of factor-based reasoning both parties have failed to provide a theory that
is more coherent than the best theory of their adversary. They can remedy the defects of their theories (under
the criteria of factor-coverage and analogical connectivity) only by making those theories defective under
non-arbitrariness. As Berman and Hafner (1993) observed, the dispute may only be decided when moving to
teleological reasoning. Let us now consider the theory T2∆ above:
T2∆: { 1. πLiv & πLand ⇒ Π; 2. πNposs ⇒ ∆; 3. πLiv & πLand ⇒ Π > πNposs ⇒ ∆; 4. πLiv ⇒ Π;
5. πNposs & δLiv ⇒ ∆; 6. πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π }.
This theory considers all factors, gives δ the result she wants, and includes an analogical connection, but is
infected by the arbitrariness of the preference πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π. This arbitrariness can
be removed by expanding T2∆ with the following value-subtheory:
T2∆V:{ 7. πLiv ⇒ Π promotes MProd [from values-BGK];
8. πNposs & δLiv ⇒ ∆ promotes {MProd, LLit} [from values-BGK + value-merging];
9. {MProd, LLit} > MProd [from value-ordering]}
T2∆V shows that πNposs & δLiv ⇒ ∆ promotes a larger set of values than πLiv ⇒ Π does (as stated in
7 and 8), which means that the first rule promotes a more important set of values than the latter does (as
stated in 9). This supports the conclusion that the rule πNposs & δLiv ⇒ ∆ is stronger than its competitor
(according to rule-preference-from-value-preference). This is exactly the preference stated in 6 above,
which can now be given an appropriate support: πNposs & δLiv ⇒ ∆ > πLiv ⇒ Π is supported by
premises 7, 8 and 9, according to the constructor rule-preference-from-value-preference.
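The derivation of preference 6 from premises 7 to 9 can be replayed end to end. This is a sketch under assumptions: rules are pairs (antecedent factors, outcome), value-ordering is rendered as the proper-superset test on value sets, and outcomes are evaluated with the simplified one-step justification test; all identifiers are hypothetical.

```python
from typing import FrozenSet, Set, Tuple

Rule = Tuple[FrozenSet[str], str]
PI, DELTA = "Π", "∆"


def make_rule(outcome: str, *factors: str) -> Rule:
    return (frozenset(factors), outcome)


def implies(rules, prefs, factors, outcome) -> bool:
    """Simplified one-step test for a justified argument."""
    applicable = [r for r in rules if r[0] <= factors]
    pro = [r for r in applicable if r[1] == outcome]
    con = [r for r in applicable if r[1] != outcome]
    return any(all((p, c) in prefs and (c, p) not in prefs for c in con)
               for p in pro)


R1 = make_rule(PI, "πLiv", "πLand")
R2 = make_rule(DELTA, "πNposs")
R4 = make_rule(PI, "πLiv")
R5 = make_rule(DELTA, "πNposs", "δLiv")
RULES = [R1, R2, R4, R5]

# Teleological links taken from the value-background.
promotes = {
    R4: frozenset({"MProd"}),                          # premise 7
    R2: frozenset({"LLit"}),
    make_rule(DELTA, "δLiv"): frozenset({"MProd"}),
}
# Premise 8, by value-merging of the two ∆-rules.
promotes[R5] = promotes[R2] | promotes[make_rule(DELTA, "δLiv")]

prefs: Set[Tuple[Rule, Rule]] = {(R1, R2)}             # premise 3
# Premise 9 (value-ordering) holds when one set properly contains the other;
# rule-preference-from-value-preference then yields premise 6.
if promotes[R5] > promotes[R4]:
    prefs.add((R5, R4))

PIERSON = frozenset({"πNposs"})
KEEBLE = frozenset({"πLiv", "πLand", "πNposs"})
YOUNG = frozenset({"πLiv", "πNposs", "δLiv"})
```

With the derived preference in place, the resulting theory explains Pierson and Keeble and supports ∆ in Young, without any arbitrary premise.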
Arguably, there is no theory that is more coherent than the resulting theory T3∆ (including all lines from
1 to 9) since it:
• explains all precedents,
• considers all factors,
• includes analogical connections,
• contains no arbitrary assumptions.
Note also that value-subtheory T2∆V, besides integrating T2∆ in a coherent whole (T3∆), where support links
connect the two subtheories, also succeeds in undermining the competing theory T3Π: the preference
πNposs & δLiv ⇒ ∆ < πLiv ⇒ Π is not only arbitrary, but also inconsistent with premise 6 of theory
T3∆. The latter premise cannot be easily eliminated since, as we just saw, it is analogically connected to
value propositions that were legitimately derived from the background knowledge.
11. VALUES AND THE EVOLUTION OF CASE LAW
In the model above, one important aspect is missing, i.e., an account of the dynamics of case law, as it
depends on the evolution of the socio-political context. This dynamics seems to undermine the very
possibility of constructing a coherent theory of a case-law domain: how is it possible to fit in a single theory
cases which were decided differently, even in the presence of the same constellations of factors, since
different decisions were required by different contexts?
We will not try to provide a full-fledged discussion, nor a complete formalisation, but only sketch the
essential features of one solution that can be developed in the framework here proposed. The key is in the
notion of “promoting”. Let us recall that “R promotes V” is an ellipsis for the conjunction of two
statements:
1. V is a legal value,
2. the general practice of rule R would advance the achievement of V.
Here we will not consider statement 1, since the discussion of changes in values involves deep and
controversial philosophical issues. Are values objective, conventional or merely subjective? Are they eternal
and universal or relative to particular times and places? On the contrary, the question of whether and how
much certain values are going to be advanced through certain (rule-based) practices concerns an empirical
connection, which undoubtedly is dependent upon changing socio-economic conditions. Even if ultimate
legal values remain unchanged, the ways in which the practice of a specific rule impacts on them may
change over time (a similar change would also concern instrumental values, but we will not consider them
here).
For example, it may be argued that under the circumstances prevailing in modern industrialised
countries, hunting has lost its ancient economic function: rather than contributing to productivity, it may
detract from it. This may be true especially when hunting hinders some forms of recreation (watching wild
animals, hiking, etc.) and so jeopardises the livelihood of those involved in the corresponding economic
activities (hotel personnel, tour operators, tourist guides, etc.). In such a context, the practice of the rule
πHunt ⇒ Π (if a plaintiff is hunting a wild animal then he has a legal remedy against a defendant who
interrupted the chase), by facilitating hunting, does not promote social productivity, but rather impairs it.
Consequently, even though in the past it was right to give a πHunt-case the outcome Π, recently it may have
become right to decide an identical πHunt-case with ∆.
To model this phenomenon, we need to provide theories which are capable of explaining conflicting
decisions, adopted on the basis of the same set of factors, but taken in different times, when the impact of
(the practice of) the rules contemplating those factors on the relevant values has changed (this issue was
generally addressed in Berman & Hafner 1995).
Let me sketch how this may be possible through a slight change in the logic introduced above. Let me
first assume that priority statements have a temporal specification: they say that rule R1 prevails over rule
R2, at time τ1, abridged as R1 >(at τ1) R2. Note that this preference is consistent with affirming that R1 <(at τ2) R2.
Correspondingly, we say that an argument A1 defeats(at τ) its counterargument A2, if A2 does conclude for A1
<(at τ) A2. So argument A, from premises set S, is justified(at τ) within S, if A is not defeated(at τ) by any
justified(at τ) argument in S. Finally, premises set S implies(at τ) α if α is the conclusion of an argument which
is justified(at τ) within S.
The constructor rule-preference-from-values-preferences also needs to be “temporalised” as follows.
Given that a theory contains the following:
1. V1 > V2;
2. R1 promotes(at τ) V1;
3. R2 promotes(at τ) V2
we can add to it the rule preference:
R1 >(at τ) R2.
Let us further adopt the following temporal axiom TA, which allows for a rudimentary temporal reasoning
(a more sophisticated treatment of temporal notions could obviously be embedded in the model here
proposed):
TA: If R promotes V from τ1 to τ2, and τ is contained in the interval <τ1, τ2>, then R promotes(at τ) V.
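The temporalised constructor and axiom TA can be illustrated with a minimal sketch. The encoding below is only an assumption made for illustration: rules and values are plain strings, the names PROMOTES, VALUE_ORDER, promotes_at and prefers_at are hypothetical, intervals are taken to be inclusive, and an open-ended interval is approximated by date.max.

```python
from datetime import date

# Hypothetical value background: (rule, value, start, end), intervals inclusive.
PROMOTES = [
    ("R1", "V1", date(1900, 1, 1), date(1980, 1, 1)),
    ("R2", "V2", date(1900, 1, 1), date.max),  # open-ended interval (assumption)
]

# Value preferences stored as (better, worse) pairs: here V1 > V2.
VALUE_ORDER = {("V1", "V2")}


def promotes_at(rule, value, t, promotes=PROMOTES):
    """Axiom TA: rule promotes(at t) value if t falls inside a stated interval."""
    return any(
        r == rule and v == value and start <= t <= end
        for r, v, start, end in promotes
    )


def prefers_at(r1, r2, t, value_order=VALUE_ORDER, promotes=PROMOTES):
    """Temporalised constructor: r1 >(at t) r2 if, for some V1 > V2,
    r1 promotes(at t) V1 and r2 promotes(at t) V2."""
    return any(
        promotes_at(r1, better, t, promotes) and promotes_at(r2, worse, t, promotes)
        for better, worse in value_order
    )


print(prefers_at("R1", "R2", date(1950, 1, 1)))
print(prefers_at("R1", "R2", date(2000, 1, 1)))
```

Under these assumed intervals the preference R1 >(at τ) R2 can be derived for a τ in 1950 but not for a τ in 2000, since R1 no longer promotes V1 at the later date.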
Let us consider two cases, where, at different times, the same combination of factors, that is {A, B}, led to
opposite outcomes (O, ¬O):
Cold: Factors: A, B. Outcome: O. Time: 1.1.1950
Cnew: Factors: A, B. Outcome: ¬O. Time: 1.1.2000.
Let the factors-background be:
A ⇒ O
B ⇒ ¬O.
Let the value-background be:
A ⇒ O promotes V1 from 1.1.1900 to 1.1.1980
B ⇒ ¬O promotes V2 from 1.1.1900 to Now
V1 > V2 .
The task, as above, is to build a theory that succeeds in explaining both Cold and Cnew. To do that we need to
make the notion of an explanation time-sensitive: a theory explains a case if the theory implies the outcome
of the case at the time where the case was decided. More exactly, a theory Τ explains a case C, with factors
α1, …, α n, outcome γ, and time τ, if Τ ∪ { α1, …, α n} implies(at τ) γ, i.e., if γ is a justified(at τ) conclusion of
T ∪ { α1, …, α n}. Let us now consider the following theory T1:
1. A ⇒ O (from factors-BGK)
2. B ⇒ ¬O (from factors-BGK)
3. A ⇒ O promotes V1 from 1.1.1900 to 1.1.1980 [from BGK]
4. B ⇒ ¬O promotes V2, from 1.1.1900 to Now [from BGK]
5. V1 > V2 [from BGK]
6. A ⇒ O >(at 1.1.1950) B ⇒ ¬O [from 3, TA, 5, rule-preference-from-value-preferences]
7. A ⇒ O <(at 1.1.2000) B ⇒ ¬O [from 4, TA, value-ordering, rule-preference-from-value-preferences].
T1 ∪ {A, B} both implies(at 1.1.1950) O, and implies(at 1.1.2000) ¬O. This is because argument A1 = {A, A ⇒ O}
strongly defeats(at 1.1.1950) argument A2 = {B, B ⇒ ¬O}, while A2 strongly defeats(at 1.1.2000) A1. On the one
hand, A1 strongly defeats(at 1.1.1950) A2, according to the preference A ⇒ O >(at 1.1.1950) B ⇒ ¬O, which is
derived from the value-preference V1 > V2, given that A ⇒ O promotes(at 1.1.1950) V1, and that B ⇒ ¬O
promotes(at 1.1.1950) V2. On the other hand, A2 strongly defeats(at 1.1.2000) A1, according to the preference A ⇒
O <(at 1.1.2000) B ⇒ ¬O, which is derived from the value-preference ∅ < V2 (value V2 is better than no
value at all), given that A ⇒ O promotes(at 1.1.2000) ∅ (the empty set of values), while B ⇒ ¬O
promotes(at 1.1.2000) V2. So, as we wanted, T1 succeeds in explaining both Cold and Cnew, although the two cases
provide opposite outcomes for the same combination of factors.
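The worked example can be checked with a small sketch. It is a deliberate simplification and not the argumentation logic itself: defeat and justification are collapsed into a comparison of the values each applicable rule promotes at the decision time, with the empty set of values ranked below every value. All names (PROMOTES, VALUE_RANK, RULES, implied_outcome, explains) are hypothetical, "Now" is approximated by date.max, and ¬O is written "notO".

```python
from datetime import date

NOW = date.max  # assumption: "Now" modelled as an open-ended upper bound

# Value background of T1: (rule, value, start, end), intervals inclusive.
PROMOTES = [
    ("A=>O", "V1", date(1900, 1, 1), date(1980, 1, 1)),
    ("B=>notO", "V2", date(1900, 1, 1), NOW),
]
VALUE_RANK = {"V1": 2, "V2": 1}  # V1 > V2; a higher rank means preferred
RULES = {"A=>O": ("A", "O"), "B=>notO": ("B", "notO")}  # rule -> (factor, outcome)


def promoted_at(rule, t):
    """Set of values the rule promotes at time t (axiom TA)."""
    return {v for r, v, start, end in PROMOTES if r == rule and start <= t <= end}


def strength(rule, t):
    """Rank of the best value promoted at t; 0 for the empty set of values."""
    return max((VALUE_RANK[v] for v in promoted_at(rule, t)), default=0)


def implied_outcome(factors, t):
    """Outcome of the applicable rule that prevails at t, or None if undecided."""
    applicable = [r for r, (factor, _) in RULES.items() if factor in factors]
    if not applicable:
        return None
    best = max(strength(r, t) for r in applicable)
    winners = {RULES[r][1] for r in applicable if strength(r, t) == best}
    return winners.pop() if len(winners) == 1 else None


def explains(case):
    """Time-sensitive explanation: the theory implies the outcome at the case's time."""
    factors, outcome, t = case
    return implied_outcome(factors, t) == outcome


C_OLD = ({"A", "B"}, "O", date(1950, 1, 1))
C_NEW = ({"A", "B"}, "notO", date(2000, 1, 1))
print(explains(C_OLD), explains(C_NEW))
```

In this sketch both cases are explained because A ⇒ O stops promoting V1 after 1.1.1980, so at 1.1.2000 only B ⇒ ¬O is backed by a value.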
12. CONCLUSION
In this paper we have viewed case-based reasoning as a theory-construction exercise governed by the idea of
coherence. Although the results here presented are very preliminary, I hope that the reader may agree that
our approach can make some sense, at least when applied to the benchmark problem of the combination of
cases, factors and values, originally proposed by Berman & Hafner (1993).
Let us conclude our contribution by pointing to possible developments.
Firstly, one could expand the background knowledge available to the parties, for example, with
information concerning the statements of the judges and the context of their utterance. This would lead to a
further theory-construction profile: the need to make sense of the “history” of the case, and in particular of
the judges’ opinions, in the circumstances where they were stated. So, a case theory, besides a rule-subtheory and a value-subtheory, might also include a “history-subtheory”, to be constructed using the
available data about the case, including, in particular, the expressed opinion of the judges. Such a history-subtheory may support the conclusion that the judges “meant” to decide the case according to certain rules
or preferences. In particular, the fact that the judges explicitly stated a certain principle and gave it a
particular role, in the argumentative structure of their opinion, can lead to the conclusion that they viewed
this principle as the decisive ratio of their judgement in the case. This history-subtheory would provide
coherent support to the rule-subtheories that use the principle in explaining the case. Correspondingly, on
the basis of the history-subtheory, some other rule-subtheories may be excluded, as being incoherent with
the case history. A different result, however, may also be possible under appropriate circumstances: it may
be argued that a certain explanation of a precedent, based upon the current value-subtheory, makes more sense
than the explanation based upon the expressed opinion of the judges, and that the latter should consequently
be dismissed (this phenomenon was described by Smith and Deedman 1987, who provide real world examples).
Secondly, one can develop the idea of the circularity of justificatory links. Here we have assumed static
background knowledge, and therefore a one-way theory-construction process, which goes from the
background knowledge to the theory of the cases. However one can consider that the background knowledge
itself (or at least some parts of it) needs to be constructed according to the theory of the cases. According to
this approach, the coherence test will not concern the case theory only, but rather the whole combination of
case theory + background knowledge, seen in their interdependence. This combination would then compete
against alternative similar combinations.
Thirdly, a more sophisticated account could be provided of the ordering between values (e.g., ways of
determining preferences between single values, and between certain quantities of them), of the relations
among values (e.g., specifying how the satisfaction of one value can contribute to, or detract from, the
satisfaction of others) and of the connection between rules and values (e.g., assign a strength to this
connection according to a probabilistic metric). Such an account could in particular profit from the
contribution of decision theory, which has traditionally investigated ends-means connections.
Fourthly, the perfect symmetry we have here assumed in the position of the parties can be substituted
with criteria for allocating the burden of proof (or, more generally, the burden of argumentation). For
example, an obvious adaptation would consist in assuming that while the plaintiff must provide a theory
justifying the outcome he wants for CS, the defendant only needs to provide a theory that does not justify
the plaintiff’s outcome. In this approach, the defendant would satisfy her burden of proof just by providing a
theory that leaves the outcome in CS indeterminate. In this regard, theories of the burden of proof, as
developed by Prakken (2001), could provide useful models.
Finally, the relations between the various dimensions of coherence above considered (and further
aspects of it) should be explored, to see how the scores a theory achieves along those different dimensions
can be combined into an overall mark (on computing coherence, cf. Thagard 1992). In this connection, one
may inquire when some coherence requirements may be waived or limited. For example, all theories here
considered were assumed to cover all cases (to score the maximum under the criterion of case-coverage), but
it would be more realistic to assume that some cases could be explained away as being deviant or simply
wrong. This would need to be linked to a view of the development of case-based law that uses notions such
as those of express and implied overruling.
13. BIBLIOGRAPHY
Alexy, R., & A. Peczenik. 1990. The Concept of Coherence and Its Significance for Discursive Rationality.
Ratio Juris 3: 130-147.
Ashley, K.D. 1990. Modeling Legal Argument: Reasoning with Cases and Hypotheticals. Cambridge
(Massachusetts): MIT.
Bench-Capon, T.J.M. 1999. Some Observations on Modelling Case Based Reasoning with Formal Argument
Models. In Proceedings of the Seventh International Conference on AI and Law, 36-42. New York: ACM
Press.
Bench-Capon, T.J.M. 2000. The Missing Link Revisited: The Role of Teleology in Representing Legal
Argument. In this Special Issue.
Bench-Capon, T.J.M., & G. Sartor. 2001a. Using Values and Theories To Resolve Disagreement in Law. In
Proceedings of the Thirteenth Annual Conference on Legal Knowledge and Information Systems
JURIX 2000. Ed. J. Breuker, R. Leenes and R. Winkels, 73-84. IOS Press: Amsterdam.
Bench-Capon, T.J.M., & G. Sartor. 2001b. Theory Based Explanation of Case Law Domains. In
Proceedings of the Eighth International Conference on Artificial Intelligence and Law, 12-21. ACM:
New York.
Berman, D.H., & C.D. Hafner. 1993. Representing Teleological Structure in Case Based Reasoning: The
Missing Link. In Proceedings of the Fourth International Conference on AI and Law, 50-59. New
York: ACM Press.
Gordon, T.F. 1995. The Pleadings Game. An Artificial Intelligence Model of Procedural Justice. Dordrecht:
Kluwer.
Hempel, C.G. 1966. Philosophy of Natural Science. Englewood Cliffs (NJ): Prentice-Hall.
McCarty, L.T. 1997. Some arguments about legal arguments. In Proceedings of the Sixth International
Conference on Artificial Intelligence and Law, 215-224. New York: ACM Press.
Prakken, H. 2000. An Exercise in Formalising Teleological Case Based Reasoning. In J. Breuker, R. Leenes
and R. Winkels (eds), Legal Knowledge and Information Systems: Jurix 2000, 49-57. Amsterdam: IOS
Press.
Prakken, H. 2001. Modelling Reasoning about Evidence in Legal Procedure. In Proceedings of the Eighth
International Conference on Artificial Intelligence and Law, 119-128. New York: ACM Press.
Prakken, H. & G. Sartor. 1997. Rules about Rules: Assessing Conflicting Arguments in Legal Reasoning.
Artificial Intelligence and Law 4: 331-368.
Prakken, H. & G. Sartor. 1998. Modelling Reasoning with Precedents in a Formal Dialogue Game. Artificial
Intelligence and Law 6: 231-287.
Thagard, P. 1992. Conceptual Revolutions. Princeton (NJ): Princeton University Press.
Thagard, P. 2001. Coherence in Thought and Action. Cambridge (MA): MIT Press.