DRAFT – do not cite
INCORPORATION OF DECISION SUPPORT
SYSTEMS IN JUDICIAL DECISION-MAKING
– SOME PRELIMINARY QUESTIONS
Keren Yalin-Mor
INTRODUCTION
Decision support systems (DSSs) for judges are intended to guide and assist the judges sitting in court in performing a better decision-making process, while leaving them the final decision-making authority. These are computerized systems, designed especially for a specific judicial question and activated by the judges (or by a person acting on their behalf). They receive as input the relevant facts of the case and deliver an output, which may be the judicial outcome as a whole or the suggested answer to a specific question needed in order to reach a decision.
There are three main forms of architectures for DSSs.1 The first is RULE-BASED SYSTEMS, which translate a certain field into a tree of "if-then" rules. When the system receives an input it follows the rules and produces an output. The second is CASE-BASED SYSTEMS, which include a database of previous cases in the field. Such a system receives a case, finds similar case(s), and calculates the outcome based on the similarities and the differences between the input and the retrieved case(s). The third is MACHINE-LEARNING SYSTEMS, which have the ability to learn and improve from case to case. The most common learning system is comprised of artificial neural networks, which try to resemble the nervous system of the brain. By using many self-adjusting processing elements, such a system exploits a large dataset to "learn"2 and is then able to apply the weights assigned to each relevant factor to a new case and achieve an outcome.3
1 This observation is based upon Ruth Kannai, Uri Schild & John Zeleznikow, Modeling the Evolution of Legal Discretion – An Artificial Intelligence Approach, 20 RATIO JURIS 530 (2007), mainly pp. 536-39. Of course, different problems are better modeled by different techniques.
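To make the first of these architectures more concrete, the following simplified sketch (in Python) shows how a narrow legal question could be encoded as a tree of "if-then" rules. The rules, conditions and field names are invented for illustration only and do not correspond to any existing system.

```python
# A minimal, hypothetical sketch of a rule-based DSS. The legal "knowledge"
# is encoded as nested if-then rules; all fields and rules below are invented
# for illustration and do not correspond to any real system.

def rule_based_dss(case_facts: dict) -> str:
    """Walk a small tree of if-then rules and return a suggested outcome."""
    if not case_facts["claim_within_limitation_period"]:
        return "suggest: dismiss claim (limitation period expired)"
    if not case_facts["written_contract_exists"]:
        return "suggest: further fact-finding needed (no written contract)"
    if case_facts["breach_proven"]:
        return "suggest: award damages"
    return "suggest: dismiss claim (breach not proven)"


if __name__ == "__main__":
    facts = {
        "claim_within_limitation_period": True,
        "written_contract_exists": True,
        "breach_proven": False,
    }
    print(rule_based_dss(facts))  # suggest: dismiss claim (breach not proven)
```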
For reasons of clarity I will now give one brief example of a DSS designed in the field of Australian family law. The SPLIT UP system combines a rule-based component with artificial neural networks and is intended to assist in the division of property in divorce trials.4 The system applies a set of rules in order to determine the marital assets pool, and uses a series of small artificial neural networks in order to determine the percentage of the common pool each party should receive. The task of dividing the assets according to the determined percentage falls to the human judge. The prototype and various aspects of Split Up were presented in a series of papers, and its incorporation in the judicial system was discussed, but ultimately the system was transformed into an alternative dispute resolution (ADR) system and was later incorporated into more advanced versions of ADR systems.
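The following sketch illustrates, in a deliberately simplified way, the kind of hybrid structure just described: a rule component that determines the common pool and a tiny "neural" component that suggests a percentage, with the actual division left to the judge. It is not the real Split Up system; the asset categories, weights and figures are invented.

```python
import math

# A simplified, hypothetical sketch of a rules-plus-neural hybrid, not the
# actual Split Up system: a rule component determines the common asset pool,
# a tiny "neural" component (a single sigmoid unit with invented weights)
# suggests the percentage for one party, and the human judge keeps the task
# of actually dividing the assets.

def common_pool(assets: list) -> float:
    """Rule component: sum only the assets the (invented) rules treat as marital."""
    return sum(a["value"] for a in assets if a["acquired_during_marriage"])

def suggested_share(contribution: float, future_needs: float) -> float:
    """'Neural' component: one sigmoid unit; weights are purely illustrative."""
    w_contribution, w_needs, bias = 2.0, 1.5, -1.75
    z = w_contribution * contribution + w_needs * future_needs + bias
    return 100.0 / (1.0 + math.exp(-z))  # percentage of the pool for party A

assets = [
    {"value": 300_000, "acquired_during_marriage": True},
    {"value": 50_000, "acquired_during_marriage": False},
]
pool = common_pool(assets)
share = suggested_share(contribution=0.6, future_needs=0.7)
print(f"common pool: {pool}; suggested share for party A: {share:.1f}% "
      "(the judge performs the actual division)")
```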
While in the research I discuss a few issues regarding the incorporation of DSSs in judicial decision making, such as their influence on the argumentation of decisions, in this short paper I focus on three preliminary issues relevant to the research. First, I explain why I refrain from discussing the general question whether the incorporation of DSSs in adjudication has positive outcomes, and "jump" to specific questions regarding such incorporation. Second, I explain different possible goals of DSSs – functioning as an ideal judge, as an average judge, or as complementary to human judges – and explain which goal is, in my view, the correct one.
2 The facts of each case are given as an input to the system, which suggests an outcome based on the knowledge it gained, and is then corrected to the right outcome.
3 An important characteristic of artificial neural networks is that they cannot supply justification for the outcome; see Dan Hunter, Out of Their Minds: Legal Theory and Neural Networks, 7 ARTIFICIAL INTELLIGENCE & L. 129, 143 (1999).
4 Andrew Stranieri, John Zeleznikow, Mark Gawler & Bryn Lewis, A Hybrid Rule – Neural Approach for the Automation of Legal Reasoning in the Discretionary Domain of Family Law in Australia, 7 ARTIFICIAL INTELLIGENCE & L. 153 (1999).
Third, I discuss several schemes for evaluating legal DSSs with regard to the question of how to determine whether a DSS reaches correct outcomes.
GENERAL QUESTION OR SPECIFIC QUESTION
The issue of incorporation of decision support systems in the judiciary
generally involves two types of questions. The first is the general question, whether
decision support systems should be incorporated in adjudication. To answer this
question we should define what benefits DSSs offer for judicial decision making, what their drawbacks are, and whether – in the overall balance – this is a positive thing, taking social values into consideration.
The second type includes various pragmatic questions that need to be discussed in order for such incorporation to be conducted in the best way. By this I refer to questions such as which types of DSSs are appropriate, what effect DSSs will have on argumentation, how one can appeal against a decision reached by using a DSS, under what conditions a judge is required (or allowed) to deviate from the system's output, and more.
On the face of it, in order to reach the second type of questions we first
have to overcome the barrier of the first question – to decide that DSSs should be
incorporated in the judiciary, since the benefits deriving from this step outweigh the
drawbacks. After this decision, if taken, we can move to the second step, where all the
pragmatic questions arise and need to be answered. However, in the research I focus
on the pragmatic questions and skip the general question.
In a nutshell, I have four main reasons for this choice: the answer to the
general question might depend on the details of such incorporation; the legal
community would benefit from preparing in advance for such a possibility instead of finding itself reacting to technological development; I argue that not all of the effects of such incorporation can be foreseen in advance, and going into details can assist in starting the process; and I believe the discussion by itself contributes to the study of adjudication, on the one hand, and of decision support systems, on the other. In this short chapter I explain these reasons more broadly.
I. IN ORDER TO DECIDE, WE SHOULD UNDERSTAND THE DETAILS
The benefits and drawbacks of using DSSs in adjudication can be portrayed using a general idea of what DSSs are and how the different types of DSSs work. On the one hand, it can be argued that DSSs will enhance efficiency, uniformity and equality in judicial decision making. On the other hand, DSSs threaten to breach litigants' right to a fair trial and their right to dignity, to impair the dispensing of justice, and to affect
the development of legal rules and norms. However, I opine that these general
arguments cannot give a complete answer to the question whether or not to
incorporate DSSs in adjudication, since the answer to this question should depend on
the details of such incorporation.
Of course, there can be a theoretical stance rejecting the involvement of computerized systems in judicial decision making.5 This approach, which leaves sole discretion to the human judge, will not be interested in the details of DSSs, since according to this view every interference with judicial discretion is undesirable. However, the judicial system already includes some sorts of computerized interference influencing decisions. Computerized legal databases are in common use by the entire legal community – including judges – and this fact is accepted without controversy.6
5 See, e.g., Joseph S. Fulda, Implications of a Logical Paradox for Computer-Dispensed Justice, 8 ARTIFICIAL INTELLIGENCE & SOC'Y 357 (1994) (arguing that society cannot permit decisions made by "soulless machines" which are not perfect).
Moreover, sophisticated computer programs have been used for a few decades to assess damages, without much opposition.7 Therefore, I argue that most legal scholars do not completely object to the incorporation of sophisticated computerized programs in adjudication. Since there is no fundamental difference between these types of systems and DSSs,8 I believe that the common stance cannot object to the mere idea of DSSs being incorporated in adjudication, but only to specific characteristics of DSSs.
Consequently, going into details is needed in order to decide whether or not to incorporate DSSs in adjudication and under which conditions. There are two main reasons for this argument. First, understanding the details of the incorporation of DSSs sheds light on the same questions and topics which arise when discussing the general question. For example, one argument against using DSSs is that automated systems breach the right to dignity.9 When discussing the general question, the discussion is limited to understanding that such a breach is indeed likely to happen under the current conception of dignity, and to weighing it, as well as other drawbacks of DSSs, against the advantages.10 In contrast, going into details fills the question of dignity with content, and it can then be discussed whether different types of DSSs pose less danger and whether their incorporation can therefore be considered.
6 On the effects of legal databases see SUSSKIND, THE END OF LAWYERS? (paperback ed. 2010); Samuel E. Trosow, The Database and the Fields of Law: Are There New Divisions of Labor?, 96 LAW LIBR. J. 63 (2004-05); Bryna Bogoch, Ruth Halperin-Kaddari & Eyal Katvan, HaPsakim Hasmu'im min HaAyin: Hashpa'atam shel HaMa'agarim HaMemuchshavim al Yetzirat Guf HaYeda HaMishpati BeDinei HaMishpacha BeIsrael [Exposing Family Secrets: The Implications of Computerized Databases for the Creation of Knowledge in Family Law in Israel], 34 IYUNEY MISHPAT [TEL-AVIV U. L. REV.] 603 (2011) (Isr.).
7 Jack W. Fleming, A Guide to the Use of Computers to Estimate Damages in Complex Litigation, 2 COMPUTER/L.J. 863 (1980). The lack of literature on this matter – much like on the use of legal databases – implies the naturalness with which these phenomena are accepted, while I believe their implications for the legal field are broader than usually considered.
8 This argument might be in dispute.
9 On the relations between automated systems and the right to dignity see Sydney Archibald, Rethinking Reputation in an Automated Age, available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2255046. See also article 15 of Directive 95/46/EC on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data.
10 On the other hand, I opine that DSSs can protect the right to dignity, since they can eliminate judicial biases, and since an efficient judicial decision-making process – leading to timely decisions – preserves the dignity of the litigants.
By understanding the implications of DSSs, a more rational decision regarding their incorporation can be made. Of course, the final conclusion might be that DSSs should not be incorporated in the judiciary (for example, because they breach the right to dignity in a disproportionate manner), but this conclusion will be more rational.
Second, going into details will not only help to reveal the specific implications of DSSs and help the decision-makers reach a decision regarding the incorporation of such systems in the judiciary, but can also help them state the necessary conditions for such incorporation. It can be argued that systems satisfying certain conditions are
appropriate for use, while other DSSs are not. If we follow the example regarding
breach of the right to dignity, the discussion can lead to an understanding of what can
be done to minimize that fear – for instance, compelling the judge to employ her
discretion and consider the computerized output, or protecting the right to appeal
against a decision made while using a DSS.
Another example regards the question of argumentation when using DSSs. One might claim that since argumentation is an inherent part of judicial decision making, artificial neural networks – which cannot supply an explanation for their outcome – cannot be used in judicial decision making, whereas rule-based systems (which follow other criteria) are appropriate for use.11 Such a conclusion could not have been reached without understanding the specific characteristics of different DSSs.
11 The conditions can be even more specific and might vary between different fields of law and the different questions discussed in each field. It might be argued, for instance, that while in private law more weight is given to efficiency and certainty, in criminal law the rights of the litigants have special significance. Therefore, in criminal law more emphasis will be placed on the discretion of the judge and on the right to appeal against the decision, while in private law more discretion might be given to the system.
II. IT IS BETTER TO PREPARE FOR FUTURE DEVELOPMENTS
Whether or not the general discussion can assist the decision makers in
reaching a decision regarding the usefulness of incorporation of DSSs in the judiciary,
I opine that the legal community would benefit from preparing for such possible
incorporation in advance, in case at some point a decision regarding the incorporation of DSSs is made. If the legal community does not prepare well and thoroughly examine the expected consequences of incorporating DSSs, it might find itself having to react to the technology, and its ability to affect the details of the incorporation might be narrower.
It can be argued that the details of such future incorporation are unknown to us at this point in time, and that there might be significant technological changes before such incorporation occurs. Therefore, the argument goes, current discussions are useless. I believe this argument is wrong, and that it is better to start discussing these ideas with the tools we have now than to wait until reality changes and the law is left behind.12
III. NOT ALL IMPLICATIONS CAN BE EXPECTED IN ADVANCE
According to the law of unintended consequences, the incorporation of DSSs in adjudication will cause chain reactions whose consequences are not foreseeable.13 I argue that a theoretical discussion cannot cover all the actual effects of this process and that, on many topics, it will not be based on evidence but on mere assumptions. For example, one could argue that the usage of DSSs might increase public trust in the judicial system, since decisions will be more transparent and more uniform.
12 See also SUSSKIND, THE END OF LAWYERS?, 60 (paperback ed. 2010).
13 On the law of unintended consequences see Rob Norton, Unintended Consequences, in THE CONCISE ENCYCLOPEDIA OF ECONOMICS, available at http://www.econlib.org/library/Enc/UnintendedConsequences.html.
On the other hand, public trust might decrease, since the public prefers human decisions over computerized ones. These are two opposing arguments, and it is hard – and maybe impossible – to determine what the actual effects of DSSs will be.
A documented example of the unexpected consequences of incorporating computerized systems can be seen in a case from the Netherlands, where a computerized system was used by the social security services. A study evaluating the system reported that many of the errors that occurred after the system was introduced were not a result of the system itself, but of the clerks' over-reliance on it, even when they were required to take part in the decision-making process.14 Of course, this outcome – which affects the persons involved, and possibly public trust – was not expected when the system was designed and incorporated.
One of the factors which make it harder to foresee the implications of such incorporation is the observation that people act, in many cases, "in the shadow of the law",15 so that changing the law has an indirect effect on their actions, even if those actions never reach court. However, since these implications are indirect, they are harder to anticipate. One prominent example is the issue of certainty of the legal outcome. According to Mnookin & Kornhauser, if the legal state of affairs is clear, parties tend to end conflicts in an agreement and refrain from leaving them for judicial decision.16 Therefore, a possible outcome of using DSSs might be more certainty – since computers are deterministic – and a reduction in the number of legal proceedings. However, this is just an assumption, and even if it is realized, its scope is also uncertain.
14 Marga M. Groothius & Jorgen S. Svensson, Expert System Support and Juridical Quality, in LEGAL KNOWLEDGE AND INFORMATION SYSTEMS JURIX 2000 1, 8 (Joost Breuker, Ronald Leenes & Radboud Winkels eds., 2000).
15 This expression was first introduced in Robert H. Mnookin & Lewis Kornhauser, Bargaining in the Shadow of the Law: The Case of Divorce, 88 YALE L.J. 950 (1978-1979).
16 Id. at 977-80.
The conclusion is that in order to understand the implications of the incorporation of DSSs in adjudication, and to ensure that the incorporation achieves optimal outcomes, such incorporation should be made in measured steps, and at each step all the effects should be evaluated and considered.17 Such a process should start by addressing the issues regarding the incorporation that arise now.
IV. CONTRIBUTION OF THE DISCUSSION
Discussion regarding the incorporation of DSSs in adjudication can lead to significant insights even if, at the end of the day, DSSs are not incorporated, since the discussion itself offers an interesting and revealing prism on the advantages and drawbacks of adjudication as it is conducted today, and of human judges. I believe that going into details and referring to specific technology can shed more light on the role of judges today than a purely theoretical discussion can.
Until now, human judges have been considered the primary authority for legal dispute resolution. Since the alternatives were limited, the discussion regarding the bounds of human judges was limited as well.18 The option of computerized decision-making in legal matters opens new possibilities for discussion about the goals of litigation and judicial decision-making, even if only as a thought exercise.
The novelty of the research also flows from its focus on lower instances and "easy cases", which have not been thoroughly researched until now. Most of the literature regarding adjudication focuses – even if not explicitly – on the higher instances (mainly on supreme courts) and on hard cases, and not on lower instances and easy cases.19
17 On smaller scales, Agile software development might be used to achieve this goal; see Agile Manifesto, http://agilemanifesto.org/.
18 A prominent alternative to judicial decision making was introduced with the rise of alternative dispute resolution. A radical alternative to the customary decision-making process is random decisions, such as coin flips; see Gideon Keren & Charles H. Teigen, Decisions by Coin Toss: Inappropriate but Fair, 5 JUDGMENT AND DECISION MAKING 83 (2010). In a U.S. case where a judge decided a case by a coin toss, he was later removed (http://www.abajournal.com/news/article/judge_removed_for_thigh_inspection_coin_flip/). See also http://www.enquirer.com/editions/2000/04/26/loc_jury_flips_coin_to.html.
Yet much of the judicial activity occurs in the lower instances and with regard to easy cases.20 The research will discuss the roles of judges in lower instances, their goals in the adjudication of easy cases, how they fulfill these goals in reality, and whether the expectations of them are realistic, and will thereby contribute to filling this gap. Again, I argue that the discussion will be more enriching when it addresses the details of possible DSSs, as opposed to a hypothetical discussion.
The contribution of the discussion is not limited to the judicial field; it can also be relevant to the research of DSSs in other discretionary fields, since we can learn about the differences between the human mind and machines.
ARE SYSTEMS TRYING TO BE THE IDEAL JUDGE? TO IMITATE JUDGES? SOMETHING ELSE?
When designing a DSS, the first thing to be done is to decide its goal. The
ultimate goal of the DSS is to improve judicial decision making, but there are several
ways to achieve this goal, and each system should choose one way (or maybe some
ways are better or more realistic than the others).
The first way is to claim that a well-suited DSS will perform as an "ideal judge" would; the second is that a good DSS imitates human judges; and the third is that DSSs do not try to do what judges are thought to do, but something else, and in this way correct flaws in adjudication.
19 Anna Ronkainen, From Spelling Checkers to Robot Judges? Some Implication of Normativity in Language Technology and AI & Law, in PROCEEDINGS OF ICAIL 2011 WORKSHOP APPLYING HUMAN LANGUAGE TECHNOLOGY TO LAW 48, 49-51 (Karl Branting & Adam Wyner, eds., 2011), available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1879426. Lately there have been some new studies going in that direction; see, e.g., Aaron-Andrew P. Bruehl, Hierarchy and Heterogeneity: How to Read a Statute in a Lower Court, 97 CORNELL L. REV. (forthcoming, 2012), available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1925396 (arguing that the lower courts should interpret laws differently than the Supreme Court).
20 The distinction between hard cases and easy cases is not an easy one; see Owen Fiss, Against Settlements, 93 YALE L. J. 1073, 1087 (1984).
The proposed distinction has two motivations. The first is to describe what the developers of DSSs have in mind when they design the systems, what goal they set out to achieve, and what the systems they design achieve in action. The second is to later rethink the goals of such systems with regard to the boundaries of human and computerized discretion.
I will start by explaining these three options.
IDEAL JUDGE – one might argue that the goal of DSSs is to act as an ideal judge and to reach the best decision for each case. This of course raises the questions of what characteristics are required of the ideal judge and what the process for reaching the best decision for a case is. Only by understanding this can the developers of a DSS try to design it in a way that imitates the ideal judge.
Different scholars give different answers to the question of who the ideal judge is, and different answers can be inferred from different jurisprudential theories.
One famous example is Dworkin's ideal judge Hercules.21 According to Dworkin, Hercules aspires for his decisions to be part of a coherent theory, beginning with the cases immediately in point and, if needed, broadening the scope of cases until he reaches a determinate answer. This description of Hercules, the ideal judge, can serve as a guideline for designing a DSS. Such a DSS should be case-based and rely only on cases which are relatively similar to the case in question.22
Hercules is just one example of the concept of the ideal judge.
21 See RONALD DWORKIN, LAW'S EMPIRE (1986). This is a very simplistic description of Dworkin's theory.
22 The developers of the Split Up system had Dworkin's idea of the ideal judge in mind when designing the system, although the system is based on neural networks and is not case-based as I suggested. The system was designed to achieve coherence and – for example – assumes that if two judges reach vastly different outcomes in similar cases, then one of them was mistaken. For a direct reference to Dworkin's concept of the ideal judge see Stranieri, Zeleznikow, Gawler & Lewis, supra note 4, at 165-66.
Other legal scholars give different answers to this question, and these answers determine the design of a system which tries to achieve the goal of deciding like the ideal judge would.
Until this point I have referred to the DSS as trying to imitate the ideal judge. Another option is to argue that the combination of a DSS and a judge operating it – one who is able to apply her own discretion and deviate from the DSS's outcome – will lead to results similar to those of an ideal judge. If we follow the example of Dworkin's Hercules, a DSS can be programmed to find the coherent decision within the near circle of cases, and if there is none, the system can direct the judge to broaden the search, or the judge can herself deviate from the system's outcome to achieve greater coherence. In this manner, the combination of the human judge and the computer can achieve ideal results.
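To illustrate, a minimal sketch of such a Hercules-inspired, case-based design might look as follows: the system retrieves the most similar past cases and suggests an outcome only if those cases are coherent; otherwise it refers the question back to the judge (or to a broader search). The cases, facts and similarity measure are invented.

```python
# A hypothetical sketch of a coherence-seeking, case-based DSS. It is not a
# real system: the cases, facts and toy similarity measure are invented.

def similarity(facts_a: dict, facts_b: dict) -> float:
    """Toy similarity: the fraction of shared facts that have equal values."""
    keys = facts_a.keys() & facts_b.keys()
    return sum(facts_a[k] == facts_b[k] for k in keys) / len(keys)

def suggest(case_facts: dict, past_cases: list, k: int = 2) -> str:
    """Suggest an outcome from the k most similar past cases, if coherent."""
    nearest = sorted(past_cases,
                     key=lambda c: similarity(case_facts, c["facts"]),
                     reverse=True)[:k]
    outcomes = {c["outcome"] for c in nearest}
    if len(outcomes) == 1:  # the near circle of cases points one way
        return f"suggest: {outcomes.pop()}"
    return "no coherent answer among the nearest cases: broaden the search or apply judicial discretion"

past_cases = [
    {"facts": {"minor_children": True, "long_marriage": True}, "outcome": "outcome A"},
    {"facts": {"minor_children": True, "long_marriage": False}, "outcome": "outcome A"},
    {"facts": {"minor_children": False, "long_marriage": False}, "outcome": "outcome B"},
]
print(suggest({"minor_children": True, "long_marriage": True}, past_cases))
# prints "suggest: outcome A" - the two most similar past cases agree
```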
It can be asked whether computerized systems can achieve results close to those of an ideal human judge. The answer to this question depends on one's perception of the ideal judge. If one thinks the ideal judge should approach each case from scratch and apply full discretion, then computerized systems cannot imitate this approach.
IMITATING HUMAN JUDGES – another way to understand the goal of DSSs is that they are intended to imitate the decision-making process of human judges. Under this understanding, a decision reached when using a DSS is similar to the decision reached by an average judge (or an average of judges).
In general, statistical systems, as well as some of the case-based and learning systems, belong to this group. A statistical system's output is derived from statistics of similar previous cases. If the database on which the statistics are based is large enough, then the outcome reflects an average of these cases.
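A minimal sketch of this "average of similar previous cases" logic, with an invented database and fields, might look as follows:

```python
from statistics import mean

# A minimal sketch of the "average judge" idea: the suggestion is simply the
# average outcome of sufficiently similar past cases. The database, its
# fields and the similarity criterion are all hypothetical.

past_cases = [
    {"injury_severity": 3, "award": 40_000},
    {"injury_severity": 3, "award": 55_000},
    {"injury_severity": 5, "award": 120_000},
]

def average_suggestion(injury_severity: int) -> float:
    """Average the awards of past cases with the same (toy) severity level."""
    similar = [c["award"] for c in past_cases if c["injury_severity"] == injury_severity]
    return mean(similar) if similar else float("nan")

print(average_suggestion(3))  # 47500: the average of the two similar past cases
```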
One might wonder what the advantages of using such a system are, since it
12
DRAFT – do not cite
does not produce good decisions but average ones. I suggest several answers. One benefit of using this type of DSS is that although it will not produce excellent results, it will prevent bad outcomes.23 This can be considered a positive outcome in itself.24 Since DSSs produce similar results in similar cases, no matter who operates them, the use of DSSs will lead to more uniformity in judicial decisions – which is another positive outcome.25 An additional benefit derives from the very use of computerized systems, which can lead to a more efficient decision-making process and thus save judicial resources and ensure that decisions are given in a timely manner.
SOMETHING ELSE – the third way to state the goal of DSSs is more radical. Under this understanding, DSSs do not try to do what judges do and to reach decisions similar to those of a (good or average) judge; rather, the goal of a DSS is to operate the way computers operate.26 Those who favor this approach argue that adjudication as it is conducted today has significant weaknesses which computers do not share, and therefore using computerized systems will improve judicial decision making. Computers, in this context, have two qualities. The first is that they do not suffer from the biases and irrational behavior affecting human beings. The second is that the computing power of modern computers can accomplish complex tasks, such as large-scale searches, that human beings cannot complete in reasonable time.
23 An important question is whether this kind of system maintains existing judicial biases or corrects them. If the operation of the DSS is based on past cases, it may duplicate results influenced by judicial biases.
24 This claim is of course questionable, since it can be argued that it is better to have excellent decisions even at the price of bad decisions.
25 It is a convention that uniformity is one of the goals of the judicial system and that similar cases should be treated equally. However, uniformity has its drawbacks, since it is not always easy to decide which cases are similar and when there is a relevant difference between cases. Furthermore, law is an evolving field, where much of the development occurs as a result of new excellent decisions changing existing rules, or as a reaction to bad decisions. Eliminating excellent and bad decisions, while leaving only average decisions, will surely make these developments more difficult.
26 This approach relates to another one, which can be called "formalizing law". According to this approach, statutes should be designed in a formalized and deterministic way. See, e.g., L. Wolfgang Bibel, AI and the Conquest of Complexity in Law, 12 ARTIFICIAL INTELLIGENCE & L. 159, 166-75 (2004).
Either one of these qualities, and their combination, offers an advantage to a judge using a DSS.
An example of a DSS which does not intend to resemble human judges is the system developed by Schild and Kannai for evaluating an offender's previous criminal record in sentencing.27 The authors admit that "there is no uniform approach to the problem of how to relate to an offender's past record, and how to evaluate it" even within each country.28 Their solution is to state a list of rules in the form of formulas, which are applied to the offender's previous record and combined to produce an outcome telling the judge what influence the previous record should have on the sentence. Schild and Kannai do not believe that a judge (ideal or average) acts the way they suggest, but rather propose another approach which they claim has its own benefits.29 Their system enjoys both qualities of computers: it is rational and does not suffer from the biases affecting sentencing, and it calculates complex formulas quickly and automatically.30
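The following sketch conveys the spirit of such a formula-based approach without reproducing Schild and Kannai's actual formulas; the decay factor, severity scores and thresholds are invented purely for illustration.

```python
# A hypothetical sketch of a formula-based evaluation of an offender's
# previous record. All numbers below (decay factor, severity scale,
# thresholds) are invented and do not reflect Schild and Kannai's system.

def record_score(priors: list) -> float:
    """Each prior conviction contributes a weight that grows with severity
    and decays with the number of years since the conviction."""
    return sum(p["severity"] * (0.8 ** p["years_ago"]) for p in priors)

def record_recommendation(priors: list) -> str:
    """Map the combined score to a recommendation about the record's influence."""
    score = record_score(priors)
    if score < 1.0:
        return "previous record: negligible influence on the sentence"
    if score < 3.0:
        return "previous record: moderate aggravating influence"
    return "previous record: substantial aggravating influence"

priors = [{"severity": 2, "years_ago": 1}, {"severity": 1, "years_ago": 6}]
print(record_recommendation(priors))  # score = 2*0.8 + 0.8**6 ≈ 1.86 -> moderate
```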
However, the promise of this approach is also its threat – giving up some of the qualities of human judges. If we follow the example of Schild and Kannai's system, judges have their own way of examining past records and taking into account special cases.31 One of the challenges of this approach is to gain the most out of the combination – to maintain many of the qualities of human judges while correcting some of the fallacies of the human mind with computing power.
As I explained at the beginning of this chapter, the proposed distinction has two goals – one descriptive and one normative.
27 Uri J. Schild & Ruth Kannai, Intelligent Computer Evaluation of Offender's Previous Record, 13 ARTIFICIAL INTELLIGENCE & L. 373, 374 (2005).
28 Id. at 382.
29 Id. at 374.
30 Id. at 395-96.
31 One could also claim that the system is unnecessary, since evaluating past records is a routine task for judges, and they do it quickly. On the other hand, the diversity in sentencing, and the fact that there is no uniform method for such evaluation, suggest that there is a price for this routine action.
The observation is descriptive in two ways: it tells us what the designers of the systems intend to do (or say they intend to do), but it also tells us what their systems do in reality. Many times it is a matter of definition, since what one person defines as an ideal judge will be seen by another as a combination of a human judge and a computer. In the example given earlier I argued that Schild and Kannai's system tries to do something other than what a human judge does, but it can also be argued that it tries to achieve what the ideal judge wants to achieve – an objective and rational way of evaluating the past record of an offender. The answer changes according to the theory and point of view of the observer.
I will deal with this point more broadly in chapter __, but I suggest that the goal of DSSs should be to add something to human judges, and that it is better to admit that computers cannot do exactly what human judges do. Human judges, even the best ones, have their qualities, and – at least for now – computerized systems cannot imitate them. Human judges also have their incapacities, and this is where DSSs should enter.
There are several possible designs for DSSs, and some of them can take us back to the first two categories described above. For example, a statistical DSS can give an output equivalent to that of an average judge. But it is important to understand that, at the end of the day, DSSs are intended to correct some fallacies of the human mind, even if only by being more efficient and helping the judge reach a decision more quickly.
EVALUATING THE CORRECTNESS OF JUDICIAL DECISIONS
DSSs, like other computerized systems, need to be evaluated in order to verify that they fulfill their goals and benefit the legal system.32 Evaluation has a few components, and for now I focus on checking whether the system is built right, which is defined as validation or verification.33
The design phase has a major part in ensuring that the DSS will perform properly. However, even when the design is conducted well and the implementation follows the design, there is a need to verify that the outcomes produced by the DSS are satisfactory. There are several parameters which the outcomes should satisfy, and one of the most important is CORRECTNESS.34 Nevertheless, correctness in the legal field is a central yet complex concept. Legal scholars have long been arguing over whether there is one correct result for each case and how to find it. For the purpose of this discussion I define a correct outcome as an outcome which fits the legal situation.
The complexity of the concept of correctness might explain the fact that many of the papers presenting DSSs in the legal field do not include an evaluation part.35
32 It should be kept in mind that DSSs' main goal is to benefit the legal system as a whole and not their users – the judges (of course, it is a positive outcome if the judges are pleased with the systems, and this can contribute to their use, but their satisfaction is a byproduct).
33 See Maria Jean J. Hall & John Zeleznikow, Acknowledging Insufficiency in the Evaluation of Legal Knowledge-Based Systems: Strategies Towards a Broadbased Evaluation Model, in THE EIGHTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND LAW 147, 148 (2001); Andrew Stranieri & John Zeleznikow, The Evaluation of Legal Knowledge Based Systems, in PROCEEDINGS OF THE SEVENTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND LAW 18, 20 (1999). On evaluation of decision support systems see generally Cheul Rhee & H. Raghav Rao, Evaluation of Decision Support Systems, in 2 HANDBOOK ON DECISION SUPPORT SYSTEMS 313 (Frada Burstein & Clyde W. Holsapple eds., 2008); Richard S. Sojda, Empirical Evaluation of Decision Support Systems: Needs, Definitions, Potential Methods, and an Example Pertaining to Waterfowl Management, 22 ENVIRONMENTAL MODELING & SOFTWARE 269 (2007).
34 Rhee & Rao argue that "it would be paradoxical to evaluate whether a solution by a DSS is correct or incorrect", since for semi-structured problems – in which DSSs operate – correctness is not a parameter. See Rhee & Rao, supra note 33, at 313. However, in the legal context this argument cannot stand, due, in short, to reasons of trust in courts and the broad implications of courts' decisions. See also Dunsmuir v New Brunswick, [2008] 1 SCR 190 (Can.). A correct decision may also be referred to as a "proper decision" or a "satisfactory decision". See THE JUDICIAL APPLICATION OF THE LAW 150 (Zenon Bankowski ed., 1992).
Other papers do include an evaluation part, but there is no common method of evaluation.36 I will now suggest four possible evaluation methods.37
In the first two methods, the outcomes reached by the DSS are compared to the expected outcomes in the same cases. When conducting such a comparison, it should be kept in mind that many times the user of the DSS has a role in interpreting the facts of the case when entering them into the system, and therefore the outcomes of the evaluation will be influenced by the user.38 The difference between the two methods lies in how these expected outcomes are defined.
One option is to compare the DSS's outcomes to courts' decisions.39 This is a good option if one believes that courts' decisions reflect the law. On the other hand, if a reason to use DSSs is that human judges have fallacies, then comparing the performance of the DSS to human judges' decisions means that the DSS maintains these fallacies. There are also some pragmatic difficulties with this approach, since it depends on courts' decisions, and these do not always form a representative sample; are not always available to the public (when dealing with lower instances); and do not always point out all the parameters needed for the operation of the DSS.40
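A simplified sketch of this first evaluation method, assuming a small hypothetical set of past cases with known court outcomes, might simply compute the rate of agreement between the DSS's outputs and the courts' decisions:

```python
# A simplified sketch of the first evaluation method: the DSS's outputs are
# compared with the decisions actually given by courts in the same cases, and
# the rate of agreement is reported. The cases, outcomes and the dss_predict
# function are hypothetical placeholders, not a real system or dataset.

def dss_predict(facts: dict) -> str:
    """Stand-in for the DSS being evaluated (an invented toy rule)."""
    return "for plaintiff" if facts.get("breach_proven") else "for defendant"

evaluation_set = [  # hypothetical past cases with their court outcomes
    {"facts": {"breach_proven": True}, "court_outcome": "for plaintiff"},
    {"facts": {"breach_proven": False}, "court_outcome": "for defendant"},
    {"facts": {"breach_proven": False}, "court_outcome": "for plaintiff"},
]

agreements = sum(dss_predict(c["facts"]) == c["court_outcome"] for c in evaluation_set)
print(f"agreement with court decisions: {agreements}/{len(evaluation_set)} "
      f"({100 * agreements / len(evaluation_set):.0f}%)")
```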
35 For empirical data see Hall & Zeleznikow, supra note 33, at 151-53. For examples of research which does not include an evaluation part see Maria Jean J. Hall, Domenico Calabrò, Tania Sourdin, Andrew Stranieri & John Zeleznikow, Supporting Discretionary Decision-Making with Information Technology: A Case Study in the Criminal Sentencing Jurisdiction, 2 U. OTTAWA L. & TECH. J. 1 (2005);
36 Surely, when there are concrete discussions on the incorporation of a DSS in the judiciary, it should be evaluated more thoroughly.
37 The same classification can be found in Stranieri & Zeleznikow, supra note 33, at 20 ("Validation criteria include comparisons with known results (eg past cases), comparison against expert performance, comparison against theoretical possibilities."). Some systems combine two or more methods.
38 This fallacy was apparent in the evaluation of MOSONG: "Since the categorization of cases in the validation was done by a domain expert (the author), it could certainly be argued that perhaps all the expertise supposedly exhibited by the system was actually simply located in the user rather than the system itself." Anna Ronkainen, MOSONG, a Fuzzy Logic Model of Trade Mark Similarity, in PROCEEDINGS OF THE WORKSHOP ON MODELING LEGAL CASES AND LEGAL RULES 23, 30 (Adam Z. Wyner, ed., 2010), available at http://ssrn.com/abstract=1879399.
39 For application of this method see John Zeleznikow, The Split-Up Project: Induction, Context and Knowledge Discovery in Law, 3 LAW, PROBABILITY & RISK 147, 162-64 (2004); Edwina L. Rissland & Kevin D. Ashley, A Case-based System for Trade Secrets Law, in PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND LAW 60, 64-65 (1987); Yaakov HaCohen-Kerner & Uri J. Schild, Case-Based Sentencing Using a Tree of Legal Concepts, 10 INFO. & COMM. TECH. L. 125, 132-34 (2001); Ronkainen, supra note 38, at 30-31.
The second option is to compare the outcomes reached by the DSS to the opinions of experts (for example, expert lawyers or former judges).41 This approach is based on two assumptions: (a) there is a way to determine who is an expert in a specific legal field; and (b) expert opinions reflect the law. Of course, this scheme also requires creating a set of expert opinions that can be compared to the DSS's outputs.
Another evaluation scheme is to allow experts to review and evaluate outputs of the system according to their expertise. Again, this method requires the determination of legal expertise, reliance on expert opinions as representing the law, and the creation of a set of cases for evaluation.42
A fourth, interesting method of evaluation checks the implications of the DSS in reality. This method was suggested with regard to bail decisions – "ascertaining what percentage of the releases (whether conditional or unconditional) were inappropriate" because the accused committed another crime.43 This method correlates with the recent tendency to conduct empirical studies of the results of legal rules.44
The choice of evaluation model is influenced by a few parameters. First, it depends on the view of the designers regarding what reflects the law – whether it is courts' decisions or expert opinions.
40 See Zeleznikow, supra note 39, at 164 ("Many factors were left implicit in some judgements, which Split-Up currently makes explicit.").
41 For application of this method see Zeleznikow, supra note 39, at 162 (the results of Split Up in three cases were compared to those made by eight lawyers); M.M. Janeela Theresa & V. Joseph Raj, Analogy Making in Criminal Law with Neural Network, in INTERNATIONAL CONFERENCE ON EMERGING TRENDS IN ELECTRICAL AND COMPUTER TECHNOLOGY (ICETECT) 772, 774-75 (2011).
42 This method can also be applied when the DSS is not intended to reflect the existing law but to change and improve it while using the advantages of computerized systems. In such cases there is no point in comparing the DSS to existing decisions, since a difference between them is expected. One example is Schild & Kannai's system, which seeks to add uniformity and rationality to the question of the past record of an offender. Indeed, the system was not compared to courts' decisions or to expert opinions, but rather applied to hypothetical and actual cases and reviewed by experts. See Schild & Kannai, supra note 27, at 399-402. This approach was also implemented – as well as the other validation schemes explained – with regard to Split Up; see Zeleznikow, supra note 39, at 162-64. See also HaCohen-Kerner & Schild, supra note 39, at 132.
43 Patricia Hassett, Can Expert System Technology Contribute to Improved Bail Decisions?, 1 INT'L J. L. & INFO. TECH. 144, 186 (1993).
44 See generally Theodore Eisenberg, The Origins, Nature, and Promise of Empirical Legal Studies and A Response to Concerns, 2011 U. ILL. L. REV. 1713 (2011).
If the DSS is intended to correct some of the biases and diversity in judicial decisions, then comparing the DSS to courts' decisions cannot fulfill the designers' goals. Second, the design of the DSS also influences the evaluation scheme. For example, DSSs which are based on neural networks need a large "training set" consisting of previous courts' decisions. Such DSSs cannot be evaluated on cases which are in the training set, since the system already has "the right answer" for these cases, and due to the size of the training set the designers might be left without new cases for the evaluation and will have to use hypothetical cases. Third, the nature of the legal field is another factor to be considered when deciding upon an evaluation model. While in some fields more importance is given to uniformity and certainty, and less to the correctness of the outcome, in other fields – such as criminal law – there is an emphasis on correctness, leading to a more thorough evaluation consisting of two or more schemes combined.
After deciding between the evaluation methods described, there are other questions regarding the evaluation of a DSS. One of them is how to create a set of representative cases on which the performance of the DSS will be evaluated. The cases should test different aspects and components of the DSS.45 A related question is whether the system must provide correct answers for every case – even for hard cases – or whether it is sufficient that the system provides correct answers for ordinary cases. My assumption is that DSSs are intended for "easy cases", while "hard cases" should be left to human judges and to appeal instances. However, the distinction between easy cases and hard cases is difficult – and some say nonexistent – and therefore it is likely that the system will be applied to hard cases as well.46
Another question is what metrics can be used to evaluate the outcomes reached by the DSS.
45 See Stranieri & Zeleznikow, supra note 33, at 23; Schild & Kannai, supra note 27, at 399-401.
46 On the distinction between easy cases and hard cases see Owen Fiss, Against Settlements, 93 YALE L. J. 1073, 1087 (1984).
When comparing the outcomes of the DSS to courts' or experts' outcomes, the differences should be measured.47 When using the third method of evaluation there are no decisions to which the DSS can be compared, which means that the evaluation can be even more difficult if there is some disagreement between the experts.
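For numeric outcomes, one simple possibility, inspired by the Split Up evaluation cited in the footnote, is to check whether the DSS's output falls within the range of the experts' predictions and how far it lies from them on average. The sketch below uses invented figures:

```python
from statistics import mean

# A simplified sketch of one possible metric for numeric outcomes: check
# whether the DSS's output falls within the range of the experts' predictions
# and compute its mean absolute distance from them. All figures are invented.

expert_predictions = [55.0, 60.0, 65.0, 70.0]  # e.g. suggested percentages
dss_output = 62.0

within_range = min(expert_predictions) <= dss_output <= max(expert_predictions)
mean_abs_diff = mean(abs(dss_output - p) for p in expert_predictions)

print(f"within experts' range: {within_range}; "
      f"mean absolute difference from experts: {mean_abs_diff:.1f} percentage points")
```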
In conclusion, the evaluation of DSSs for adjudication has not been thoroughly discussed or implemented until now. However, for the possibility of incorporating DSSs in the judiciary to become realistic, DSSs must be shown to supply satisfactory outcomes, and therefore criteria for their evaluation should be developed.
47 For example, Split Up was evaluated using three cases and eight expert lawyers. Stranieri and Zeleznikow indicate that "although predictions amongst lawyers were far from consistent, Split Up system fell within the range of those made by lawyers on all three cases." It can be argued that the inconsistency among the lawyers' predictions, and the differences between their predictions and those of Split Up, cannot indicate that Split Up reached satisfactory outcomes. Stranieri & Zeleznikow, supra note 33, at 23.