The uses and abuses of neural networks in law.

Michael Aikenhead
Centre for Law and Computing, University of Durham
<Michael.Aikenhead@durham.ac.uk>

(A version of this paper was first published in the Santa Clara Computer and High Technology Law Journal. Please refer to that article for any official purposes.)


I. INTRODUCTION.

Law has long been an area in which expert system technologies are applied. Numerous legal expert systems, computer systems that perform tasks normally regarded as requiring intelligence, have been created. However, the practical benefit of such systems has been below that predicted. The operation of early systems failed to account for the complexity and subtlety of the law and of legal reasoning.
Researchers in artificial intelligence and law have investigated various proposals to make such systems more realistic. The last decade has seen a resurgence of interest in artificial neural networks (hereafter neural nets). Neural nets are computer models inspired by biological neural systems in the brain. Researchers believe that by mimicking the underlying structure of the brain they will be better able to mimic the intelligent tasks it performs. In the field of law, it is believed that neural nets can overcome some of the limitations associated with existing legal expert systems.
This paper will examine existing and proposed uses of neural nets in the law, focusing on the jurisprudential implications and limitations inherent in those proposals.
This paper is divided into six chapters. Following this introduction is a technical overview of neural nets, in order to outline their benefits and limitations. This is followed by a discussion of the nature of legal reasoning and the various models proposed to describe it. With this background, chapter four examines various current and proposed uses of neural nets in law. Chapter five provides a jurisprudential examination of these uses. This paper concludes with some remarks on the uses made of neural nets in the law and the promise they provide for future research into the creation of legal expert systems.


II. INTRODUCTION TO NEURAL NETS.

A Artificial intelligence.

In the artificial intelligence community there are several approaches to modelling human intelligence. One approach that has found application in the legal domain is the use of symbolic reasoning systems to build what are called expert systems. Symbolic systems are so called because they rely on the transformation of symbols, which are taken to represent things in the real world, into other symbols according to explicit rules. [1]
Expert systems have a database of hierarchical rules, variables and constants that they apply to a given problem to try to determine a solution. [2] However, symbolic systems have several limitations, including:

(1) Not all knowledge can be stated symbolically.
(2) Developing and maintaining the system is time consuming.

The problem of trying to symbolically formalise knowledge can be enormous. It has been found that where there are 'grey areas' to a problem, the resolution of which involves the weighing of a multitude of factors, experts often reach a conclusion and then ex post facto justify it according to their hierarchy of symbolic rules. Rules then do not seem to capture all that is involved in expert knowledge. [3]
Secondly, the actual construction and maintenance of the system is complex and time consuming; there is a 'knowledge acquisition bottleneck'. The system's creators have to explicitly code every rule and predicate manipulated by a symbolic reasoner. The system then has to be 'debugged' to ensure the database is free of errors and operates as predicted. Any changes made to the database, either through changes in or expansion of the knowledge of the system, have to be incorporated through the same time-consuming process. To make these systems 'intelligent' requires much work by a domain expert working in conjunction with a knowledge engineer. [4]
Neural nets adopt an alternative approach to modelling intelligence. In neural nets, the relations between pieces of information do not have to be explicitly specified. Instead the neural net 'learns' the relationships between the information. For this reason neural nets are sub-symbolic reasoners; the system's designers do not have to explicitly state the relationship between pieces of information in the form of symbols. These aspects of neural nets have led to resurgent interest in their use in 'intelligent' computer systems.


B Neural nets. [5]

Neural nets are computer models inspired by the structure of biological neural systems. Biological neural systems are composed of millions of neurons. Each neuron accepts input from the many thousands of other neurons to which it is connected and in turn sends its output to many thousands of other neurons. Neurons are connected by 'axons' and by 'dendrites'. A neuron receives signals from other neurons through dendrites and sends its signal to other neurons through its axon. Where a dendrite connects to another neuron there is a 'synapse'. Synapses are 'plastic' in the sense that the strength of their connection to the neuron can increase or decrease. A strong signal passing through a weak synapse may have the same effect as a weak signal passing through a strong synapse. Synapses can also be inhibitory or excitatory, in that they can either inhibit the activity of the receiving neuron or increase its activity. The inputs that a neuron receives cause it to have some degree of excitation; this level of excitation results in the neuron generating a certain output, which it in turn transfers along its axon to the neurons accepting input from it.
Image 1: Structure of a biological neuron.

Neural nets mimic this structure. Neural nets are composed of 'neurodes'. A neurode is a mathematical model of a biological neuron. Neurodes are connected through synaptic weights to other neurodes to create a network. Which group of neurodes each neurode accepts input from, what output a neurode generates from its inputs, and to which other group of neurodes that output is sent all determine the way the neural net will behave.
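The behaviour of a single neurode can be made concrete with a minimal sketch. It assumes one common formulation, a weighted sum of inputs passed through a sigmoid function; the names neurode_output, weights and bias are illustrative and do not come from any particular system discussed in this paper.

```python
import math

def neurode_output(inputs, weights, bias=0.0):
    """Hypothetical neurode: weighted sum of inputs squashed by a sigmoid."""
    # Each input is scaled by its synaptic weight; negative weights act as
    # inhibitory synapses, positive weights as excitatory ones.
    excitation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-excitation))

# A strong signal through a weak synapse and a weak signal through a
# strong synapse produce the same level of excitation.
strong_signal_weak_synapse = neurode_output([1.0], [0.2])
weak_signal_strong_synapse = neurode_output([0.2], [1.0])

# An excitatory and an equally strong inhibitory input cancel out.
balanced = neurode_output([1.0, 1.0], [0.8, -0.8])
```

In this sketch, training would consist solely of adjusting the values held in weights and bias.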
One of the major goals in neural net research has been to construct neural nets capable of learning. In biological systems, experiments show that one of the most important effects of learning at the cellular level is the modification of the strength of the synaptic connection between two neurons. Analogously, training a neural net is a matter of modifying the values of the synaptic weights in the system. Unfortunately, training is a complex task and the method used depends on the architecture of the network being used.
Contrary to the optimistic hopes of early neural net researchers, it is not possible to simply connect many neurodes in a random fashion and hope that they will perform a meaningful task; as in biology, the neurodes must be connected in a particular structure.
All neural nets, however, operate as some form of pattern classifier. During its training the neural net learns to associate a certain pattern presented on its input with a certain pattern on its output; this is known as 'pattern association'. Further, neural nets have the property that they can generalise their input: 'pattern generalisation'. Neural nets can learn the characteristics of a general category of objects based on a series of specific examples from that category. This ability to classify patterns is retained even when the neural net is presented with partial patterns; the neural net will infer which general category the partial input belongs to. [6]
Researchers have experimented with various structures for constructing neural nets, ranging from the simplest single neurode to complex hybrid networks. The major drawback of simple networks is that they can only classify 'linearly separable' problems; [7] they cannot be trained to correctly classify every possible collection of patterns.
This problem of linear separability can be overcome by using networks of three or more layers; it has been proved that such networks can map any input set to any output set, [8] subject to one limitation. All neural nets, including these multi-layer neural nets, can only map contradictory input patterns by reaching a compromise between those input patterns. Neural nets cannot take one input pattern and map it to two separate outputs. The consequence of this will be discussed in chapter five.
Adaptive filter networks are multi-layer neural nets, and are trained using 'back-propagation' techniques. [9] Although adaptive filter networks must undergo supervised learning, [10] they are perhaps the most common form of neural net used.
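The limitation of linear separability noted above can be illustrated with a short sketch. It assumes the classic perceptron learning rule applied to a single threshold neurode, which is a simpler device than the adaptive filter networks considered in this paper; the function names and the two training sets (logical AND and exclusive OR) are chosen purely for illustration.

```python
def predict(w, b, x1, x2):
    """Threshold neurode: fires (1) only if the weighted sum exceeds zero."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

def train_perceptron(samples, epochs=50):
    """Perceptron learning rule: nudge the weights after each misclassification."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            err = target - predict(w, b, x1, x2)
            w[0] += err * x1
            w[1] += err * x2
            b += err
    return w, b

def count_correct(samples, w, b):
    return sum(1 for (x1, x2), t in samples if predict(w, b, x1, x2) == t)

# AND is linearly separable; XOR (exclusive or) is not.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_correct = count_correct(AND, *train_perceptron(AND))
xor_correct = count_correct(XOR, *train_perceptron(XOR))
```

The single neurode learns all four AND patterns but can never classify all four XOR patterns, however long it is trained, because no single straight line separates the two XOR classes. It is this gap that additional layers are intended to close.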

Image 3: General structure of an adaptive filter neural net.

While much more sophisticated neural nets than adaptive filter networks exist, many are extremely complex and are difficult to implement and tune. For this reason such networks remain largely at the research stage. Unless otherwise specified, general reference to neural nets in this paper will thus concern adaptive filter networks.


C Benefits of neural nets.

The use of neural networks in the creation of legal expert systems can overcome some of the limitations of symbolic systems. The ability of neural nets to make inferences from incomplete information and to classify patterns (both by matching past information and generalising that past information) makes them promising candidates for use in various tasks performed by legal expert systems. Additionally, and extremely importantly, the ability of neural nets to learn their information may aid in overcoming the knowledge acquisition bottleneck associated with symbolic reasoning systems.
As will be discussed in chapter four, various current and proposed uses for neural nets in legal expert systems attempt to exploit these properties.


III. INTRODUCTION TO LEGAL REASONING.

To be able to incorporate legal knowledge in a computer and to make the computer manipulate that knowledge to emulate the legal reasoning process, and thus the results achieved by lawyers, [11] it is obviously a prerequisite to know what the nature of law is and what the process of legal reasoning involves.
This discussion will focus largely on the processes involved in legal reasoning, the actual steps undertaken when a lawyer is presented with a problem, decides on the applicable law, applies the law to the problem and so reaches a conclusion. A discussion of the nature of law itself is important to this examination. However, traditional jurisprudential debates such as those between natural lawyers, positivists and realists, concerning issues such as the validity of the law or the duty to obey the law, will not be discussed as they are peripheral to the present examination.

A Methods of reasoning.

There are three common methods of human reasoning:

(1) deductive reasoning,
(2) inductive reasoning, and
(3) analogical reasoning.

Detailed expositions of each type of reasoning have been given elsewhere; [12] however, a short explanation is worthwhile.

Deductive reasoning is a strict logical method of reasoning. Deductive arguments take the general form:
(A) In any case, if p then q.
(B) In the present case p.
(C) Therefore, in the present case, q.

In this form of reasoning, one moves from the application of general rules to specific facts to deduce an outcome. The premises require and justify the conclusion. It is illogical to accept the general rule and the specific instance but to deny the conclusion. [13] However, the application of the general rule to the specific instance is contingent on that instance being regarded as a member of the general class defined in the rule. In terms of the above example, the 'p' referred to in line (B) must be regarded as similar enough to the 'p' in line (A) before deductive application of the rule can occur.
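The two steps involved, classification of the instance followed by mechanical application of the rule, can be separated in a small sketch. Everything here is hypothetical: the rule is the abstract 'if p then q' used above, and judged_as stands in for the human judgement that a given fact does or does not fall within the class 'p'.

```python
def deduce(rule, fact, judged_as):
    """Apply 'if p then q' to a fact, given a prior classification of that fact."""
    antecedent, consequent = rule
    # Contingent step: is the fact regarded as an instance of the antecedent?
    if judged_as.get(fact) == antecedent:
        # Deductive step: once classified, the conclusion follows necessarily.
        return consequent
    return None

rule = ("p", "q")

# Hypothetical classifications supplied by a human decision-maker.
judged_as = {"fact_1": "p", "fact_2": "not p"}

outcome_1 = deduce(rule, "fact_1", judged_as)
outcome_2 = deduce(rule, "fact_2", judged_as)
```

The deduction itself is trivial; all of the difficulty lies in how the entries of judged_as are arrived at, which the rule cannot itself determine.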
Inductive reasoning essentially operates as the reverse of deductive reasoning. Here one starts with numerous observations and then tries to relate them by creating a rule that can 'explain' each observation. For example, the following situations are observed:

Facts             Outcome
A B C D E F G     X
A B C D E F       X
A B C D E G K     X
M N O P E         Y
Rules can be stated that 'explain' each outcome:

(A) 'If (A B C D E) then X'; and
(B) 'If (M N O P E) then Y'.

The validity of such rules, though, remains contingent. [14]
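One simple way to picture this kind of induction is to take, for each outcome, the facts common to every observation with that outcome. The sketch below does this with set intersection. It assumes that the first three observations share outcome X and the fourth has outcome Y, as rules (A) and (B) imply; the function name induce_rules is illustrative only, and real induction is of course not limited to this mechanical procedure.

```python
from functools import reduce

# Observations as (set of facts, outcome) pairs; the outcomes are assumed
# from the rules stated in the text.
observations = [
    ({"A", "B", "C", "D", "E", "F", "G"}, "X"),
    ({"A", "B", "C", "D", "E", "F"}, "X"),
    ({"A", "B", "C", "D", "E", "G", "K"}, "X"),
    ({"M", "N", "O", "P", "E"}, "Y"),
]

def induce_rules(observations):
    """For each outcome, keep only the facts shared by all its observations."""
    grouped = {}
    for facts, outcome in observations:
        grouped.setdefault(outcome, []).append(facts)
    return {
        outcome: reduce(set.intersection, fact_sets)
        for outcome, fact_sets in grouped.items()
    }

rules = induce_rules(observations)
```

The contingency of the result is easy to see: a single further observation with facts A B C D E and outcome Y would be inconsistent with rule (A), and a further observation with outcome X but lacking fact E would narrow the rule.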
Analogical reasoning, in contrast to deductive reasoning and inductive reasoning, is not immediately concerned with the application of rules. Here one simply says that a certain outcome should result because that outcome has previously occurred in a similar case. This is a manifestation of the formal principle of justice that similar situations should result in similar outcomes. [15] Analogical reasoning and inductive reasoning are extremely closely related. [16] One only considers that the outcome of two situations should be similar because their facts are similar, by following the 'rule' that 'like cases should be decided alike'. This can itself be seen as the corollary of the more general belief that if two situations have the same outcome then there must be a general rule that explains them. Thus in saying that two similar factual situations must have similar outcomes we are really saying that this would be the result from the application of a hypothetical general rule that would contain both situations. [17] Likewise, it is only possible to suppose that a general rule can explain several situations if one regards those situations as similar to each other.
In this respect the mental processes in inductive and analogical reasoning are very similar. In both cases general rules explaining factual situations are assumed to exist; however, only in inductive reasoning does one take the step of trying to explicitly state those rules. More fundamentally, inductive reasoning and analogical reasoning are both inherently dependent on the finding of similarity between situations.


B Legal reasoning.

To what extent does legal reasoning involve each of the above types of reasoning? This depends on the nature of the legal system: if the law is a system of rules, the use of induction and analogy will be far more limited than will be the case if law is not totally rule based. Not surprisingly, questions about the nature of law are intertwined with questions about the nature of legal reasoning. As MacCormick states, 'A theory of legal reasoning requires and is required by a theory of law.' [18]
Two views on the nature of law can be outlined:

(A) law is a series of well defined rules of universal application; and

(B) law is not rule based; legal outcomes are wholly dependent on the views of the parties, lawyers and the judge in a case. [19]

Of course, few if any jurisprudes adhere to these extreme versions of either approach. [20] While it would be fruitless to try and conclusively determine the nature of law, it will be argued that law's true nature does not lie at either of the extremes presented, and incorporates aspects of both positions.
Levi has given a useful breakdown of the process of legal reasoning, which he sees occurring in three steps:

(1) Similarity is seen between cases,
(2) The rule of law inherent in the first case is announced; and
(3) The rule of law is made applicable to the second case. [21]

While Levi's description of the legal reasoning process may not capture all that is involved in legal reasoning, it does reveal that perhaps the key step in legal reasoning is the finding of similarity, or difference, between cases and aspects of a case.
In this context MacCormick and Burton note that the finding of similarity is dependent on the overall purposes that the legal system is trying to achieve. [22] The classification of facts for the purposes of fitting them into the major premise of a deduction, and for the purposes of creating analogies and inducing rules, occurs within a whole body of knowledge and theory we use to make sense of the world. [23] When deciding between competing fact classifications, our evaluation inherently involves considerations of the consequences of each classification on our model of the world, and in this sense similarities, dissimilarities, classifications, and thus the meaning and scope of rules, are made and not found. [24]


Deductive legal reasoning.

That deduction plays a role in legal reasoning is difficult to deny. [25] Statutes are collections of relatively clearly-stated rules and thus the application of statutory law involves a large amount of deductive reasoning. [26] One begins with a statutory requirement, applies it to the facts and thus determines the outcome. However, 'applying the statute to the facts' is a complex process.
Firstly, before a deduction can occur, the facts have to be fitted within the language of the statute; this is a non-obvious step as facts can be logically characterised in several ways. [27] Thus a logical deduction can only occur if the facts are regarded as similar enough to the language of the statute to be classed as covered by the statute. As Levi notes:

[T]he scope of a rule of law, and therefore its meaning depends upon a determination of what facts will be considered similar to those present when the rule was first announced. [28]

Secondly, there is the closely related problem of deciding what meaning to give terms within a statute. For example, the Crimes Act (1958) (Vic.) states:

s.91(1) A person shall be guilty of an offence, if when not at his place of abode, he has with him any article for use in the course of or in connexion with any burglary ...

While classifying an article as within s.91 of the Act may be easy in some cases, this is not always so. What of a tool box? All the items therein could be used during a burglary yet all could have legitimate uses. Whether an article is for use in the course of a burglary is a matter for debate.
This problem, of determining the meaning of individual phrases in rules, is called the problem of 'open texture'. [29] Resolving the problem of open texture is inherently dependent on the use of analogy. [30] Thus, even in this, perhaps the most rule guided area of law, where all the rules are collected and clearly expressed, the purely deductive application of rules is not sufficient to solve all problems.
Similar problems arise when reasoning in the common law. It is often said that there are common law 'rules'. However, in a strict sense this cannot be true. The whole of the common law has been created on an individual case by case basis. In a single case a judge can do no more than pronounce a decision that applies to the facts of the case. It could be argued that the ratio decidendi expresses the rule contained in a case. [31] This rule will be binding on all subsequent cases that have the same facts as the original case. However, the binding nature of the ratio decidendi (and thus the scope of the rule) is severely limited once it is appreciated that the ratio only applies to the strict facts of the original case. It will only determine the outcome of another case that has exactly the same facts; strictly, any change in the facts results in a new situation, the outcome of which is not determined by the ratio decidendi. [32]
The belief that there are common law rules arises because even though the ratio of one case may not be binding in a later case, if the later case has very similar facts to the original case the ratio is nevertheless felt to be highly persuasive. [33] Thus the second case is decided similarly to the first. As this process continues, a large body of cases builds up, all of which have similar facts and similar outcomes. Seeing this collection of cases, it is not unreasonable to assume that the original case laid down a general rule which dictated the results in all the later cases. [34] In this way the common law appears to create rules that can later be applied deductively. [35] Even in such usage, though, these common law rules experience the same problems as statutory rules.


Inductive and analogical legal reasoning.

The process of induction will often be used in framing the ratio of 'leading cases' and in the construction of novel legal arguments. Before a leading case there often exist numerous cases with vaguely similar facts and similar outcomes. However, each has been decided in a relatively individual way. When a leading case is decided, the judges will look at the previous cases and surmise that since they have similar facts and similar outcomes, there must be a general rule or unifying doctrine that explains all the cases. In this way, the general rule-like pronouncements contained in the ratio of the leading case will have been induced from the previous cases. [36] The same process occurs when counsel advances such a new unifying rule in argument. As with reasoning by analogy, the use of inductive reasoning in the law can be seen as an inherent consequence of the requirement for coherence within the legal system as enunciated by MacCormick. [37]
The use of analogical reasoning in law has been widely studied. [38] As with deductive legal reasoning and inductive legal reasoning, such descriptions emphasise the necessity of finding similarities between cases before any analogy can be constructed. [39] Assume that the following cases have the following factors:

Case 1: A B C D,
Case 2: A B C E,
Case 3: A B C F,
Case 4: A B C G.

Further assume that Case 1 and Case 2 are regarded as analogous. From this it can be implied that factors D and E are similar. Again, assume that Case 1 and Case 3 are not regarded as analogous, implying that factors D and F are not similar. How is Case 4 to be classified? This depends on whether factor G is regarded as more similar to factor D or more similar to factor F.
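The dependence of the classification on a prior similarity judgement can be shown in a brief sketch. The numerical similarity scores are entirely hypothetical; nothing in the example explains where such scores would come from, which is precisely the point at issue.

```python
def classify_case_4(similarity_g_to_d, similarity_g_to_f):
    """Group Case 4 according to whichever factor G is judged closer to."""
    if similarity_g_to_d > similarity_g_to_f:
        # G resembles D (and hence E), so Case 4 joins Cases 1 and 2.
        return "grouped with Cases 1 and 2"
    if similarity_g_to_f > similarity_g_to_d:
        # G resembles F, so Case 4 joins Case 3.
        return "grouped with Case 3"
    return "undetermined"

# Two decision-makers with different (hypothetical) views of factor G.
view_one = classify_case_4(similarity_g_to_d=0.9, similarity_g_to_f=0.3)
view_two = classify_case_4(similarity_g_to_d=0.2, similarity_g_to_f=0.7)
```

The shared factors A, B and C play no part in distinguishing the cases; the outcome turns wholly on how G is regarded.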
A consequence of the importance of the finding of similarity to the process of legal reasoning is that extreme versions of legal positivism do not seem supportable. Since deductive reasoning is by itself insufficient to explain legal reasoning, law must be composed of more than purely rules. Nor, however, can it be accepted that legal reasoning is totally subjective; [40] legal rules provide a 'paradigm' which guides legal thought. [41] This view of legal reasoning, as a process inherently dependent on the finding of similarity between situations and on our world theories, has consequences for the use of neural nets in legal expert systems. These consequences will be explored in the following chapters.


IV. CURRENT AND PROPOSED USES FOR NEURAL NETS IN LAW.

Mirroring resurgent interest in the general artificial intelligence community, the use of neural nets in law has recently received growing interest. Neural nets have been used, or proposed for use, in the law in two broad ways:

(1) as and within inference engines [42] in legal expert systems; and
(2) in legal information retrieval systems.

This chapter will discuss and explain each of these proposed uses. The following chapter will discuss some of the jurisprudential implications arising from these proposed uses.

A Neural nets as and within inference engines.

Before examining the current and proposed uses of neural nets in the law, it is beneficial to have an understanding of more traditional techniques for computerising legal knowledge.
There is no settled definition of what constitutes an expert system; [43] however, for the purposes of this discussion the following loose definition will be adopted. A legal expert system is a computer program capable of performing tasks usually performed by a lawyer at the standard of (or at a higher standard than) a human expert in the legal field. [44] In this respect expert systems and information retrieval systems are very similar: both require aspects of intelligence. However, only in an expert system does the system try to reason with the law.


1 Traditional legal expert system inference mechanisms.

It has been widely noted in the artificial intelligence and law research community that the dominant reasoning paradigm used in legal expert systems is that of symbolic reasoning. [45] Both production rule expert systems and symbolic case based reasoners adopt this approach. Production rule expert systems seek to encode law in the form of rules of logic. Symbolic case based reasoners [46] encode aspects of cases, such as the factual attributes, which then undergo transformations and are reasoned with according to explicit rules. [47]
Several problems are inherent in the symbolic approach.


2 Problems with traditional inference mechanisms.

A major jurisprudential problem with symbolic reasoning systems is that they depend upon legal knowledge being composed of explicitly stateable rules. As argued in the previous chapter, law is composed of more than solely rules.
Further, the developers of symbolic reasoning systems assume that deductive reasoning is the only mode of reasoning applied in the law. Although analogical reasoning is emulated in symbolic case based reasoners, what is actually implemented in such systems is only a crude simulation of human analogical reasoning. Symbolic case based reasoners rely on explicit rules of how cases and case attributes can be manipulated. It is assumed that through the deductive application of these rules, analogical reasoning will emerge. As a complete model of analogical reasoning, this is a dubious assumption.
Thirdly, symbolic reasoners experience difficulties in resolving conflicts between rules in their rule databases. [48] Such conflicts can only be resolved with meta-rules. Again, this assumes that the law is a deepening spiral of rules, which is jurisprudentially suspect. [49]
Finally, there are practical problems in the creation of symbolic reasoners. Such systems experience a knowledge acquisition bottleneck. [50] Any changes in the relevant law will require a modification of the database, which then needs to be debugged; a time-consuming process.
The use of neural nets in legal expert systems has been proposed as a means to overcome these problems. These proposals will be examined below.


3 Proposed uses for neural nets in legal expert system inference mechanisms.

(a) Reasoning with cases.

It was argued in chapter three that the law cannot be regarded purely as a system of rules; analogical reasoning from past cases is extremely important. Neural nets may find application in systems that reason with cases. Warner has stated that a large benefit of neural nets is their ability to classify patterns and so imitate the analogical reasoning process, thereby resolving issues of open texture. [51] However, before this claim can be sustained the precise nature of the analogical reasoning process needs to be investigated.


Analogy.
As outlined in the previous chapter, an important aspect of analogical reasoning is the classification of patterns into similar groups; only things that are similar can be used in an analogical argument. Neural nets are inherently good at pattern classification, [52] which makes them seemingly promising candidates for emulating the analogical reasoning process.
The use of neural nets to mimic this aspect of analogical reasoning has been investigated by Hobson and Slee. They have produced a neural network 'index' of the Theft Act 1968 (England). [53] In this index, a factual situation is analysed by the researchers for the presence or absence of various concepts, the concepts being specified by the wording of the Act. The presence or absence of each concept results in a matrix that is then used as the input to their neural net. The verdict on whether or not the factual situation constitutes theft within the meaning of the terms of the Act is used as the desired output for the neural net.
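The general shape of such an encoding can be sketched as follows. The concept list is an assumption made for illustration, loosely following the wording of section 1 of the Theft Act 1968; it is not the actual set of index points used by Hobson and Slee, and the two example cases and their verdicts are hypothetical.

```python
# Illustrative concepts only, loosely based on the wording of s.1 of the Act.
CONCEPTS = [
    "dishonesty",
    "appropriation",
    "property",
    "belonging_to_another",
    "intention_to_permanently_deprive",
]

def encode_case(present_concepts):
    """Turn the set of concepts found in a fact situation into a 0/1 input vector."""
    return [1 if concept in present_concepts else 0 for concept in CONCEPTS]

# Each training pair is (input vector, desired output), where the desired
# output is the verdict: 1 for theft, 0 for not theft.
training_pairs = [
    (encode_case(set(CONCEPTS)), 1),
    (encode_case({"appropriation", "property", "belonging_to_another"}), 0),
]
```

Note that deciding whether a concept such as dishonesty is present takes place before encoding, and is therefore performed by the person preparing the input rather than by the neural net.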
Using this material Hobson and Slee claim a neural net can be trained to classify cases covered by the Act. [54] During training the neural net groups the cases used to train it into general groups.
Image 5: Neural nets group similar cases together.

Once trained, new cases can be presented to the neural net. In reaching a verdict on a new case, the neural net classifies the case into one of the general groups created during training. In so classifying a case, the neural net appears to mimic analogical reasoning; similar cases result in the same verdict.
Similar work has been performed by Bench-Capon who has created a neural net based on a hypothetical statute. [55] Bench-Capon's investigation is of further interest in that it demonstrates that a neural net can successfully perform classifications even when presented with a lot of 'noise' (inputs that are not relevant to the classification). Thus, contrary to what other commentators have said, [56] neural nets have the potential to operate successfully even when the factors affecting the classification are not known. [57]
In contrast to the above two approaches, which essentially try to model whole areas of law using neural nets, Walker et al. (the 'VUA team') 'simply' use neural nets within a more conventional case based reasoning system. [58] The VUA team have created PROLEXS, a 'hybrid' legal expert system, which relies on more than one model of legal reasoning. Early versions of the system operated by having a stored database of cases, each case being stored as a set of 'conditions' each with an associated fixed weight, along with a case threshold. [59] When a case was to be applied analogically, the weights on conditions present in the current fact situation were summed and then compared to the case threshold of the past case to determine whether the current situation was analogous to the stored case. [60] In the first implementations of PROLEXS, the condition weights and the threshold values had to be assigned by the domain expert. However, the VUA team note that weight and threshold assignment is a difficult task for a human domain expert. [61] Consequently, the latest version of PROLEXS dispenses with the case database within the case based reasoning sub-system. Instead, a multi-layer neural net is trained using the conditions as the inputs and the applicability or non-applicability of the open texture term as the output to the neural net. [62] The neural net learns the condition weights and case thresholds during its training. This is essentially the same approach as taken by Hobson and Slee, and Bench-Capon. It is claimed that this system can provide more discerning weights than a human expert. [63]
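The weight-and-threshold matching described for the early versions of PROLEXS can be sketched in a few lines. This is a reconstruction from the description above, not the VUA team's code: the condition names, weights and threshold are hypothetical, and treating a sum equal to the threshold as a match is an assumption.

```python
def is_analogous(stored_case, present_conditions):
    """Sum the weights of the stored case's conditions that are present in the
    current fact situation and compare the total with the case threshold."""
    total = sum(
        weight
        for condition, weight in stored_case["weights"].items()
        if condition in present_conditions
    )
    # Assumption: a total equal to the threshold counts as analogous.
    return total >= stored_case["threshold"]

# A hypothetical stored case with expert-assigned weights and threshold.
stored_case = {
    "weights": {"condition_1": 4, "condition_2": 3, "condition_3": 1},
    "threshold": 6,
}

strong_match = is_analogous(stored_case, {"condition_1", "condition_2"})
weak_match = is_analogous(stored_case, {"condition_1", "condition_3"})
```

In the later, neural net version, values playing the role of weights and threshold are no longer written down by the domain expert but are learned from training examples.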


Open texture.
Bench-Capon states that in creating his neural net the ability of neural nets to perform classification in domains involving open texture has been demonstrated. [64] This claim must be questioned. Bench-Capon did not use a neural net to model solely open textured aspects of the domain, but the whole domain itself. This can be regarded as an ironic response to Sergot's attempt to model law purely using rules [65] and the criticism and comment that this created. [66]
Similarly, when creating their neural net Hobson and Slee treat issues such as whether an action was 'dishonest' (which is an open textured issue) as simply any other index point. [67] It is up to the creators (and presumably later users) of the neural net to decide whether these concepts are present before input is given to the neural net.
The work by Hobson and Slee and Bench-Capon thus should strictly not be viewed as a demonstration of the ability of neural nets to resolve open texture, but simply as a demonstration of the ability of neural nets to classify legal cases into desired legal categories.
At this point, the ability of neural nets to resolve issues of open texture may be doubted; this is not an area that has yet been directly investigated. The work, however, has a prima facie appeal: in using neural nets to classify cases it appears that analogical reasoning is being emulated, and it is by analogical reasoning that open texture may be resolved. In the next chapter the uses of neural nets in computerised analogical reasoning systems will be discussed in more detail, thereby giving more credence to the claim that neural nets can aid in the resolution of open texture.


(b) Dealing with conflicting rules.

A number of researchers have proposed the use of neural nets to overcome the difficulty of reasoning with conflicting rules of law. Warner claims that neural nets can inherently model the legal reasoning process and that neural nets can model a legal system in which rules conflict by giving those rules weights. [68] However, no details are given as to how this is to be achieved.
Philipps agrees that the ability to model conflicting rules is a benefit of neural nets and has created a neural net designed to investigate this. [69] Philipps claims that his neural net can mimic the results that German courts reach when they assign liabilities in automobile accidents and specifically that his neural net can mimic the process that occurs when a court is presented with contradictory cases. Contradiction is dealt with by reaching a compromise solution to the conflict. [70] However, while the ability of neural nets to reach compromise solutions is important, as will be discussed later, it may not always be jurisprudentially desirable.

 

Inference networks.
In contrast to the approaches of Warner and Philipps, who model conflicting law on a rule by rule basis through the use of compromise, Thagard has developed a theory of explanatory coherence that he says can 'choose' between competing hypotheses. Thagard's system, ECHO, models competing theories using a neural net. [71] ECHO chooses between conflicting groups of rules [72] and does not deal with the conflict through compromise, but accepts or rejects one of the hypotheses. [73]
The notion of competing theories should be familiar to lawyers and legal theorists: in any conflict there are always at least two competing theories of the law and the facts, that presented by the plaintiff and that presented by the defendant. Prima facie ECHO seems to provide a possible way to model this conflict in a legal expert system. Thagard has applied this theory in a simplistic manner to two legal situations; however, the possibilities in this approach remain largely unexplored. [74]
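
The idea of a network that accepts one theory and rejects its rival, rather than compromising between them, can be illustrated with a small sketch. This is not Thagard's ECHO. It is a toy constraint network that assumes a commonly used connectionist update rule, with explanation modelled as an excitatory link and contradiction as an inhibitory link. The unit names, link weights, decay rate and number of update steps are all invented for illustration.

```python
MAX_ACT, MIN_ACT = 1.0, -1.0


def settle(units, links, clamped, decay=0.05, steps=200):
    """Repeatedly update unit activations until the network settles.

    units   -- list of unit names
    links   -- {(a, b): weight}; positive = coheres, negative = contradicts
    clamped -- units held at full activation (the accepted evidence)
    """
    neighbours = {u: [] for u in units}
    for (a, b), w in links.items():  # links are treated as symmetric
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))

    act = {u: (1.0 if u in clamped else 0.01) for u in units}
    for _ in range(steps):
        new_act = {}
        for u in units:
            if u in clamped:
                new_act[u] = 1.0
                continue
            net = sum(w * act[v] for v, w in neighbours[u])
            # Assumed update rule: decay towards zero, plus net input scaled
            # by the remaining distance to the maximum or minimum activation.
            if net > 0:
                change = net * (MAX_ACT - act[u])
            else:
                change = net * (act[u] - MIN_ACT)
            value = act[u] * (1 - decay) + change
            new_act[u] = max(MIN_ACT, min(MAX_ACT, value))
        act = new_act
    return act


# Hypothetical dispute: the plaintiff's theory explains both items of
# evidence, the defendant's theory explains only one, and the two theories
# contradict each other.
units = ["E1", "E2", "plaintiff_theory", "defendant_theory"]
links = {
    ("E1", "plaintiff_theory"): 0.04,
    ("E2", "plaintiff_theory"): 0.04,
    ("E1", "defendant_theory"): 0.04,
    ("plaintiff_theory", "defendant_theory"): -0.2,
}
clamped = {"E1", "E2"}

result = settle(units, links, clamped)
for name in ("plaintiff_theory", "defendant_theory"):
    verdict = "accepted" if result[name] > 0 else "rejected"
    print(name, round(result[name], 2), verdict)
```

With these assumed weights the better supported theory settles at a positive activation and its rival at a negative one, so the outcome is a choice between the theories and not a blend of them.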

 

(c) Machine learning.

Along with the above problems with knowledge representation and manipulation in symbolic reasoning systems, a further problem with such systems is the expense of developing and maintaining the knowledge base of the system. [75] Neural nets, in contrast, learn their knowledge, which provides a further attraction to their use in legal expert systems.
If a neural net is used to store cases, as in some of the work by Hobson and Slee and in later versions of PROLEXS, then adding new cases to the neural net's knowledge base is simply a matter of presenting those cases to the neural net while it is in its learning mode. The neural net automatically incorporates the cases into its knowledge base through modifying its inter-node weights. Cases still have to be described in terms amenable to use in the neural network; however, later re-rationalisations of those cases [76] only require the net to be retrained, rather than requiring the complete re-entry of a newly structured case database.
An additional way to use neural nets to overcome the knowledge acquisition bottleneck is through their use in rule induction systems. [77] Such systems attempt to model the process of legal induction by examining numerous cases and attempting to find relations between factors in those cases that can be explained with a rule. Although several researchers have proposed the extraction of rules from neural nets to enhance their explanation facilities, [78] neural nets have also been used in an attempt to extract rules from a corpus of cases. In their work on the MAIRILOG project, Bochereau et al propose methods by which logical rules can be extracted from trained neural networks. [79] The rules extracted from the neural net could then be incorporated into a symbolic reasoning system. Again, the ability of neural nets to learn new cases is exploited.
Symbolic methods also exist to extract rules from a corpus of cases, and as both the use of neural nets and symbolic methods rely on a statistical analysis of data, attempts to use neural nets to induce rules share the limitations of these symbolic automatic rule induction systems. [80] However, the potential for neural nets to incorporate more flexible notions of analogy [81] than those currently used in other induction systems may overcome some of the limitations currently inherent in automatic rule induction.
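
The contrast drawn in this section between knowledge that is entered by hand and knowledge that is learned from cases can be illustrated with a minimal sketch. It uses a single threshold unit trained with the classic perceptron rule, which is a deliberate simplification: the systems discussed above use multi-layer nets. The training cases are hypothetical (a term is taken to apply whenever at least two of three conditions are present), and all names are invented for illustration.

```python
from itertools import product


def predict(weights, threshold, conditions):
    """Return 1 if the weighted sum of present conditions exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, conditions))
    return 1 if total > threshold else 0


def train(cases, n_conditions, epochs=200):
    """Learn condition weights and a threshold from (conditions, outcome) pairs."""
    weights = [0] * n_conditions
    threshold = 0
    for _ in range(epochs):
        for conditions, outcome in cases:
            error = outcome - predict(weights, threshold, conditions)
            # Perceptron rule: adjust only when the unit's answer was wrong.
            weights = [w + error * x for w, x in zip(weights, conditions)]
            threshold -= error
    return weights, threshold


# Hypothetical training set: every combination of three yes/no conditions,
# with the outcome 'applies' (1) when at least two conditions are present.
cases = [(c, 1 if sum(c) >= 2 else 0) for c in product((0, 1), repeat=3)]

weights, threshold = train(cases, n_conditions=3)
print("learned weights:", weights, "learned threshold:", threshold)
print(all(predict(weights, threshold, c) == outcome for c, outcome in cases))
```

Adding a newly decided case in this sketch means appending it to `cases` and calling `train` again; no weight or threshold is ever typed in by the developer.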

 

B Use of neural nets in legal information retrieval systems.

In contrast to the above neural net applications, which try to reason with the law, neural nets have also been used in systems that solely try to retrieve information. Computers have long been used to automate legal information retrieval; however, all such systems have suffered limitations. [82] Most notably, they are 'brittle' in that they rely on keyword searches. [83] Systems employing neural nets remove some of these limitations.

 

SCALIR.

One of the most interesting legal information retrieval systems using neural nets is SCALIR. [84] Created by Rose and Belew, SCALIR is a combination of a neural net embodying sub-symbolic information integrated with a semantic network [85] embodied in a neural net. SCALIR can perform impressive document retrieval operations. For example, Rose and Belew show how SCALIR can retrieve a copyright case using a term that does not occur in that case. SCALIR is able to overcome superficial differences in the topic through its network of term associations. This is more than a simple synonym search though as the system could also perform such a retrieval based on the fact that two cases have cited a common case.
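
The kind of retrieval described above, in which a case is found through associations and not through the literal presence of a search term, can be illustrated with a generic spreading activation sketch. It is not Rose and Belew's implementation: the graph, the link weights, the damping factor and the number of steps are all assumptions, and the node names are hypothetical.

```python
from collections import defaultdict


def add_link(graph, a, b, weight):
    """Add a symmetric weighted association between two nodes."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight


def spread(graph, query_nodes, steps=3, damping=0.5):
    """Spread activation outwards from the query nodes for a few steps."""
    activation = defaultdict(float)
    frontier = {node: 1.0 for node in query_nodes}
    for node, value in frontier.items():
        activation[node] += value
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, value in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                incoming[neighbour] += value * weight * damping
        for node, value in incoming.items():
            activation[node] += value
        frontier = incoming
    return dict(activation)


# Hypothetical network: case A uses the term 'software'; case B uses only
# 'computer program'; both cite case C. Case D is unrelated.
graph = {}
add_link(graph, "term:software", "case:A", 0.8)
add_link(graph, "term:computer program", "case:B", 0.8)
add_link(graph, "case:A", "case:C", 0.5)  # A cites C
add_link(graph, "case:B", "case:C", 0.5)  # B cites C
add_link(graph, "term:easement", "case:D", 0.8)

result = spread(graph, ["term:software"])
ranked = sorted((n for n in result if n.startswith("case:")),
                key=result.get, reverse=True)
print(ranked)  # ['case:A', 'case:C', 'case:B']
```

Case B is retrieved although it never uses the query term, because it shares a cited case with case A; case D receives no activation at all. A learning mechanism of the kind SCALIR is said to have would adjust the link weights in `graph` in response to user behaviour, which this sketch does not attempt.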
An interesting aspect of SCALIR is that one of the neural nets is used not to store conceptual features of the domain, but rather 'micro-features', such as the fact that a particular word is used in a case or the fact that two cases are often retrieved together. [86] This contrasts with the use made of neural nets when reasoning with cases, where the inputs and outputs were all at the conceptual level.
A further notable feature of SCALIR is its ability to learn from interaction with its users. [87] The system modifies the weights on links within the networks, depending on the type of searches performed by users. Thus, Rose and Belew claim that over time the system can adapt to new terminologies and the changing importance of cases and statutes. [88] SCALIR is an impressive advance over prior information retrieval systems and, as Rose and Belew indicate, the benefits provided by SCALIR should not be regarded as lying solely in the field of legal information retrieval. [89] It will be argued in the following chapter that the combination of sub-symbolic and semantic information contained in SCALIR's neural networks is a powerful and flexible method of emulating the finding of similarity required in legal reasoning. Thus the techniques embodied in SCALIR could find wider application as part of a legal expert system.
The above discussion has provided an introduction to the uses to which neural nets have been put in the law. However, there are a number of problems and concerns associated with these uses of neural nets. The following chapter will discuss several more theoretical proposals for the use of neural nets in law and then proceed to examine problems and concerns inherent in these proposals and the uses discussed above.

 

V. JURISPRUDENTIAL AND TECHNICAL CONCERNS IN THE USE OF NEURAL NETS IN LAW.

A Jurisprudential concerns about proposed uses.

As stated in chapter three, the nature of law and the nature of legal reasoning are two issues that are inherently intertwined. There is an oft-made claim that legal expert systems will provide information about the nature of law and the process of legal reasoning. [90] It is hoped that the use of neural nets in legal expert systems will aid in this. However, for jurisprudence to gain from the creation of legal expert systems and specifically from the use of neural nets, lawyers and legal theorists must be confident that the use of those neural nets rests on a solid jurisprudential basis.
This chapter will commence with a discussion of two claims about the nature of law which it is said that neural nets are suited to modelling. Following this discussion are several jurisprudential observations specifically concerning current and proposed uses of neural nets in the law. Finally a discussion of how neural nets can offer a new metaphor of law will be presented.

 

1 Law as a parallel process.

Amongst those researchers who advocate the use of neural nets in the law perhaps the most wide-ranging and controversial position is that taken by Warner. [91] Warner is of the view that legal reasoning is an inherently parallel process [92] and that

[W]hen we attempt to model the legal reasoning process, we must use a device capable of emulating the parallel problem-solving process. To this end, normal digital computational devices are inadequate. [93]

It is claimed that neural networks will overcome this problem due to the inherently parallel nature of their operation. [94] If taken to its full extreme, Warner's view of the legal reasoning process as inherently parallel has potentially fatal consequences for traditional symbolic systems. However, apart from such vague and dubious observations about the nature of legal reasoning the full implications of this idea are not explored.
That the legal reasoning process is an inherently parallel process is highly contentious. It seems acceptable to say, as Warner claims, that when problems are solved the solution of 'unit problems' will impose a

state change on the problem domain rendering invalid all unit solutions previously achieved and changing the environment for all unit solutions yet to be achieved. [95]

However, this is not a description of a parallel process. This simply notes that the answer to one question may change which questions are subsequently asked. While this undoubtedly occurs in human reasoning, the contingent nature of questions is quite easily represented in a tree diagram. [96] Such tree diagrams form the basis of all rule based expert systems. Systems such as PROLEXS [97] display this 'parallel' problem solving capability by modifying subsequently asked questions according to intermediate answers. This belief that neural nets can solve all the problems that currently beset symbolic legal expert systems is, perhaps unconsciously, echoed by Bench-Capon. He has attempted to model what prima facie appears a rule based area of law, with a neural net. [98]
If it is accepted that some legal reasoning occurs in 'parallel', it still does not mean that all legal reasoning does. It is not in every legal question that, as Levi would say, the application of the rule changes the rule itself. Thus Warner's vision of the necessity of using neural nets to model the supposedly parallel nature of the legal reasoning process cannot be supported.

 

2 Open texture as randomness.

A further contention made by Warner is his equating of the concept of 'open texture' with the idea of randomness. [99] If this view of open texture is correct then little hope can be held for the ability of lawyers and legal theorists, let alone neural nets or any other legal expert system, to resolve issues of open texture. After equating open texture and randomness it is surprising that Warner then claims that the use of neural nets can overcome this problem. [100]
Perhaps Warner's choice of the term 'randomness' was ill-advised. Warner cites and accepts the work of Gardner [101] in his argument for the benefits that neural nets could provide. [102] However, Gardner suggests that open texture can be dealt with through the use of heuristics, [103] a claim Warner accepts. [104] Logically though if heuristics exist in a domain then that domain cannot be regarded as truly random. Warner is correct however in viewing open texture as an 'indeterminacy'. [105] As argued in chapter two, the resolution of open texture does not occur unconstrained, but proceeds through a process of analogy from past cases. During this reasoning process, the factors that can be taken into account and the manner in which they can be used are both constrained. [106] However, the work of critical legal scholars does show that what influences which factors will be emphasised or even considered in a decision may be influenced by 'extra-legal' factors, which could make strictly legal examinations of concepts conclude that those concepts were random. [107] Thus the claim that open texture involves randomness does have merit in that it highlights the unpredictability of solutions. Even when all the past cases have been rationalised, the possibility remains that this rationalisation will be destroyed if the present case is decided in a novel manner; in neural net terms, with the addition of a new input factor. [108] However, even this possibility is constrained by the need for coherence within the legal system and by the need for the distinction made to be justifiable. [109] So while the resolution of open texture may be difficult and sometimes unpredictable, it is not random. This resolution of problems involving open texture is inherently intertwined with the nature of analogical reasoning, a subject that will be discussed next.

 

3 Analogy and explanation.

Every legal expert system necessarily embodies a jurisprudential theory. [110] The use of neural nets in legal expert systems will affect the nature of that jurisprudential theory.

(a) Analogy.

As noted in chapter two, a key step in analogical reasoning is the finding of similarity, or difference, between cases and aspects of a case. Of course, legal analogical reasoning is much more than the mere finding of similarity between cases. Once two cases are found to be similar there are limitations on the way the cases can be applied. [111] However, when can two things be regarded as the same or different?
Theories of similarity.
Mital and Johnson note that there are no entirely satisfactory theories of what constitutes similarity. [112] According to the ruleless theory, there are no general principles applied in a finding of similarity; people know it when they see it. [113] An alternative, that similarity is found solely by calculating the number of shared attributes that are present in a situation, cannot be accepted. [114] If the ruleless theory of similarity is accepted, then little hope can be held for any formalisation of the process of finding similarity; however, if it is accepted that some guidelines are followed it must be appreciated that it is not merely the number of attributes that are shared, but also their relevance. [115] In this context, Tito has said that two things are necessary for a computer to understand 'similarity':

(1) they must understand the analogue meaning of words and
(2) they must understand moral decision making. [116]

According to Tito it is necessary to understand the analogue meaning of words to determine whether something is within a general category. Similarly, moral decisions must be made when determining at what level of generality things are to be compared.
While Tito says she is not interested in whether computers can mimic the results achieved by lawyers, but whether they can actually understand analogical reasoning, [117] her work does not consider the philosophical problems of what constitutes intelligence and understanding in computers. [118] Tito's work is still informative, however, if viewed as a discussion on the ability of computers to mimic the results achieved by lawyers.
A problem that faces all legal expert systems, including those that incorporate neural nets, is that they only model legal concepts. It is unavoidable that when an issue in the real world is to be considered by a computer, it has to be circumscribed by a limited number of factors. This circumscription will inevitably involve a loss of richness and the creation of a conceptual bias [119] in the computerised representation of the concept as compared to the real world concept. In Tito's conception, the computer only has a digital representation of concepts. Though this loss will be inversely proportional to the complexity of, and dependent on the composition of, the matrix used in the circumscription, if the input matrix does not accurately reflect the real world concept then the conclusion drawn by the legal expert system will not be accurate.
It is as yet unclear whether the necessity of understanding moral decision making for the finding of similarity is a fundamental bar to computers performing analogical reasoning. Computers may yet be implemented that do this, though what this entails is presently unclear.
Similarity and neural nets.
In the quest to find similarity neural nets can conceivably be used in several ways:
(a) by comparing matrices of factors;
(b) by determining weights to be given to factors that are used in other systems;
(c) by identifying new factors that are common to members of a group; and
(d) by determining similarity in a less reductionist fashion than the above.

For present purposes, approaches (a) and (b) are essentially the same. Both rely on matrices of factors being presented to a neural net. Although a neural net can classify patterns, deal with complex relationships and subtle variations in factors, and so determine similarity by determining how many attributes are shared, a key aspect of the finding of similarity has already been performed by the designer. The designer of the system has already made the all important decisions as to what limited factors are to be considered relevant for a determination of similarity and further at what level of generality they are to be compared.
In this scenario Tito's requirements mean that the computer can only find similarity at the level of attribute matching; more subtle aspects of similarity are outside the computer's scope. For this reason systems such as PROLEXS that adopt the matrix approach will only ever have limited ability to reason analogically.
However, if a matrix can be chosen that can accurately model a real world concept [120] then that matrix can be implemented in a neural net. This is a corollary of Kolmogorov's theorem. [121] A key requirement in this approach is choosing the matrix used to represent the concept, but what factors are to be included? Neural nets could also conceivably be used to identify new factors that are common to a group. Bench-Capon shows how neural nets can find which factors are significant amongst noise [122] but claims that the significance of these factors cannot be understood without independent knowledge of the domain. [123] To say that the significance of such factors cannot be understood without prior domain knowledge, though, is not to say that the newly identified factors are not significant. According to some members of the critical legal studies movement, the reasons given in cases are not the whole reasons for the reaching of the results in those cases. [124] If this view of law is correct then legal analysis and legal expert systems based solely on those decisions will not accurately reflect how and why cases are decided. Instead one should simply look at what actually occurs. Thus when an analysis of a neural net highlights the importance of an unsuspected factor this could be interpreted as telling us something important about the underlying legal domain. Consequently, the use made of noise is dependent on the jurisprudential theory that the system's developers adopt: whether it is interpreted as a discovery about the law or is rejected as a technical anomaly.
The most promising approach to modelling similarities is the less reductionist approach taken by SCALIR. [125] Here similarity is not judged solely on the presence or absence of specified factors, but also on the presence of sub-factor information. Thus even though two input matrices may share few factors at the conceptual level they can still be regarded as similar if they directly or indirectly share common 'micro-features'. In this respect SCALIR contains a closer approximation to employing the analogue meaning of words than do other systems. However, before SCALIR type similarity determination can be implemented in a legal expert system, rather than solely a document retrieval system, the system's developers will have to choose how indirect a sharing of common micro-features will amount to two objects being regarded as similar. This is equivalent to choosing at what level of generality the two objects are to be considered. Further, the approach adopted in SCALIR is still dependent on the system's designers choosing what concepts are to be used to model the legal domain. Thus within Tito's framework it is still not possible to say the system implements moral decision making. However, in incorporating a closer approximation to the analogue meaning of words, the method to determine similarity adopted in SCALIR is more subtle than those in other neural net systems or those that exist in symbolic reasoning systems.
It cannot be doubted then that neural nets can mimic the finding of similarity, though on a restricted basis. However, the accuracy of the similarity found will depend greatly on the composition of the matrix chosen by developers to describe the legal concepts.
Open texture.
Two observations about the use of neural nets to resolve open texture can now be made. First, since the similarity found by neural nets is crude compared to that achieved by humans, there is much scope for real world decisions to differ from those reached by neural nets, because unconsidered factors will have been taken into account. [126] Secondly, legal analogical reasoning is not simply the finding of similarity between cases but involves manipulating the analogy found to achieve a desired result. [127] This is something that neural nets of themselves cannot perform. Consequently, by themselves neural nets have a limited ability to perform analogical reasoning. The ability of neural net systems to generalise input patterns and to perform a flexible form of similarity determination, however, makes them strong candidates for use in a hybrid analogical reasoning system.

 

(b) Explanation and justification.

The use of neural nets as legal analogical reasoners faces a further problem. Mital and Johnson state

Similarity cannot be thought of as an agent independent of the objects which are to be found similar; it may be said that it is more in the nature of a relation which the mind perceives after the fact. [128]

Since similarity does not exist independently of our perception of it, creating this perception is of crucial importance. Unfortunately this presents problems for neural nets. Presently neural nets take a series of inputs and oracularly produce an output; it is left to the user of the system to infer why similarity was found.
Creating such a perception involves two things: explaining why the similarity was found and then justifying the finding. Several methods have been proposed to get explanations and justifications from neural nets, four of which are:

(1) extract rules from the neural net; [129]
(2) present to the user those nodes (factors) that had a positive contributory influence along with those that had a negative contributory influence on the decision; [130]
(3) present the training set of the neural net to the user; [131] and
(4) create a hybrid system where the output of the neural net is explained ex post facto by other systems. [132]

The essential purpose of providing explanation and justification is always to convince the human end user of the correctness of the result achieved and in this respect the intended audience and use of the system must always be remembered. [133]
Gallant has given a detailed analysis of how rules can be extracted from neural nets, [134] though as Bench-Capon notes, we cannot be sure of the correctness of any rules derived from a neural net unless we have prior knowledge about the domain itself. [135] However, while rules may provide an explanation of a result it is hard to regard them as a justification. If a domain expert were asked 'How did you reach that conclusion?' a first answer might be 'It just came to me'. Pressed further, the new response might be 'Factors X, Y and Z were present and this points to that result'. A neural net can give a similar explanation by saying 'Factors X, Y and Z were present and this points to the result because they achieved that result in other cases'. The expert (or neural net) might go further and formulate this last response with a rule such as 'Whenever factors X, Y and Z are present, then this result was achieved.' As an explanation this seems satisfactory: it was because of experience that the expert and neural net gave that result. A search for more detailed explanation from a neural net, even if possible, [136] seems unnecessary.
Asking why the result is justified is different. What amounts to sufficient justification for a decision depends on the jurisprudential theory of law that one subscribes to. If one regards as justified a decision based solely on the fact that such a decision was reached in past situations, then 'if ... then ...' rules as discussed above may be accepted as both explanation and justification; they are simply a short-hand way of saying this. However, if one's jurisprudential theory requires a more detailed justification then it remains an open question whether a neural net can justify its results. Detailed justification may be possible using other systems, although a pre-requisite is the adoption of a jurisprudential theory on what amounts to justification.
Proposals (2) and (3) for achieving explanations and justifications from neural nets are slightly different. In both cases it is simply left to the user to infer why the information presented justifies the result achieved. The PROLEXS team state that these approaches have not proved satisfactory. [137] Proposal (4), that of justifying the output of a neural net ex post facto, has not yet been reported as implemented though theoretical work is underway. [138] Thus it can be seen that neural nets have a limited ability to justify their results. Whether this poses a serious problem to the use of neural nets in law remains to be seen.
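
Proposal (2) in the list above, presenting the factors that pushed for and against a decision, can be illustrated with a minimal sketch. It assumes the simplest possible case, a single weighted unit in which each factor's contribution is its weight multiplied by its input; in a multi-layer net the attribution is considerably less direct. The function name, factor names and weights are hypothetical.

```python
def split_contributions(weights, inputs):
    """Separate the factors that supported the decision from those that
    counted against it, assuming contribution = weight * input."""
    supporting, opposing = [], []
    for factor, weight in weights.items():
        contribution = weight * inputs.get(factor, 0)
        if contribution > 0:
            supporting.append((factor, contribution))
        elif contribution < 0:
            opposing.append((factor, contribution))
    supporting.sort(key=lambda item: item[1], reverse=True)
    opposing.sort(key=lambda item: item[1])
    return supporting, opposing


# Hypothetical learned weights and a hypothetical fact situation.
weights = {"dishonest_intent": 5, "property_taken": 3,
           "owner_consented": -6, "night_time": 1}
facts = {"dishonest_intent": 1, "property_taken": 1, "owner_consented": 1}

supporting, opposing = split_contributions(weights, facts)
print("for:    ", supporting)  # [('dishonest_intent', 5), ('property_taken', 3)]
print("against:", opposing)    # [('owner_consented', -6)]
```

The output lists what influenced the result but says nothing about why those influences justify it, which is the shortcoming of this proposal identified in the text.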

 

4 Inducing rules.

The work of Bochereau et al and Bench-Capon has demonstrated that it is possible to extract rules from a trained neural net. Such rule induction though suffers the same problems as symbolic rule induction systems. [139] Essentially, all such rule induction is based on a statistical analysis of the underlying data, and thus if the data is not statistically representative then any rules induced could be spurious.
However, the flexible notion of similarity able to be embodied in a neural net may make neural nets more useful rule-induction systems than are current symbolic systems. As with analogical reasoning, a key step in inductive reasoning is the finding of similarity between cases so as to found a general rule. In traditional systems such similarity is simply based on the presence or absence of factors chosen by the system designer. As discussed above, SCALIR embodies a more flexible notion of analogy than simple attribute matching. For this reason a rule-induction system adopting SCALIR's concepts might be able to create relations and thus rules that would not be found with a symbolic system. It is conceivable that the more flexible approach to analogy embodied in SCALIR would improve rule induction. No work, however, has been undertaken on this point.

 

5 Compromise.

Systems that model conflicts between rules by using compromise were discussed in chapter four. However, the use of neural nets to model contradictory rules is problematic. Philipps states that

The neurons strive for equilibrium, and when the conditions of the equilibrium are translated into the terms of the case, the resulting solution cannot be totally unjust. [140]

The equating of justice with compromise is questionable. Firstly, the two rules that were balanced may violate principles of formal justice, or they might offend against moral principles, in which case the resulting compromise cannot be said to be just. More fundamentally, justice does not necessarily equate with compromise. If justice is understood as meaning 'The result that a court of law would reach.' then equating justice with compromise is unsupportable. Courts do not always achieve a result that is a compromise of the presented claims. [141] The point is not that compromise is never just or that what a court of law would do is just, only that in equating justice with compromise, a jurisprudential statement is being made that requires support. Thus attempting to deal with conflicting rules through the use of compromise is not necessarily a desirable path. It depends on one's theory of justice. [142]
Used in the manner of Philipps, neural nets can only deal with contradiction through compromise. [143] Thagard's ECHO [144] has the potential to overcome this difficulty as it does not model conflict through compromise. However, ECHO has problems of its own, [145] not least the complexity of its representations. Since Thagard has not given detailed discussion of the legal use of ECHO the possibility of using this system to model conflicting legal rules remains to be explored.

 

6 SCALIR learning.

Similar jurisprudential considerations arise from the proposal to make SCALIR learn from its users. [146] As an information retrieval tool this seems reasonable, since any reasoning will be performed by the lawyer using the retrieved documents. However, if SCALIR were to be incorporated as part of a larger legal expert system then such learning may not be justifiable.
Under Rose and Belew's proposal, learning in SCALIR would alter the very representation of the documents in the system. Consequently a legal expert system adopting this approach would potentially alter its representation of the law each time it was used. Such a scenario has elements of the critical legal studies view that law is whatever we make it. [147] Aspects of this view may be true in the case of real lawyers and real judges; however, it is slightly bizarre to extend this to a legal expert system which has no direct effect on actual legal outcomes.

 

7 Normative Reasoning.

Finally, it has been suggested that neural nets cannot model the legal decision making process because they cannot apply norms. [148] This is debatable.
If it is meant that neural nets cannot apply norms because of their normative content, this is incorrect. To the extent that norms can be expressed in terms of cases or rules they can be modelled using a neural net. Any normative content in these cases or rules is irrelevant for this purpose. Indeed the very basis on which neural nets operate can be viewed as an application of the norm that 'like cases should be decided alike.' If it is meant that norms cannot be expressed in terms of cases, but must be represented as rules, then it still cannot be accepted that neural nets cannot model legal decisions. It is possible that localist neural nets [149] can be used to model norms.
If it is meant that neural nets cannot apply norms because they have no normative content for the neural net itself then this is also debatable. This is tied to the question of whether neural nets and computers can think, which though beyond the scope of the paper is still an open question. [150] However, it may be true that the result received from a neural net cannot force a value decision. This touches on moral philosophical questions beyond the scope of this paper.

 

B Methodological concerns in the use of neural nets.

In addition to the general jurisprudential issues associated with neural nets so far discussed, the manner in which neural nets are actually implemented in legal applications has implications for the jurisprudential theory embodied in the system. [151] The two most troubling aspects of the many uses discussed in the previous chapter are the use of hypotheticals to train neural nets and the manner in which contradictory data is dealt with.

1 Hypotheticals.

Neural nets rely on statistical analysis of the underlying data presented during training. Thus to create accurate models they require data that is statistically representative.
In their discussion of the uses of neural nets in law, Mital and Johnson note that much of the law remains unreported and that neither all possible nor anticipated situations have been covered even by unreported cases. [152] This presents significant problems for the creation of neural nets in law. As outlined in the introduction to neural networks, neural nets create generalisations from the information presented to them during their training. The quality of these generalisations (in respect of the degree to which they reflect the actual outcomes of cases) is dependent on the cases used to train the network. A lack of cases will lead to spurious generalisations. [153] The lack of reported cases with which to train neural nets has been reported by a number of researchers. [154] In an attempt to overcome this problem, researchers have resorted to creating hypothetical cases with which to train their neural nets. [155] It must be realised, however, that once a neural net has been trained with any hypotheticals then, unless one subscribes to a critical theory of law, it is no longer a system that reasons solely with the law. It is an amalgam of the law and expert belief. This may or may not be problematic, depending on the purpose the system is designed to achieve.
Training with hypotheticals is said to incorporate the heuristic knowledge of the domain expert. [156] It seems justifiable to argue that predicted case outcomes generated by an expert do incorporate heuristic knowledge about how an expert would reason with a sparse set of cases, but the conclusion remains that any result reached by the system is not based solely on the law.
If it is decided to use hypotheticals, it seems necessary that they at least be generated by a domain expert. In Bench-Capon's neural net, all the hypotheticals were generated by another computer program. [157] At the very least the rules in this program should be generated by a domain expert, but in such a case the question arises why these rules aren't simply incorporated into a symbolic reasoner.
Philipps has argued that his neural net need only be trained with ten training examples, as long as they are 'prototypical'. [158] However, while this may be true in the case of the simple rules that were there modelled, it seems difficult to apply to neural nets trained with cases and designed to resolve open texture.
The lack of training data needs to be addressed if neural nets are to be created that reason with the law.
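The claim that a lack of cases leads to spurious generalisations can be illustrated with a small sketch. It is not taken from any of the systems discussed in this paper: it assumes a single sigmoid unit trained by plain gradient descent, and the two factors (one legally relevant, one merely incidental) and the case sets are hypothetical. With only two cases, in which the incidental factor happens to coincide with the outcome, the unit cannot tell the factors apart and weights them equally. When cases are added in which the incidental factor varies independently of the outcome, its weight falls well below that of the relevant factor.

```python
import math


def train_unit(cases, epochs=2000, rate=0.5):
    """Train one sigmoid unit by batch gradient descent.

    cases is a list of (factors, outcome) pairs, where factors is a
    tuple of 0/1 values and outcome is 0 or 1. Returns (weights, bias).
    """
    n = len(cases[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n
        grad_b = 0.0
        for factors, outcome in cases:
            z = bias + sum(w * x for w, x in zip(weights, factors))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - outcome
            for i, x in enumerate(factors):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - rate * g for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b
    return weights, bias


# Hypothetical factors: (relevant_factor, incidental_factor).
# In the sparse set the incidental factor always coincides with the outcome.
sparse_cases = [((1, 1), 1), ((0, 0), 0)]

# In the fuller set the outcome follows the relevant factor only.
fuller_cases = [((1, 1), 1), ((0, 0), 0), ((1, 0), 1), ((0, 1), 0)]

sparse_weights, _ = train_unit(sparse_cases)
fuller_weights, _ = train_unit(fuller_cases)

print("sparse training set, weights:", sparse_weights)
print("fuller training set, weights:", fuller_weights)
```

The sketch says nothing about where additional cases should come from. If they are hypotheticals, the concerns raised above about reasoning with an amalgam of law and expert belief apply to them.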


2 Contradictory input data.

A further problem facing the use of neural nets in law is the way in which they model conflicting data.
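A minimal sketch may make the two points taken up in this section concrete. It is an illustration under stated assumptions, not a description of any system discussed in this paper: the cases are hypothetical vectors of 0/1 factors, and the network is reduced to a single sigmoid unit. The first function finds only the easiest kind of contradiction, identical factors decided both ways; it does nothing for cases that differ in more than one aspect. The second function trains the unit on one set of facts decided both ways and returns its output, which settles midway between the two outcomes.

```python
import math
from collections import defaultdict


def find_exact_contradictions(cases):
    """Return the factor vectors that appear with more than one outcome."""
    seen = defaultdict(set)
    for factors, outcome in cases:
        seen[tuple(factors)].add(outcome)
    return [factors for factors, outcomes in seen.items() if len(outcomes) > 1]


def compromise_output(factors, steps=1000, rate=0.1):
    """Train one sigmoid unit on the same facts decided both ways (1 and 0)
    and return the unit's final output for those facts."""
    weights = [2.0] * len(factors)  # arbitrary non-zero starting point
    bias = 0.0
    for _ in range(steps):
        z = bias + sum(w * x for w, x in zip(weights, factors))
        predicted = 1.0 / (1.0 + math.exp(-z))
        # Summed error over the two conflicting examples (outcomes 1 and 0).
        grad = (predicted - 1.0) + (predicted - 0.0)
        weights = [w - rate * grad * x for w, x in zip(weights, factors)]
        bias -= rate * grad
    z = bias + sum(w * x for w, x in zip(weights, factors))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical training set: the first two cases share all factors
# but were decided differently.
training_set = [((1, 0, 1), 1), ((1, 0, 1), 0), ((1, 1, 0), 1)]

conflicts = find_exact_contradictions(training_set)
output = compromise_output((1, 0, 1))

print("exactly contradictory fact patterns:", conflicts)
print("unit output for the conflicting facts:", output)
```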
Neural nets model conflicts in data by reaching a compromise between the conflicts. As previously argued, this may or may not be jurisprudentially acceptable.
If modelling conflict through compromise is jurisprudentially unacceptable then in the training of the neural net it is extremely important to ensure that contradictory examples are not included in the training set. This is a huge practical difficulty. It is unclear how to determine whether two cases are conflicting if those cases differ in more than one aspect. Unless techniques can be created to determine where contradictions exist in a training set, doubt must be cast on the practical possibility of using neural nets to reason with the law.
Finally, in contrast to the work of researchers in legal expert systems and legal information retrieval systems, who are concerned with the practical uses of neural nets, several jurisprudes have also proposed using neural nets as a metaphor for the operation of law in society; as a representation of the interaction between the legislature, the courts and citizens. [159] A detailed description of such a neural net theory of law has been given by Silverman, [160] who sees the law as a huge neural net in which

The judges and other legal actors are nodes of the network; the published case reports and statutes, teaching in the law schools, continuing education courses and learning on the job, and the informal and formal oral communications among the members of the legal community are the connections between nodes; the cases and statutes themselves are the patterns presented to and learned by the network. [161]

This is a descriptive theory of law [162] that sits between positivist and critical theories of law. The law is not the application of objective facts but nor is it merely the preferences of individual judges. [163] Instead no single actor or single rule determines the outcome of a case; the outcome emerges from the interaction of the whole system. [164] Similarly, under this theory rules and theories of law are to be regarded only as approximations of the underlying law, much as a neural net constructs a mathematical function to approximate the distinctions present in its input data. [165]
While the practical implementation of such a neural net is far beyond current capabilities, this is not Silverman's aim. According to Silverman

At the most general level, our metaphor of law matters.... new metaphors of law can lead to an increased awareness of alternatives for the legal system. [166]

With a new metaphor the way we think about judges, law, society, and our role therein can radically change.

VI CONCLUSION.

Resurgent interest in neural nets has resulted in various applications in law. While neural nets do not have the potential to solve all problems present in current efforts to computerise legal knowledge, the use of neural nets does offer potentially great benefits in both the creation of legal expert systems and legal information retrieval systems.
Most promising is the ability of neural nets to aid in the determination of similarity between cases. The finding of similarity is a key step in the process of legal reasoning. Any legal expert system that seeks to model legal knowledge has to incorporate a means to determine similarity.
Neural nets offer a model of similarity that is more flexible than those found in existing symbolic reasoning systems and so have huge potential for use in legal expert systems.
Neural nets potentially offer other benefits, such as a means to model conflicts in rules and cases. Their ability to learn information adds further attraction to their use.
However, using neural nets to model conflict and to learn information has jurisprudential implications. The need for statistically significant numbers of cases with which to train neural nets, the jurisprudential implications of using hypotheticals during training, the need to ensure that training data is not contradictory and the currently limited ability of neural nets to justify their responses, all limit the present usability of neural nets in legal expert systems. While techniques have been proposed that potentially overcome both the problem of contradiction and the limited ability to justify conclusions, little work has been conducted into these techniques. It must be ensured that the jurisprudential implications associated with these limitations do not undermine the overall project in which the use of neural nets is playing a part.
While neural nets can offer a new metaphor for law, it is only through future research that creates hybrid neural net/symbolic reasoning systems that we will truly be able to use computers to test the implications of our current jurisprudential theories.



Bibliography.

'Artificial Intelligence a Debate' (1990) 262(1) Scientific American 19.

Kevin Ashley, 'Toward a Computational Theory of Arguing with Precedent' in Proceedings of the Second International Conference on Artificial Intelligence and Law (1989).

Trevor Bench-Capon, 'Neural Networks and Open Texture' Proceedings of the Fourth International Conference on Artificial Intelligence and Law (1993) 292.

Robert Birmingham, 'A Study After Cardozo: De Cicco v Schweizer, Noncooperative Games, and Neural Computing' (1992) 47 University of Miami Law Review 121.

Laurent Bochereau, Daniele Bourcier and Paul Bourgine, 'Extracting Legal Knowledge by Means of a Multilayer Neural Network Application to Municipal Jurisprudence' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 288.

Steven Burton, An Introduction to Law and Legal Reasoning (1985).

Steven Burton, 'Reaffirming Legal Reasoning: The Challenge from the Left' (1986) 36 Journal of Legal Education 358.

Maureen Caudill and Charles Butler, Naturally Intelligent Systems (1990).

Maureen Caudill and Charles Butler, Understanding Neural Networks: Computer Explorations (1992).

Rupert Cross and J Harris, Precedent in English Law (1991).

Margaret Davies, Asking the Law Question (1994).

Daniel Dennett, Consciousness Explained (1991).

Paul Edwards (ed), The Encyclopedia of Philosophy (1967).

Stephen Gallant, Neural network learning and expert systems (1993).

Anne Gardner, An Artificial Intelligence Approach to Legal Reasoning (1987).

Martin Golding, Legal Reasoning (1984).

James Gordley, 'Legal Reasoning: An Introduction' (1984) 72 California Law Review 139.

K Hamilton, 'Prolegomenon to Myth and Fiction in Legal Reasoning, Common Law Adjudication and Critical Legal Studies' (1989) 35 The Wayne Law Review 1449.

H.L.A. Hart, The Concept of Law (1961).

John Hobson and David Slee, 'Indexing the Theft Act 1968 for Case Based Reasoning [CBR] and Artificial Neural Networks [ANNs]' in Proceedings of the Fourth National Conference on Law, Computers and Artificial Intelligence (1994) unnumbered additions.

Alan Hunt, 'The Big Fear: Law Confronts Postmodernism.' (1990) 35 McGill Law Journal 507.

Daniel Hunter, Alan Tyree and John Zeleznikow, 'There is less to this argument than meets the eye.' (1993) 4 Journal of Law and Information Science 46.

Jorgen Karpf, 'Inductive Modelling in Law: Example Based Expert Systems in Administrative Law' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 297.

Duncan Kennedy, 'Freedom and Constraint in Adjudication: A Critical Phenomenology' (1986) 36 Journal of Legal Education 518.

Raymond Kurzweil, The Age of Intelligent Machines (1990).

Kenneth Lambert and Mark Grunewald, 'Legal Theory and Case-Based Reasoners: The Importance of Context and the Process of Focusing.' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 191.

Edward Levi, An Introduction to Legal Reasoning (1948).

Lloyd of Hampstead, Lloyd's Introduction to Jurisprudence (1985).

Neil MacCormick, Legal Reasoning and Legal Theory (1978).

V Mital and L Johnson, Advanced Information Systems for Lawyers (1992).

Robert Moles, 'Logic Programming - An Assessment of its potential for Artificial Intelligence Applications in Law' (1991) 2 Journal of Law and Information Science 137.

Robert Moles and Surendra Dayal, 'There is more to life than logic' (1993) 3 Journal of Law and Information Science 188.

James Murray, 'The Role of Analogy in Legal Reasoning' (1982) 29 University of California Law Review 833.

Lothar Philipps, 'Distribution of Damages in Car Accidents Through the Use of Neural Networks' (1991) 13 Cardozo Law Review 987.

Daniel Rose and Richard Belew, 'A connectionist and symbolic hybrid for improving legal research.' (1991) 35 International Journal of Man-Machine Studies 1.

Daniel Rose and Richard Belew, 'Legal Information Retrieval: A Hybrid Approach' in Proceedings of the Second International Conference on Artificial Intelligence and Law (1989) 138.

David Rumelhart, James McClelland and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition (1986).

M Sergot, F Sadri, R Kowalski, F Kriwaczek, P Hammond and T Cory, 'The British Nationality Act as a Logic Program' (1986) 29 Communications of the ACM 370.

Alexander Silverman, Mind, Machine, and Metaphor: An Essay on Artificial Intelligence and Legal Reasoning (1993).

Joseph Singer, 'The Player and the Cards: Nihilism and Legal Theory' (1985) 94 The Yale Law Journal 1.

John Stick, 'Can Nihilism be Pragmatic ?' (1987) 100 Harvard Law Review 332.

Julius Stone, Legal System and Lawyer's Reasonings (1964).

Cass Sunstein, 'On Analogical Reasoning' (1993) 106 Harvard Law Review 741.

Richard Susskind, 'Expert Systems in Law: A Jurisprudential Approach to Artificial Intelligence and Legal Reasoning' (1986) 49 The Modern Law Review 168.

Paul Thagard, 'Connectionism and Legal Inference' (1991) 13 Cardozo Law Review 1001.

Paul Thagard, 'Explanatory coherence' (1989) 12 Behavioural and Brain Sciences 435.

Celeste Tito, 'Artificial Intelligence: Can Computers Understand Why Two Legal Cases Are Similar ?' (1987) 7 Computer/Law Journal 409.

Alan Tyree, Expert Systems in Law (1989).

G van Opdorp, R Walker, J Schrickx, C Groendijk and P van den Berg, 'Networks at Work: a connectionist approach to non-deductive legal reasoning' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 278.

R Walker, A Oskamp, J Schrickx, G Van Opdorp and P van den Berg, 'PROLEXS: creating law and order in a heterogenous domain' (1991) 35 International Journal of Man-Machine Studies 35.

David Warner, 'A Neural Network Based Law Machine: Initial Steps' (1992) 18 Rutgers Computer and Technology Law Journal 51.

David Warner, 'A Neural Network-based Law Machine: the problem of legitimacy.' (1993) 2(2) Law Computers & Artificial Intelligence 135.

David Warner, 'The Role of Neural Networks in the Law Machine Development' (1990) 16 Rutgers Computer and Technology Law Journal 129.

David Warner, 'Toward a Simple Law Machine' (1989) 29 Jurimetrics 451.

John Zeleznikow and Daniel Hunter, Building Intelligent Legal Information Systems - Representation and Reasoning in Law (1994).

John Zeleznikow and Daniel Hunter, 'Rationales for the Continued Development of Legal Expert Systems' (1992) 3 Journal of Law and Information Science 94.

John Zeleznikow, George Vossos and Daniel Hunter, 'The IKBALS Project: Multi-Modal Reasoning in Legal Knowledge Based Systems' (1993) 2 Artificial Intelligence and Law 169.



Notes

[Note 1] Raymond Kurzweil, The Age of Intelligent Machines (1990) 16-8.

[Note 2] See John Zeleznikow and Daniel Hunter, Building Intelligent Legal Information Systems - Representation and Reasoning in Law (1994) ch 6 for an introduction to symbolic reasoning using rules and logic.

[Note 3] Kurzweil, above n 1.

[Note 4] A 'domain expert' is an expert in the domain in which the expert system is sought to be constructed. A knowledge engineer is someone who works with the domain expert to collect that expert's knowledge and assemble it for use in the legal expert system: Kurzweil, above n 1.

[Note 5] Refer generally to Maureen Caudill and Charles Butler, Naturally Intelligent Systems (1990). For a detailed discussion of neural-net concepts and theory see David Rumelhart, James McClelland and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition (1986). For a 'hands on' introduction the computer package of Maureen Caudill and Charles Butler, Understanding Neural Networks: Computer Explorations (1992) is useful.

[Note 6] Caudill and Butler, 'Naturally intelligent systems', above n 5, ch 1.

[Note 7] The classic example of the problem of linear separability is the XOR problem: ibid 173-4. In the graph below, it is not possible to draw a single straight line that separates all the O's and all the X's, thus they are not linearly separable.
[Image 2: graph of the XOR problem, showing O's and X's that no single straight line can separate]
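The point about linear separability can also be checked with a short sketch. It is not drawn from Caudill and Butler: it assumes a single threshold unit trained with the classic perceptron learning rule, and the function name and epoch limit are illustrative. Such a unit can only ever draw one straight line, so it settles on a correct classification for the linearly separable AND pattern but never completes an error-free pass over the XOR pattern.

```python
def perceptron_separates(examples, epochs=100):
    """Train a single threshold unit with the perceptron rule.

    Returns True if some pass over the examples produces no errors,
    meaning the unit has found one straight line separating the classes.
    """
    w1 = w2 = bias = 0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in examples:
            output = 1 if w1 * x1 + w2 * x2 + bias > 0 else 0
            delta = target - output
            if delta != 0:
                # Move the line towards the misclassified point.
                w1 += delta * x1
                w2 += delta * x2
                bias += delta
                errors += 1
        if errors == 0:
            return True
    return False


AND_PATTERN = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_PATTERN = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_separable = perceptron_separates(AND_PATTERN)
xor_separable = perceptron_separates(XOR_PATTERN)

print("AND separated by a single line:", and_separable)
print("XOR separated by a single line:", xor_separable)
```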

[Note 8] See the discussion of Kolmogorov's theorem: ibid 174-7.

[Note 9] 'Back-propagation' is a technique whereby the error made by a neural net in classifying a pattern can be progressively reduced, so that it reaches an 'acceptable' level: ibid ch 14.

[Note 10] Supervised learning is a procedure for training a neural net in which the neural net is presented with an input pattern and the output that is desired when that pattern is presented to the neural net. The neural net learns to associate the input pattern with the output pattern. The learning is 'supervised' because the creator of the neural net must present the two patterns to the network and also oversee that learning is occurring correctly. An obvious requirement of such training is that for every input pattern there must be a known output pattern; this is impossible in some environments, including some legal applications: ibid.

[Note 11] In this paper, 'lawyer' is used widely to refer to those who are involved in reasoning with and applying the law. Thus it would include judges, solicitors, barristers and legal academics, see Zeleznikow and Hunter, above n 2, ch 2.

[Note 12] Eg Paul Edwards (ed), The Encyclopedia of Philosophy (1967).

[Note 13] Neil MacCormick, Legal Reasoning and Legal Theory (1978) 21-4.

[Note 14] Martin Golding, Legal Reasoning (1984) 43-4. It is always possible that a factual situation will arise which has an outcome different to that previously observed, thus invalidating the rule founded on the earlier situations.

[Note 15] James Murray, 'The Role of Analogy in Legal Reasoning' (1982) 29 University of California Law Review 833, 849.

[Note 16] Golding, above n 14, 44 states that induction is simply another form of analogy. While they are closely related there is a difference between analogy and induction, outlined below.

[Note 17] MacCormick, above n 13, 163.

[Note 18] MacCormick, above n 13, 229.

[Note 19] MacCormick, above n 13; Zeleznikow and Hunter, above n 2, ch 4 pp 9-16. The first of these views is an extreme version of the legal positivism of H.L.A. Hart, The Concept of Law (1961). The second view is an extreme version of the arguments presented by the American legal realists and members of the critical legal studies and post modernist movements eg Margaret Davies, Asking the Law Question (1994); Alan Hunt, 'The Big Fear: Law Confronts Postmodernism.' (1990) 35 McGill Law Journal 507.

[Note 20] MacCormick, above n 13, 197. However, see Joseph Singer, 'The Player and the Cards: Nihilism and Legal Theory' (1985) 94 The Yale Law Journal 1 and the reply John Stick, 'Can Nihilism be Pragmatic ?' (1987) 100 Harvard Law Review 332.

[Note 21] Edward Levi, An Introduction to Legal Reasoning (1948) 1. Steven Burton, An Introduction to Law and Legal Reasoning (1985) 26-39 gives a similar taxonomy. Levi's view has been criticised by Murray, above n 15, 848-50; however for present purposes this criticism is not important.

[Note 22] MacCormick, above n 13, ch. 5 et seq; Burton, above n 21, 103.

[Note 23] MacCormick, above n 13, 103.

[Note 24] MacCormick, above n 13, ch. 5, ch. 7; Duncan Kennedy, 'Freedom and Constraint in Adjudication: A Critical Phenomenology' (1986) 36 Journal of Legal Education 518.

[Note 25] MacCormick, above n 13, ch 2; Burton, above n 21; Julius Stone, Legal System and Lawyer's Reasonings (1964) chs 6,7; Cf Lloyd of Hampstead, Lloyd's Introduction to Jurisprudence (1985) 1139 footnote 95 for a list of authorities who deny deduction plays a role in legal reasoning.

[Note 26] MacCormick, above n 13, 19.

[Note 27] Burton, above n 21, 44-50; Stone, above n 25, 55-8.

[Note 28] Levi, above n 21, 2. Similarly Lloyd notes that it has 'long been accepted that a case only binds as to "like facts". But what are like facts ...' Lloyd, above n 25, 1116. While given as a discussion of 'rules' in the common law, this is equally applicable to statutory rules.

[Note 29] MacCormick, above n 13, 66.

[Note 30] MacCormick, above n 13; Levi, above n 21; Burton, above n 21.

[Note 31] Rupert Cross and J Harris, Precedent in English Law (1991).

[Note 32] MacCormick, above n 13, 219-24; Stone, above n 25, 267-74. Indeed Stone regards the multitude of ratios that exist in a decision as requiring extreme scepticism about the ability of computers ever to reason with cases, ibid 37-8; Cf Cross, above n 31.

[Note 33] This results from the need for reality and coherence in the legal system, MacCormick, above n 13, ch. 7.

[Note 34] MacCormick, above n 13, 216-8.

[Note 35] In truth though, common law rules only serve to hide the cases underlying the supposed rule and to mask the reaching of a decision by analogy. Burton, above n 21, 60; Levi, above n 21, 8-9.

[Note 36] Cross, above n 31, 191-2; Levi states that thinking of case-law reasoning as inductive is erroneous. However, he agrees that case law concepts can be created out of particular instances, since there is movement from the particular to the general: Levi, above n 21, 27.

[Note 37] MacCormick, above n 13, ch. 7.

[Note 38] MacCormick, above n 13; Levi, above n 21; Burton, above n 21; Stone, above n 25; Murray, above n 15; James Gordley, 'Legal Reasoning: An Introduction' (1984) 72 California Law Review 139; Cass Sunstein, 'On Analogical Reasoning' (1993) 106 Harvard Law Review 741.

[Note 39] Eg Stone, above n 25, 283.

[Note 40] Steven Burton, 'Reaffirming Legal Reasoning: The Challenge from the Left' (1986) 36 Journal of Legal Education 358; Cf Singer, above n 20 with Stick, above n 20.

[Note 41] K Hamilton, 'Prolegomenon to Myth and Fiction in Legal Reasoning, Common Law Adjudication and Critical Legal Studies' (1989) 35 The Wayne Law Review 1449.

[Note 42] An 'inference engine' is a part of an expert system that is 'a system for applying the rules [of the system's database] to the knowledge base to make decisions.': Kurzweil, above n 1, 293.

[Note 43] Stephen Gallant, Neural network learning and expert systems (1993) ch 14.

[Note 44] This definition is adapted from that provided in Zeleznikow and Hunter, above n 2, ch 5. The authors note that the actual task that a legal expert system will perform varies markedly according to its intended user.

[Note 45] Eg, Kenneth Lambert and Mark Grunewald, 'Legal Theory and Case-Based Reasoners: The Importance of Context and the Process of Focusing.' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 191.

[Note 46] Case based reasoners are expert systems that try to reason using a corpus of cases rather than explicit rules: Zeleznikow and Hunter, above n 2, ch 8.

[Note 47] John Zeleznikow, George Vossos and Daniel Hunter, 'The IKBALS Project: Multi-Modal Reasoning in Legal Knowledge Based Systems' (1993) 2 Artificial Intelligence and Law 169, 171-2.

[Note 48] For example, if the database contained the two rules 'As between two innocents he who caused the damage should pay' and 'No liability without fault' it is unclear how these rules are both to be applied in a no-fault accident. This conflict must be resolved by reference to other tests.

[Note 49] Davies, above, n 19, ch 7 demonstrates how both Hart's concept of a 'rule of recognition' and Kelsen's concept of a 'grundnorm' would necessarily import 'extra-legal' assumptions into the legal system.

[Note 50] Refer to ch 2 for a discussion of this problem in symbolic reasoners.

[Note 51] David Warner, 'A Neural Network Based Law Machine: Initial Steps' (1992) 18 Rutgers Computer and Technology Law Journal 51, 51-4; David Warner, 'The Role of Neural Networks in the Law Machine Development' (1990) 16 Rutgers Computer and Technology Law Journal 129, 139.

[Note 52] All neural nets operate as some form of pattern classifier. They learn to associate certain general input patterns with certain general output patterns, see ch 2 of this paper.

[Note 53] John Hobson and David Slee, 'Indexing the Theft Act 1968 for Case Based Reasoning [CBR] and Artificial Neural Networks [ANNs]' in Proceedings of the Fourth National Conference on Law, Computers and Artificial Intelligence (1994) unnumbered additions.

[Note 54] Success in this respect must be understood to mean performing a classification, according to the index points chosen by the creators, which is the same as that which the creators would arrive at using those same index points.

[Note 55] Trevor Bench-Capon, 'Neural Networks and Open Texture' Proceedings of the Fourth International Conference on Artificial Intelligence and Law (1993) 292.

[Note 56] G van Opdorp, R Walker, J Schrickx, C Groendijk and P van den Berg, 'Networks at Work: a connectionist approach to non-deductive legal reasoning' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 278.

[Note 57] Bench-Capon, above n 55, 296. While the ability of a neural net to classify patterns even in the presence of noise is notable, as will be discussed in the next chapter, defining input as 'noise' is dependent on a pre-existing theory of the domain. This may be problematic.

[Note 58] R Walker, A Oskamp, J Schrickx, G Van Opdorp and P van den Berg, 'PROLEXS: creating law and order in a heterogenous domain' (1991) 35 International Journal of Man-Machine Studies 35; van Opdorp, above n 56.

[Note 59] Walker, above n 58, 55-6.

[Note 60] Ibid 56.

[Note 61] Ibid 56-7; van Opdorp, above n 56, 280-1.

[Note 62] van Opdorp, above n 56, 281-4.

[Note 63] Ibid 280-1.

[Note 64] Bench-Capon, above n 55, 297.

[Note 65] M Sergot, F Sadri, R Kowalski, F Kriwaczek, P Hammond and T Cory, 'The British Nationality Act as a Logic Program' (1986) 29 Communications of the ACM 370.

[Note 66] Refer to the debate conducted in the following articles: Robert Moles, 'Logic Programming - An Assessment of its potential for Artificial Intelligence Applications in Law' (1991) 2 Journal of Law and Information Science 137; John Zeleznikow and Daniel Hunter, 'Rationales for the Continued Development of Legal Expert Systems' (1992) 3 Journal of Law and Information Science 94; Robert Moles and Surendra Dayal, 'There is more to life than logic' (1993) 3 Journal of Law and Information Science 188; Daniel Hunter, Alan Tyree and John Zeleznikow, 'There is less to this argument than meets the eye.' (1993) 4 Journal of Law and Information Science 46.

[Note 67] It is unclear what use Hobson and Slee intend for their 'index'. If it is truly meant to be used as an index of cases then their treatment of open textured issues is less questionable than if they intend it to be used within a legal expert system.

[Note 68] David Warner, 'The Role of Neural Networks in the Law Machine Development' (1990) 16 Rutgers Computer and Technology Law Journal 129, 135-38. The claim that neural nets can inherently model the process of legal reasoning will be critically discussed in the next chapter.

[Note 69] Lothar Philipps, 'Distribution of Damages in Car Accidents Through the Use of Neural Networks' (1991) 13 Cardozo Law Review 987.

[Note 70] Philipps, above n 69, 989-91, 999.

[Note 71] Paul Thagard, 'Explanatory coherence' (1989) 12 Behavioural and Brain Sciences 435; Paul Thagard, 'Connectionism and Legal Inference' (1991) 13 Cardozo Law Review 1001. On creating inference networks generally using neural nets, see Gallant, above n 43, chs 14, 15.

[Note 72] Though the system could logically be used to resolve conflicts between single rules.

[Note 73] ECHO requires competing hypotheses to be given to the system, along with evidence, details of how each hypothesis explains the evidence and details of how the propositions are contradicted by the evidence. ECHO then determines which hypothesis best coheres with the evidence: Thagard, 'Explanatory coherence', above n 71.

[Note 74] Interestingly Stick, above n 20, 363 notes that many contemporary theories of law are based upon coherence theories of truth. Thagard's ECHO could be useful in investigating such theories.

[Note 75] Richard Susskind, 'Expert Systems in Law: A Jurisprudential Approach to Artificial Intelligence and Legal Reasoning' (1986) 49 The Modern Law Review 168, 184.

[Note 76] Robert Birmingham, 'A Study After Cardozo: De Cicco v Schweizer, Noncooperative Games, and Neural Computing' (1992) 47 University of Miami Law Review 121 discusses this. A re-rationalisation of a case occurs when a later case rationalises the decision in an earlier case on grounds that are different from those stated in the judgement of the earlier case.

[Note 77] This will be discussed further in ch 5 of this paper.

[Note 78] For example Bench-Capon, above n 55; van Opdorp et al, above n 56; Gallant, above n 43. This will be critically discussed in the following chapter.

[Note 79] Laurent Bochereau, Daniele Bourcier and Paul Bourgine, 'Extracting Legal Knowledge by Means of a Multilayer Neural Network Application to Municipal Jurisprudence' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 288. Similar work has been undertaken by Bench-Capon, above n 55, 296. For a detailed discussion of the extraction of rules from neural nets generally, see Gallant, above n 43, ch 17.

[Note 80] Zeleznikow and Hunter, above n 2, ch 11, p 20.

[Note 81] This will be discussed in ch 5.

[Note 82] For a comprehensive discussion of legal information retrieval systems and methods and their associated limitations see Zeleznikow and Hunter, above n 2, ch 3.

[Note 83] For example, if a document is indexed on the term 'solicitor' then searching for 'lawyer' will not retrieve it, even though it may be relevant. While this problem can be reduced using a search on all synonyms this does not guarantee all relevant documents will be retrieved: ibid.

[Note 84] Daniel Rose and Richard Belew, 'Legal Information Retrieval: A Hybrid Approach' in Proceedings of the Second International Conference on Artificial Intelligence and Law (1989) 138; Daniel Rose and Richard Belew, 'A connectionist and symbolic hybrid for improving legal research.' (1991) 35 International Journal of Man-Machine Studies 1.

[Note 85] For a discussion of semantic networks and knowledge representation generally see Zeleznikow and Hunter, above n 2, ch 7.

[Note 86] Rose and Belew, 'Legal Information Retrieval', above n 84, 141.

[Note 87] Rose and Belew, 'A connectionist and symbolic hybrid', above n 84, 20-2.

[Note 88] Ibid 22.

[Note 89] Ibid 29-30.

[Note 90] Susskind, above n 75.

[Note 91] David Warner, 'Toward a Simple Law Machine' (1989) 29 Jurimetrics 451; David Warner, 'A Neural Network Based Law Machine: Initial Steps' (1992) 18 Rutgers Computer and Technology Law Journal 51; Warner, 'The role of neural networks', above n 68.

[Note 92] Warner, 'Toward a Simple Law Machine', above n 91, Part 5; Warner, 'The role of neural networks', above n 68, 131-2.

[Note 93] Warner, 'A neural network based law machine', above n 91, 53.

[Note 94] Ibid 53-4.

[Note 95] Warner, 'The role of neural networks', above n 68, 132.

[Note 96] See Zeleznikow and Hunter, above n 2, ch 6; Alan Tyree, Expert Systems in Law (1989) for a discussion of representing laws using logic and tree diagrams.

[Note 97] Walker et al, above n 58.

[Note 98] Bench-Capon, above n 55.

[Note 99] Warner, 'The role of neural networks', above n 68, 138-9.

[Note 100] Ibid 139.

[Note 101] Anne Gardner, An Artificial Intelligence Approach to Legal Reasoning (1987).

[Note 102] Warner, 'The role of neural networks', above n 68, 139.

[Note 103] Heuristics are 'rules of thumb' used by experts in a field: Gardner, above n 101, 41-3.

[Note 104] Warner, 'The role of neural networks', above n 68, 139.

[Note 105] Ibid 139.

[Note 106] Kennedy, above n 24; Jorgen Karpf, 'Inductive Modelling in Law: Example Based Expert Systems in Administrative Law' in Proceedings of the Third International Conference on Artificial Intelligence and Law (1991) 297, 300 notes that only certain combinations of factors are legal combinations.

[Note 107] Eg Kennedy, above n 24.

[Note 108] Birmingham, above n 76, 132-4.

[Note 109] MacCormick, above n 13.

[Note 110] Susskind, above n 75, 183.

[Note 111] MacCormick, above n 13; Levi, above n 21; Burton, above n 21; Gordley, above n 38; Sunstein, above n 38; Murray, above n 15; Kevin Ashley, 'Toward a Computational Theory of Arguing with Precedent' in Proceedings of the Second International Conference on Artificial Intelligence and Law (1989). However, the finding of similarity between cases is a prerequisite to any subsequent manipulations of the analogy.

[Note 112] V Mital and L Johnson, Advanced Information Systems for Lawyers (1992), 257; Celeste Tito, 'Artificial Intelligence: Can Computers Understand Why Two Legal Cases Are Similar?' (1987) 7 Computer/Law Journal 409, 411-2 agrees with Mital and Johnson. After noting the importance of similarity, Burton, above n 21, 39 simply says the process is a mystery.

[Note 113] Mital and Johnson, above n 112, 257. The authors note that this has been criticised because it would mean that people could not say why, or in what respects, two cases are similar, which seems unrealistic.

[Note 114] Ibid.

[Note 115] Ibid.

[Note 116] Tito, above n 112. The need for a moral theory is echoed by MacCormick, above n 13, chs 5, 7 who argues that the finding of an analogy is dependent on our view of what purpose the legal system is trying to achieve. Sunstein, above n 38, 773-81 at footnote 116, also notes the need for a general theory with which to evaluate similarities and thinks this should cause scepticism about efforts to program computers to engage in analogical reasoning.

[Note 117] Tito, above n 112.

[Note 118] For a concise discussion of this issue see the debate between John Searle and Paul Churchland and Patricia Churchland in 'Artificial Intelligence: A Debate' (1990) 262(1) Scientific American 19. The discussion of neural nets is particularly interesting.

[Note 119] Karpf, above n 106, 299.

[Note 120] 'Accurately' is here taken to mean: model to a degree of richness that is sufficient to satisfy lawyers.

[Note 121] See ch 2 for discussion of this theorem.

[Note 122] Bench-Capon, above n 55.

[Note 123] Ibid 296-7.

[Note 124] Kennedy, above n 24.

[Note 125] Refer to ch 4 for a discussion of the operation of SCALIR.

[Note 126] Walker et al, above n 58, 63; Birmingham, above n 76, 132-4.

[Note 127] Levi, above n 21; MacCormick, above n 13; Burton, above n 21; Gordley, above n 38; Sunstein, above n 38; Murray, above n 15; Ashley, above n 111.

[Note 128] Mital and Johnson, above n 112, 256.

[Note 129] Bochereau et al, above n 79. A modification of this approach is presented in David Warner, 'A Neural Network-based Law Machine: the problem of legitimacy.' (1993) 2(2) Law Computers & Artificial Intelligence 135, 141. The difference between the two approaches essentially concerns when the rules are to be generated. Bochereau sees neural nets as being used specifically to extract rules, while Warner argues that rules will be extracted at run time in response to questioning. It is possible that the latter approach would provide more flexibility.

[Note 130] van Opdorp et al, above n 56, 285; Warner, 'A neural network based law machine: the problem of legitimacy', above n 129, 139. The difference between the two approaches is that in the latter the percentages of the input variables that can be attributed to the output variable are also determined.

[Note 131] van Opdorp et al, above n 56, 285.

[Note 132] Ibid 285.

[Note 133] Lambert and Grunewald, above n 45; Zeleznikow and Hunter, above n 2, ch 2.

[Note 134] Gallant, above n 43, ch 17.

[Note 135] Bench-Capon, above n 55, 296.

[Note 136] The search for complete explanations may be pointless as there are aspects of human action that humans cannot themselves explain: Daniel Dennett, Consciousness Explained (1991) 84-95 gives a good discussion of this.

[Note 137] van Opdorp et al, above n 56, 285.

[Note 138] See the discussion on the SPLIT-UP system in Zeleznikow and Hunter, above n 2, ch 11 p 20 and the associated references.

[Note 139] Zeleznikow and Hunter, above n 2, ch 11 p 20.

[Note 140] Philipps, above n 69, 999.

[Note 141] For example in R v Watson; Ex parte Armstrong (1976) CLR 249, the High Court was faced with conflicting lines of authority as to what amounted to judicial bias. In the result, one line of authority was totally rejected.

[Note 142] Mital and Johnson, above n 112, 259. However, Mital and Johnson indicate that a conflict between the interpretation of factors within a case may be solved by the use of compromise.

[Note 143] The practical implications of this will be discussed later.

[Note 144] Discussed in ch 4.

[Note 145] ECHO requires competing hypotheses to be entered into the neural net along with their supporting facts. How each hypothesis is supported by evidence, and which evidence contradicts which hypothesis, then has to be entered by the system designer. Such decisions can be highly subjective and it is unclear what implications these requirements could have for the use of ECHO in the legal domain.

[Note 146] Rose and Belew, 'A connectionist and symbolic hybrid', above n 84, 20-2.

[Note 147] Kennedy, above n 24.

[Note 148] Mital and Johnson, above n 112, 253.

[Note 149] Neural nets can be classified not only according to their learning rules and architectures, but also as distributed or localist networks. In distributed networks, of which adaptive filter networks are one type, only the nodes at the input and output levels represent real-world concepts; hidden layers are simply there to aid in the mapping performed by the network. In localist models, each node of the network represents a real-world concept.

[Note 150] Above n 118.

[Note 151] Apart from the discussion of the use of hypotheticals and compromise which follows, it should be noted that the study of neural nets is still a comparatively embryonic field. A multitude of network designs exist from which application developers can choose. Designs generally have numerous parameters, the values of which can be chosen more or less ad hoc. Both the type of network design used and the values of the parameters chosen affect the behaviour of the neural net. The number of hidden layers and the number of nodes in each of those layers, the learning rule and learning rate, the amount of noise present when the network is trained and even the order in which training examples are presented can all affect the neural net's behaviour and the classifications it produces. Thus these factors can affect the way the neural net reasons with the information presented to it. However, the legal literature discussing neural nets generally does not discuss such issues; an exception is Rose and Belew, above n 84.

[Note 152] Mital and Johnson, above n 112, 265.

[Note 153] van Opdorp et al, above n 56, 282-4.

[Note 154] Eg Karpf, above n 106; Hobson and Slee, above n 53; Walker et al, above n 58.

[Note 155] Hobson and Slee, above n 53, 12; Walker et al, above n 58, 57. In this context Bench-Capon, above n 55, also used hypotheticals to train his neural net; however, this was dictated by the fact that his whole legal domain was hypothetical. This use of hypotheticals has been criticised: Karpf, above n 106, 299.

[Note 156] van Opdorp et al, above n 56.

[Note 157] Bench-Capon, above n 55. Other researchers such as Hobson and Slee, above n 53, do not mention how their hypotheticals are generated.

[Note 158] Philipps, above n 69, 996.

[Note 159] Warner claims that this view is inherent in the works of Anthony D'Amato: Warner, 'The role of neural networks', above n 68, 138.

[Note 160] Alexander Silverman, Mind, Machine, and Metaphor: An Essay on Artificial Intelligence and Legal Reasoning (1993).

[Note 161] Ibid 80.

[Note 162] Ibid 81.

[Note 163] Ibid 80.

[Note 164] Ibid. An expanded version of this theory sees law not only as an interconnected network, but also as an interconnected network that 'resonates' with society; the law both influences and is influenced by the society in which it is constructed: ibid 84-6.

[Note 165] Ibid 81-3.

[Note 166] Ibid 94-5.
