Lists and why they are useful

By M. V. Wilkes*

Computers have long been in general use for solving numerical problems, and pioneering interest has now switched to their use for non-numerical work, that is, for manipulating symbols. Examples are compiling, studies in artificial intelligence, layout problems, etc. List-processing was a breakthrough in symbol manipulation since it provided a flexible way of organizing the computer memory. This paper explains in an expository manner what goes on in the computer memory when list-processing operations are performed, and takes as an example the formal differentiation of an algebraic expression written in Polish notation.

* Director, University Mathematical Laboratory, Corn Exchange St., Cambridge.

Ten years ago much effort was being devoted to the use of computers for solving numerical problems. This subject is now well advanced, and pioneering interest has switched to the use of computers for solving non-numerical problems, that is, for manipulating symbols.

Programming advances often follow on the introduction of some technical device that facilitates the organization of, or the cross-referencing of, the computer memory. For example, the first programmers wrote all their addresses in absolute form and were forced to re-number when insertions were made in the program; the introduction of floating or symbolic addresses, which were replaced automatically by absolute addresses when the program was assembled, was the first step towards freeing the programmer from limitations imposed by the consecutive nature of a computer memory. The introduction of lists and list structures was a further step in the same direction.

Technical devices such as those mentioned are often so successful that the modern programmer is unaware that any difficulty ever existed. This is particularly so as he is screened by developments in programming languages from what is going on in the computer. In the more highly developed list-processing languages, for example, the programmer is insulated from the details of the list-manipulating operations that are brought about for him by the system; LISP (McCarthy, 1962) in particular is "mathematician oriented" and appeals by its formal qualities to those who have been trained in the rigour of abstract thought. The subject of this paper, however, is "Lists and why they are useful," and not "List-processing languages and how to use them." It will, therefore, be concerned with the details of what is going on in the memory of the computer when list-processing operations are performed. For this purpose it is necessary to use a simple language in which the operations can be followed in detail; the one that will be used here may be described as an assembly language for lists.

Lists

The word list is used in a very technical sense, and refers to a sequence of memory registers strung together in a particular way. Each register is divided into two equal parts and, except for the end register of a list, the second part contains the address of the next register in the sequence; the second part of the end register contains an indicating symbol here taken to be 0. Note that the registers do not have to be consecutive in the memory; herein lies the merit of the system since an extra register can easily be inserted at any point in a list, without disturbing the others, simply by acquiring a register so far unused, placing its address in the second part of the register after which it is to be inserted, and putting in the second half of the new register the address of the following register. So that it shall be easy to find a free register, all available registers are linked together on what is known as the free list; this is a list like any other list, and registers are taken from it when required and returned to it when no longer required.

I shall follow McCarthy and refer to the first half of a register as the CAR of that register and to the second half as the CDR (pronounced "cudder"). In a simple list such as has just been described, each CDR contains an address pointing to the next register in the list, while the CAR is free and may be used to hold a symbol. For example, Fig. 1 shows a list representation of the mathematical expression A + B. An alternative representation is given in Fig. 2, which shows the same expression in Polish notation with the sign coming before the operands.

[Fig. 1. List representation of A + B]
[Fig. 2. A + B in Polish notation]

The CAR of a register may alternatively contain an address and thus point to a sub-list. We then have a list structure. An example is given in Fig. 3 which is derived from Fig. 2 by replacing B by a list representing P.Q. The list structure thus stands for A + P.Q. Expressions of any complexity may be handled in this way, and changes may easily be made to their component parts. Further examples are shown in Figs. 4 and 5. Note that in the latter there is a common sub-list.

[Fig. 3. A + P.Q]
[Fig. 5. An expression containing (C + D)^2]

A list must start somewhere, and the address of its first register is stored in one of a sequence of fixed memory registers known as base registers. These registers may be given names, and these names may be used also to refer to the lists which start from them. Here the names used will be capital letters with or without suffixes. Base registers can also point to sub-lists forming parts of list structures; in this way sub-lists may be given names. Examples of list structures showing how they are connected to base registers are given in Fig. 6. Normally, it is not necessary to show the base registers in diagrams of lists, and the example shown would normally appear as in Fig. 7.

[Fig. 6. List structures showing base registers. CARs from which no arrows start contain arbitrary symbols]
[Fig. 7. As Fig. 6 with base registers not shown]

Atoms

Basic symbols, such as A, B, C, ..., or A1, A2, A3, ..., are referred to as atoms. In Fig. 7, CAR A is atomic, whereas CAR C is non-atomic.

The following relationships hold between the lists in Fig. 6 or Fig. 7:

    C = B,  D = CAR B,  E = CDR D.

Statements of this type may also be regarded as commands in a program. For example, the statement C = B implies that C is to become an alternative name for the list B; its programming significance is to copy the content of the base register corresponding to B into the base register corresponding to C. Similarly, E = CDR D associates the name E with the list whose first member is the second member of the list D. If we start with the configuration of Fig. 7 and execute the following statements

    R = CDR C
    R = CDR R
    CAR R = A

we arrive at the configuration shown in Fig. 8, in which the list A is now a sub-list of B.

[Fig. 8. The result of performing the following operations on the list structures shown in Fig. 7: R = CDR C, R = CDR R, CAR R = A]

It will be noted that symbols drawn from the same alphabet have been used both as the symbols being manipulated and as the names of lists. One meets similar situations in other formal systems, and it is necessary to distinguish between cases in which symbols are the names of other entities and cases in which they stand for nothing but themselves. For example, in ordinary mathematics, we might have $z = y^2 + 1$ where $y = x$, and then

    $\frac{dz}{dx} = 2x, \qquad \frac{\partial z}{\partial x} = 0.$

In the case of the ordinary derivative, y is regarded as standing for x whereas, in the case of the partial derivative, y is regarded as standing for itself. It might be thought that in list processing the difficulty could be avoided by having two sets of symbols, and using one for the names of lists and the other for the symbols being manipulated. It would soon be found, however, that this would not work, and that situations would arise when a symbol used as the name of a list had to be referred to in its own right.
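The register layout described in the Lists and Atoms sections can be made concrete with a small sketch. The following Python model is an illustration only, not the language used in this paper: the class `Memory`, its method names, and the base-register names `S`, `T`, `U` are all assumptions introduced here. A CAR holding a string is treated as an atom and a CAR holding an integer as the address of a sub-list.

```python
NIL = 0  # end-of-list mark, as in the text; address 0 is never handed out


class Memory:
    """Toy memory of two-part registers (CAR, CDR) with a free list."""

    def __init__(self, size):
        # Index 0 is unused so that 0 can serve as the end mark.
        self.car = [None] * (size + 1)
        self.cdr = [NIL] * (size + 1)
        for addr in range(1, size):
            self.cdr[addr] = addr + 1  # chain every register onto the free list
        self.free = 1 if size > 0 else NIL

    def acquire(self):
        """Take one register from the front of the free list."""
        if self.free == NIL:
            raise MemoryError("free list exhausted")
        addr = self.free
        self.free = self.cdr[addr]
        self.cdr[addr] = NIL
        return addr

    def release(self, addr):
        """Return a register to the front of the free list."""
        self.car[addr] = None
        self.cdr[addr] = self.free
        self.free = addr

    def build(self, symbols):
        """Build a simple list of symbols; return the address of its first register."""
        head = NIL
        for sym in reversed(symbols):
            addr = self.acquire()
            self.car[addr] = sym
            self.cdr[addr] = head
            head = addr
        return head

    def insert_after(self, addr, symbol):
        """Insert a new register after `addr` without moving any other register."""
        new = self.acquire()
        self.car[new] = symbol
        self.cdr[new] = self.cdr[addr]
        self.cdr[addr] = new
        return new

    def to_nested(self, head):
        """Read a list structure back as nested Python lists (atoms are strings)."""
        out = []
        while head != NIL:
            item = self.car[head]
            out.append(self.to_nested(item) if isinstance(item, int) else item)
            head = self.cdr[head]
        return out

    def free_count(self):
        """Number of registers currently on the free list."""
        n, addr = 0, self.free
        while addr != NIL:
            n, addr = n + 1, self.cdr[addr]
        return n


mem = Memory(10)
base = {}                                  # base registers: name -> first address
base["S"] = mem.build(["+", "A", "B"])     # A + B in Polish notation
second_operand = mem.cdr[mem.cdr[base["S"]]]
mem.car[second_operand] = mem.build([".", "P", "Q"])  # CAR now points to a sub-list
base["T"] = base["S"]                      # T = S: copy one base register into another
base["U"] = mem.cdr[base["S"]]             # U = CDR S: name the rest of the list
print(mem.to_nested(base["T"]))            # ['+', 'A', ['.', 'P', 'Q']]
print(mem.to_nested(base["U"]))            # ['A', ['.', 'P', 'Q']]
```

Using integers for addresses and strings for atoms is a convenience of the sketch; a real machine would need some other way to tell the two apart.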
In what follows, symbols will be enclosed in quotation marks when they are to be regarded as standing for themselves.

Uses of list processing

List processing may be applied to any problem in which symbol manipulation is involved. An obvious example is the writing of a compiler. Here a stream of symbols representing the program in source language must be accepted and processed so as to yield another stream of symbols representing the same program in target language. An example of how this is done may be found in Wilkes (1964), in which a simple list-processing language, very similar to the one used here, is implemented in terms of a compiler written in itself. The same language has been applied to a layout problem encountered in connection with the design of deposited wiring for high-speed computing circuits (Wiseman, 1964).

Studies in artificial intelligence call for powerful list-processing techniques, and involve such operations as the placing of items on lists, the searching of lists for items according to specified keys, and so on. Languages in which such operations can be easily specified are indicated. The pioneer language in this regard was IPL developed by Newell, Simon and Shaw (see Newell, 1961).

Recursion

In symbol manipulation much use is made of recursive subroutines. A recursive subroutine is one that can use itself, and for this to be possible two things are necessary. One is that, each time the subroutine is called in, it should make use of a different part of the memory for working space; the second is that the subroutine should communicate with the rest of the program through one or more stacks or their equivalent. A stack, or a pushdown list as it is sometimes called, works on the last in, first out, principle. One can place a new item on the top of the stack, or one can take off the item that happens at that moment to be at the top of the stack.

It is possible to use a single stack to control a recursive subroutine. When the subroutine is called in, a link is first placed on the stack; this gives the point in the program to which control is to be returned when the subroutine has done its work. Next, the arguments are placed on the stack in order, and control is sent to the subroutine. During operation of the subroutine, the arguments and link are removed from the stack, and the results are placed there. Control is then returned to the calling-in program which takes the results from the stack. The state of the stack is now the same as it was before the subroutine was called in. Such a subroutine may call itself in during the course of the calculation, and each time it does so extra arguments and links get piled on the stack; if the subroutine has been correctly constructed, however, everything works out properly, the stack always being found to contain the right information at the right moment. Although one stack is sufficient, it is frequently convenient to use several; in particular, a separate stack is often used to contain the links.

A stack can be conveniently and efficiently constructed by making use of a sequence of consecutive registers in the memory. If desired, however, a list of the list-processing kind can be pressed into service. For this purpose two operations are required. One, PUSH DOWN A, takes a register from the beginning of the free list and inserts it at the beginning of list A, and the other, POP UP A, performs the reverse operation. Thus, by writing PUSH DOWN A, CAR A = "X", one can put the symbol "X" on the stack formed by the list A. Similarly, by writing CAR B = CAR A, POP UP A, one can remove the symbol from the stack and place it in CAR B. The operation PUSH DOWN A can be expressed in terms of elementary list-processing operations in the manner shown below. F is the name of the free list.

    D = F
    F = CDR F
    CDR D = A
    A = D

Example: formal differentiation

An example frequently taken to illustrate symbol manipulation and the use of recursive subroutines is that of formal differentiation. A program is given below for differentiating an algebraic expression stored in the computer as a Polish list, and it is hoped that this example will help to make clear what has been said above. The operations allowed in the algebraic expression are addition, subtraction, and multiplication; division, exponentiation, and trigonometrical functions could easily have been included at the expense of making the program longer. Differentiation is with respect to X. The result is left in a rather rough form. For example, the result of differentiating X.Y is given as 1.Y + X.0. It is not difficult to write an editing routine which will remove the unnecessary ones and zeros, and the interested reader is referred to Wilkes (1964) where such a routine is given.

The program is based on the following rules. In the first place, there is the rule for differentiating an expression consisting of a single symbol, namely, that the symbol should be replaced by "1" if it is "X", and by "0" otherwise. Secondly, there are rules for differentiating sums, differences, and products. These latter rules are always written recursively, that is, the symbol for differentiation appears on both sides of the equation. This recursive form is reflected in the differentiating routine which, at appropriate points, contains instructions calling in itself.

To call in a routine it is necessary to put on list M, which is used as a stack, first, a list which will receive the result and, second, the Polish list to be differentiated. The beginning of the routine is given the label 10 and, after placing the appropriate quantities on the stack, the subroutine may be called in by the instruction TO 10 AND BACK which causes an appropriate link to be stored on a private stack inaccessible to the programmer. RETURN causes control to be sent to the place indicated by the link standing at the top of this stack, and the stack to be popped up. Conditional statements are written in a form reminiscent of ALGOL with, however, round brackets instead of the words begin and end to enclose compound statements. F is the free list. It is hoped that with these few words of explanation the routine will prove comprehensible.

    10  A = CAR M, POP UP M
        D = CAR M, POP UP M
        IF CAR A = ATOM THEN
            (IF CAR A = "X" THEN CAR D = "1"
             OTHERWISE CAR D = "0", RETURN)
        OTHERWISE
            (B1 = CAR A, B2 = CDR B1, B3 = CDR B2
             CAR D = F
             L1 = CAR D, L2 = CDR L1, L3 = CDR L2
             F = CDR L3, CDR L3 = "0")
        IF CAR B1 = "+" or "-" THEN
            (CAR L1 = CAR B1
             PUSH DOWN M, CAR M = L2
             PUSH DOWN M, CAR M = B2
             PUSH DOWN M, CAR M = L3
             PUSH DOWN M, CAR M = B3
             TO 10 AND BACK, TO 10 AND BACK
             RETURN)
        OTHERWISE
            (CAR L2 = F
             R1 = CAR L2, R2 = CDR R1, R3 = CDR R2
             F = CDR R3, CDR R3 = "0"
             CAR L3 = F
             S1 = CAR L3, S2 = CDR S1, S3 = CDR S2
             F = CDR S3, CDR S3 = "0"
             CAR L1 = "+", CAR R1 = ".", CAR S1 = "."
             CAR R2 = CAR B2, CAR S3 = CAR B3
             PUSH DOWN M, CAR M = S2
             PUSH DOWN M, CAR M = B2
             PUSH DOWN M, CAR M = R3
             PUSH DOWN M, CAR M = B3
             TO 10 AND BACK, TO 10 AND BACK
             RETURN)

Acknowledgement

This paper is based on an expository lecture given at the 1964 Meeting of the Association for Computing Machinery, held in Philadelphia. The paper was originally published in the Proceedings of that meeting, and I am grateful to the Association for permission to reprint.

References

MCCARTHY, J. et al. (1962). LISP 1.5 Programmer's Manual, M.I.T. Press.
NEWELL, ALLEN (Ed.) (1961). Information Processing Language-V Manual, The RAND Corp.
WILKES, M. V. (1964). "An experiment with a self-compiling compiler for a simple list-processing language," Annual Review of Automatic Programming, Vol. 4, Pergamon Press, Oxford.
WISEMAN, N. E. (1964). "Application of list-processing methods to the design of interconnections for a fast logic system," The Computer Journal, Vol. 6, p. 321.

Correspondence

The Editor,
The Computer Journal.

Sir,

May I make two comments on the article "The ISO character code" by H. McG. Ross, in the October Journal?

Firstly, I am sorry to see that Backspace will be used "to prepare composite symbols and for underlining as in ALGOL." Although this use of backspace is common, it is far inferior to the use of a non-escaping underline, as anyone who has experienced both systems will testify. The labour of punching ALGOL programs is greatly reduced by the provision of a non-escaping key with vertical bar and underline, as on the MC-ALGOL Flexowriter used at the Mathematical Centre, Amsterdam.

Secondly, I would propose the perhaps heretical view that standardization of character codes is not as important as is sometimes made out. Code translation is an easy process for a computer to carry out, and the computing system should be designed to deal with any code, translating by software or a software/hardware combination to or from a common internal code. An essential corollary to such a system is the acceptance of the printed record as authoritative: a particular representation on tape or cards may be imposed by the limitations of peripheral equipment or transmission systems, but this is not relevant to the user, and he should not have to bother about it. The user should be able to say "print a letter A" or "tell me what the next character of the input stream is," leaving it to the system to sort out the details of the physical representation. Such facilities can be provided by a sophisticated software system, and they can be provided now, with existing peripheral machines, not at some undetermined date in the future when we have all been standardized.

Yours etc.,
D. W. BARRON.
The University Mathematical Laboratory, Corn Exchange Street, Cambridge.
9 November 1964.
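Returning to the formal-differentiation routine of the paper on lists: its calling convention can be sketched in Python. This is a hedged illustration under stated assumptions and not a transcription of the routine labelled 10. Polish expressions are modelled as nested Python lists, the stack M as an ordinary Python list, and a result receiver as a hypothetical (container, index) pair standing in for the CAR of a register. Python's own call stack plays the part of the private link stack used by TO 10 AND BACK and RETURN. The product rule is arranged so that X.Y gives 1.Y + X.0, the rough form quoted in the text.

```python
M = []  # the stack M: arguments are passed on it, last in, first out


def push_down(stack, item):
    """Place a new item on top of the stack."""
    stack.append(item)


def pop_up(stack):
    """Remove and return the item on top of the stack."""
    return stack.pop()


def routine_10():
    """Differentiate with respect to "X"; arguments are taken from M."""
    a = pop_up(M)                # the Polish expression to differentiate
    target, i = pop_up(M)        # the receiver: the result goes into target[i]
    if isinstance(a, str):       # a single symbol (an atom)
        target[i] = "1" if a == "X" else "0"
        return
    op, u, v = a
    if op in ("+", "-"):
        out = [op, None, None]   # (u op v)' = u' op v'
        target[i] = out
        push_down(M, (out, 1))
        push_down(M, u)
        push_down(M, (out, 2))
        push_down(M, v)
    elif op == ".":
        left = [".", None, v]    # u'.v
        right = [".", u, None]   # u.v'
        target[i] = ["+", left, right]
        push_down(M, (left, 1))
        push_down(M, u)
        push_down(M, (right, 2))
        push_down(M, v)
    else:
        raise ValueError("unsupported operator: " + repr(op))
    routine_10()                 # first call pops v and its receiver
    routine_10()                 # second call pops u and its receiver


def differentiate(expr):
    """Call routine_10 in the manner described: receiver first, then the expression."""
    result = [None]
    push_down(M, (result, 0))
    push_down(M, expr)
    routine_10()
    return result[0]


print(differentiate([".", "X", "Y"]))  # ['+', ['.', '1', 'Y'], ['.', 'X', '0']]
```

After each call the stack M is left as it was found, which is the property that allows the routine to call itself safely. Operands are shared between the input and the result in this sketch, not copied.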