HELP-A question answering system*

by ROGER ROBERTS
University of California
Berkeley, California
* This work was partially supported by Contract No. SD-185,
Office of the Secretary of Defense, Advanced Research Projects
Agency, Washington, D.C. 20325.
INTRODUCTION
HELP, a question answering system, enables a user sitting
at a console of a time-shared computer to obtain
information based on questions typed in. This information
might concern the operating system itself, the format of
commands to the user interface executive program, the use
of a selected subsystem, or an area totally separate from
the computer. The content of the data base in HELP is
completely arbitrary, and is determined by the creator of
each individual HELP system. Questions are presented to
HELP in standard English and elicit one or more separate
responses, depending upon the nature of the question. If
HELP cannot generate an appropriate response, a standard
"I don't know" message is output. A second system, called
QAS, was developed to enable a user to conveniently
generate a HELP program. This paper will discuss the
structure of both programs. All of the work discussed in
this paper was performed on a modified SDS 930 computer,
developed at Project Genie at the University of
California, Berkeley.
BASIC PHILOSOPHY
One of the major considerations in the design of HELP was
to produce a system with a fast response time for the
majority of the questions it encountered. In other words,
a system was desired which would require no more than one
second between the time it was called and the time it was
ready to accept a question, and, for 75 percent of all
questions, would require no more than one second between
the time a question was terminated and the time the
printing of the answer began. It was felt that this
constraint was necessary since HELP was to be an aid to
users sitting at terminals, in the process of using the
system. Users would ask questions of HELP, in the course
of their work at a terminal, in the same manner as they
would consult a reference manual. Therefore, if HELP was
to be useful, it had to be easier and quicker to use than
a manual.
A further design consideration was that many different
HELP systems would be constructed, each with its own data
base and each by a different person. This implied that a
"shell" would be designed, which contained all of the
logic necessary for HELP, but without a data base. A
working HELP system would then consist of this "shell" and
an appropriate data base. Since many different people
would be constructing HELP systems, the procedures for
building a data base should be uncomplicated.
The above considerations led to the following conclusions.
The analysis of the questions was to be kept as simple as
possible. Complex syntactic analysis was ruled out since
activation and response times had to be low and core space
had to be limited. In addition, the relationship between
the questions and the responses had to be straightforward,
so as to facilitate the construction of a data base.
That a system could be designed with these constraints was
supported by work done prior to this paper. In this
preliminary version of HELP,¹ it was observed that the
meaning of most questions, of the type we would encounter,
is independent of word order. This observation allowed for
a design which only reacted to particular words in a
question, called KEY WORDS, and ignored both the word
order and the remaining words. Using this mode of analysis
produced a question answering system which both conformed
to the restrictions stated above and correctly answered
the questions put to it.
STRUCTURE
The primitive objects used by HELP are called
KEY WORDS. These are words which have been
From the collection of the Computer History Museum (www.computerhistory.org)
Fall Joint Computer Conference, 1970
defined previously, and only members of this set of
KEY WORDS will be considered in the formation of
the answer. Certain sets of these KEY WORDS are
singled out as defined KEY WORD LISTS, and these
KEY WORD LISTS are used by HELP in determining
what answer to give. The basic idea is to extract the
KEY WORDS from the question, and from this set to
determine what KEY WORD LISTS are present.
For example, assume that the words "file", "open", and
"input" have been defined as KEY WORDS and, in
addition, no other words have this property. Then the
question "What is a file?" would present to HELP the
set of KEY WORDS "file", the question "How can I
open a file?" the set "open, file", and the question
"What is used to open a file for input?" the set "open,
file, input". These sets of KEY WORDS are the only
pieces of information which are extracted from the
questions and, in fact, are unordered sets. "Which
instruction is used to open an input file?" would
generate the same set as above, namely "open, file,
input". Notice also that all words which are not KEY WORDS
are ignored, so the question "Input file open?" would
have been just as meaningful to HELP.
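The extraction step in this example can be illustrated with a minimal Python sketch. The function name `extract_key_words` and the stripping of trailing punctuation are assumptions made for illustration; only the three KEY WORDS come from the example above.

```python
from __future__ import annotations

# KEY WORDS from the example in the text; no other words have this property.
KEY_WORDS = {"file", "open", "input"}

def extract_key_words(question: str) -> frozenset[str]:
    """Return the unordered set of KEY WORDS found in a question."""
    # Assumption: surrounding punctuation is simply stripped from each word.
    words = (w.strip("?.,!").lower() for w in question.split())
    return frozenset(w for w in words if w in KEY_WORDS)

if __name__ == "__main__":
    a = extract_key_words("What is used to open a file for input?")
    b = extract_key_words("Which instruction is used to open an input file?")
    c = extract_key_words("Input file open?")
    print(sorted(a))
    print(a == b == c)  # word order and non-KEY WORDS make no difference
```

Because the result is a set, the three differently worded questions are indistinguishable to the later stages, which is exactly the property the text relies on.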
Now that we have these KEY WORDS, what do we
do with them? As mentioned above, some sets of KEY
WORDS are defined to be KEY WORD LISTS, and
these lists are used to determine what information
should be given in response to a question. When creating a data base for HELP (to be described below), a
KEY WORD LIST is defined by specifying the
KEY WORDS which comprise the list and the response
to be generated when this list is recognized in a question.
This link between a KEY WORD LIST and a response
is the major mechanism which HELP uses to answer a
question. To return to the above example, assume we
have now defined the KEY WORD LIST "file" to
have the response "A file is a collection of data .... "
Also, assume we have linked the KEY WORD LIST
"open file" to the response "The instructions to be used
to open a file ... " and the KEY WORD LIST "open
file input" to the response "To open a file for input
use .... " Now, with these definitions, we would want
the question "What is a file?" to elicit the first answer,
the question "How do I open a file?" to elicit the second
answer, the question "How do I open a file for input?"
to elicit the third. For HELP to do this, another
mechanism is required, one which can decide which
KEY WORD LIST to extract from the set of KEY
WORDS in the question. Without this additional
mechanism, we would encounter a problem. For even
though the word "file" is present in all three of the
above questions and this word is a KEY WORD
LIST itself, we obviously do not want the description
of a file to be generated in response to the second two
questions. These two questions are specific enough to
preclude that answer.
HELP decides which KEY WORD LIST to use by
the following mechanism. The set of KEY WORDS in
the question is searched to find the longest KEY
WORD LIST, and the message associated with this
KEY WORD LIST is used as the response. This
operation allows HELP to give the answers described
above. Assuming that the KEY WORD LIST of zero
length is always defined, and is linked to an "I don't
know" message, the only time we cannot find a longest
KEY WORD LIST in the set of KEY WORDS is
when we have two or more KEY WORD LISTS of
maximal length. In this case, we generate the responses
associated with all of the lists with this property. To
continue our example, assume the KEY WORD LIST
"close file" is also defined, and linked to the response
"To close a file use .... " Now let us see what happens
with the question "How does one open and close a
file?" The set of KEY WORDS taken from the question
is "open, close, file". We first see if there is a defined
KEY WORD LIST of length 3 (the order of the original
set). In this case, there is not. We will then find a
KEY WORD LIST of length 2, say, "open file", and
its response will be generated. We then find the other
KEY WORD LIST of length 2: "close file", and its
response is also generated. Now, since no other KEY
WORD LISTS of length 2 exist, and we have found
at least one of this length, we stop searching and consider the answering phase complete. Notice that if the
question was "How do I open or close?" HELP would
have output "I don't know", since out of the set "open,
close" the only defined KEY WORD LIST is the
default one of length zero.
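The selection rule just described (take the longest defined KEY WORD LIST, and every list tied at that length) can be sketched in Python. This is only an illustration under stated assumptions: the dictionary `KEY_WORD_LISTS` and the function `answer` are hypothetical names, and the brute-force enumeration of subsets stands in for whatever search the program actually performs.

```python
from itertools import combinations

# Hypothetical data base mirroring the running example in the text.
KEY_WORD_LISTS = {
    frozenset(): "I don't know",
    frozenset({"file"}): "A file is a collection of data ...",
    frozenset({"open", "file"}): "The instructions to be used to open a file ...",
    frozenset({"open", "file", "input"}): "To open a file for input use ...",
    frozenset({"close", "file"}): "To close a file use ...",
}

def answer(key_words):
    """Return the responses of all defined lists of maximal length."""
    words = sorted(set(key_words))
    # Try the largest subsets first and stop at the first size with a hit.
    for size in range(len(words), -1, -1):
        hits = [KEY_WORD_LISTS[frozenset(c)]
                for c in combinations(words, size)
                if frozenset(c) in KEY_WORD_LISTS]
        if hits:
            return hits
    return []  # not reached while the zero-length list is defined

if __name__ == "__main__":
    for response in answer({"open", "close", "file"}):
        print(response)
    print(answer({"open", "close"}))
```

For the set "open, close, file" the sketch returns both length-2 responses, and for "open, close" it falls through to the zero-length list and its "I don't know" message.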
Even though the above mechanisms are uncomplicated and make no use of word order, they allow
HELP to answer questions with great accuracy, and
little redundancy. The assumption that a longer list
of KEY WORDS in a question (i.e., more modifiers),
implies that a more specific answer is required seems to
be quite adequate in determining which answer is
desired. For an example of how a user of HELP can go
from the general question to the more specific, see
Figure 1.
Also shown in this figure is a facility in HELP which
will be described in greater detail below. It is the idea of
a "text subroutine", and it both aids the writer of a data
base and reduces the size of the HELP program. With
this facility, answers to less specific questions can be
built up out of answers to more specific ones. This is
accomplished by having a response "call" another
body of text, in much the same way as standard computer languages do. This means that a body of text can exist just once, but can be used by many different answers.
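A "text subroutine" of this kind can be sketched as a recursive expansion of named bodies of text. The bracket notation and the message texts follow Figure 1; the names `MESSAGES` and `expand`, and the guard against a body calling itself, are assumptions added for this illustration.

```python
import re

# Named bodies of text; [NAME] marks a "call" on another body (as in Figure 1).
MESSAGES = {
    "M2": "USE BRS 15 TO OPEN A FILE FOR INPUT.",
    "M3": "USE BRS 16 TO OPEN A FILE FOR OUTPUT.",
    "M4": "[M2] [M3]",
}

CALL = re.compile(r"\[(\w+)\]")

def expand(name, active=()):
    """Return the text of a message with all calls replaced by their text."""
    if name in active:
        # Assumption: recursive calls are treated as an error in this sketch.
        raise ValueError(f"recursive call on {name}")
    body = MESSAGES[name]
    return CALL.sub(lambda m: expand(m.group(1), active + (name,)), body)

if __name__ == "__main__":
    print(expand("M4"))
```

The general answer M4 stores only two calls, so the text of M2 and M3 exists once no matter how many answers use it.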
The details of the mechanisms by which HELP
attempts to answer a question are described below.
GENERATING ANSWERS
We first read in the question and partition it into
words. A word, in this case, is defined to be a sequence of
non-blank characters. We then look up each word in a
hash table. If the word is found in this table (i.e., it is a
KEY WORD), we place its index in the table into
a temporary buffer. If the word is not found, we perform
some simple spelling transformations (e.g., saves→save,
going→go, file.→file, etc.), and check the hash
table again. If the word is still not found, it is completely
disregarded.
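The lookup with spelling transformations can be sketched as follows. Elsewhere the paper states that the hash code is a 24-bit base 26 representation of the word and that only hash codes, not the words, are kept in the table. Everything else here is an assumption for illustration: the letter values 1 to 26, the modulo folding of long words, the exact suffix rules, and the names `hash_code`, `variants`, `build_table`, `lookup` and `reduce_question`.

```python
HASH_BITS = 24  # the hash code is 24 bits long

def hash_code(word):
    """Base-26 value of a word, folded into 24 bits (folding is assumed)."""
    if not (word.isascii() and word.isalpha()):
        return None
    code = 0
    for ch in word.lower():
        code = code * 26 + (ord(ch) - ord("a") + 1)  # letters count 1..26
    return code % (1 << HASH_BITS)

def variants(word):
    """Yield respellings in the spirit of saves->save, going->go, file.->file."""
    stripped = word.rstrip(".,;:!?")
    yield stripped
    if stripped.endswith("ing"):
        yield stripped[:-3]
    if stripped.endswith("s"):
        yield stripped[:-1]

def build_table(key_words):
    """Map hash code -> KEY WORD index; the words themselves are not kept."""
    return {hash_code(w): i for i, w in enumerate(key_words)}

def lookup(word, table):
    """Return the table index of a word, or None if it is not a KEY WORD."""
    for candidate in (word, *variants(word)):
        code = hash_code(candidate)
        if code is not None and code in table:
            return table[code]
    return None  # still not found: the word is disregarded

def reduce_question(question, table):
    """Return the indices of the KEY WORDS in a question, in question order."""
    found = (lookup(w, table) for w in question.lower().split())
    return [i for i in found if i is not None]

if __name__ == "__main__":
    table = build_table(["save", "go", "file"])
    print(reduce_question("Going saves file.", table))
```

With letter values 1 to 26, every word of five letters or fewer gets a distinct code below 2^24, which matches the uniqueness property claimed for short words.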
After the entire question has been reduced, the
resulting set of numbers is sorted by value. If any of
these numbers are duplicated in the list, all of the
repetitions are removed, so that we get a strictly
increasing ordered set of numbers. We now present
this set to a data structure called the ANSWER LISTS,
which contains all of the defined KEY WORD LISTS and
pointers to their associated responses.
The ANSWER LISTS structure is a binary tree, with each
node consisting of three fields: a LEFT LINK, a RIGHT
LINK, and a VALUE field. The VALUE field can contain
either an index into the KEY WORD hash table, or a flag
indicating that this node is an "end of list" node. The
RIGHT LINK of a node is either null, or points to another
node whose VALUE field is greater than its own. The LEFT
LINK of a node points to either a node whose VALUE field
is greater than its own, a node whose VALUE field has the
"end" flag set, or a response (in the text storage table)
(see Figure 2).

Definitions
KEY WORD LIST      RESPONSE
FILE               M1: A FILE IS A COLLECTION OF DATA.
OPEN INPUT FILE    M2: USE BRS 15 TO OPEN A FILE FOR INPUT.
OPEN OUTPUT FILE   M3: USE BRS 16 TO OPEN A FILE FOR OUTPUT.
OPEN FILE          M4: [M2] [M3] (WHERE [M] MEANS A "CALL" ON
                       MESSAGE M).
CLOSE FILE         M5: USE BRS 17 TO CLOSE A FILE.
BRS 15             M6: BRS 15 OPEN FILE FOR INPUT. [M8]
BRS 16             M7: BRS 16 OPEN FILE FOR OUTPUT. [M8]
DUAL FILE NUMBER   M9: A DUAL FILE NUMBER HAS THE COMMAND INPUT
                       FILE IN THE LOWER 12 BITS AND THE COMMAND
                       OUTPUT FILE IN THE TOP 12 BITS.
                   M8: A=CONTROL WORD, X=DUAL FILE NUMBER.
Question Phase
?WHAT IS A FILE?
A FILE IS A COLLECTION OF DATA.
?HOW CAN I OPEN A FILE?
USE BRS 15 TO OPEN A FILE FOR INPUT.
USE BRS 16 TO OPEN A FILE FOR OUTPUT.
?BRS 15?
BRS 15 OPEN FILE FOR INPUT.
A=CONTROL WORD, X=DUAL FILE NUMBER.
?TELL ME ABOUT A DUAL FILE NUMBER?
A DUAL FILE NUMBER HAS THE COMMAND INPUT FILE IN THE LOWER 12 BITS AND
THE COMMAND OUTPUT FILE IN THE TOP 12 BITS.
Figure 1-Definitions and output

[Tree diagram: nodes with L-LINK, R-LINK and VALUE fields, reached
from a ROOT NODE. ==> : POINTER TO RESPONSE. X: "END" FLAG SET.]
n1 < n2 < n3 < n4
n1 < n5 < n6 < n8
n3 < n7
DEFINED KEY WORD LISTS = {n1,n5}, {n1,n6}, {n1,n6,n8},
{n2}, {n3,n7}, {n4}
Figure 2-Answer lists

A KEY WORD LIST and the pointer to its associated response
exist in this structure as follows. If there are n members
of the KEY WORD LIST, there are n + 1 nodes in the tree
which describe it: one for each of the KEY WORDS, and one
for the "end of list" node. Each of these nodes exists on
a different level of the tree, where level has the
following recursive definition. Level 1 is defined to be
the set of those nodes which can be reached from the root
node by following RIGHT links. For m > 0, level m + 1 is
the set of nodes which (a) are pointed to by LEFT links of
level m nodes, and (b) can be reached by following RIGHT
links from the nodes in (a). In other words, the first
node in every KEY WORD LIST is in level 1, the second node
is in level 2, etc. Now, for the KEY WORD LIST of n
members, the first member is described by some node in
level 1. The LEFT link of this node points to a node in
level 2, and the second member of the KEY WORD LIST is in
this set of level 2 nodes (i.e., start from this node and
follow RIGHT links until the desired node is found). The
LEFT link of this node just found will point to a node in
level 3, etc. After descending n levels in this manner,
the last node of the KEY WORD LIST will be encountered.
The LEFT link of this node will point to a node with the
"end" flag set, and the LEFT link of this node will point
to the response associated with the KEY WORD LIST.

[Diagram: the operand address of an INSTRUCTION is split into a
CHOICE OF MAPPING REGISTER and an ADDRESS IN SUB-BLOCK; the
MAPPING REGISTERS select a sub-block of the DICTIONARY.]
Figure 3-Dictionary addressing

Since the nodes throughout the tree are ordered by value,
the algorithm for deciding if a list of KEY WORDS is a
defined KEY WORD LIST is quite simple, and requires very
little time to compute. A failure exit is caused if and
only if either a null pointer is encountered when
traversing RIGHT links, or an "end" node is not
encountered when the entry list is exhausted.
If the KEY WORD LIST constructed from the question is, in
fact, a pre-defined list, we will find it in the above
structure and will reach an "end of list" node. This node
contains an identification of the appropriate response to
be generated. In this case, the question has been answered
successfully, so we go back to listen for another
question. However, let us assume that the KEY WORD LIST we
generated from the question does not exist in the ANSWER
LISTS structure. We then search for pre-defined lists
among the subsets of the original list, beginning with the
maximal proper ones. If at any point during this search we
find a pre-defined list, we output the response associated
with it and with all such lists of the same order, and
terminate the answering phase as above. In the event that
no subset of the given list exists in the ANSWER LISTS, a
standard "I don't know" response is generated, and we
return again to listen for the next question.
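The ANSWER LISTS tree and the search through it can be sketched in Python. This is an illustration under several assumptions rather than the original implementation: the class `Node`, the functions `insert`, `find` and `answer`, the use of -1 for the "end" flag, and a dummy root whose LEFT link heads level 1 are all hypothetical. The sample lists are the ones given with Figure 2, with integers standing in for n1 through n8.

```python
from itertools import combinations

END = -1  # stands in for the "end" flag; sorts before every KEY WORD index

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None   # next level down, or the response for an END node
        self.right = None  # next node on the same level (larger value)

def insert(root, key_list, response):
    """Add a strictly increasing KEY WORD LIST and its response to the tree."""
    parent = root
    for value in list(key_list) + [END]:
        prev, node = None, parent.left
        while node is not None and node.value < value:
            prev, node = node, node.right
        if node is None or node.value != value:
            new = Node(value)
            new.right = node
            if prev is None:
                parent.left = new
            else:
                prev.right = new
            node = new
        parent = node
    parent.left = response  # the END node's LEFT link points to the response

def find(root, key_list):
    """Return the response for a defined KEY WORD LIST, else None."""
    parent = root
    for value in list(key_list) + [END]:
        node = parent.left
        while node is not None and node.value < value:
            node = node.right          # follow RIGHT links within the level
        if node is None or node.value != value:
            return None                # failure exit
        parent = node                  # descend one level via the LEFT link
    return parent.left

def answer(root, indices, unknown="I DON'T KNOW"):
    """Search the full list, then its subsets, largest order first."""
    keys = sorted(set(indices))        # strictly increasing, no repetitions
    for size in range(len(keys), 0, -1):
        hits = []
        for subset in combinations(keys, size):
            response = find(root, subset)
            if response is not None:
                hits.append(response)
        if hits:
            return hits
    return [unknown]

if __name__ == "__main__":
    root = Node(None)
    for keys in [(1, 5), (1, 6), (1, 6, 8), (2,), (3, 7), (4,)]:
        insert(root, keys, "response for " + str(keys))
    print(answer(root, [8, 6, 1, 6]))
    print(answer(root, [1, 5, 6]))
    print(answer(root, [9]))
```

Because every level is kept in increasing order, `find` can give up as soon as it passes the value it is looking for, which is why a single lookup takes very little time.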
TEXT STORAGE
Since the size of the HELP program was a consideration, we elected not to store the text of the
responses literally. Instead, we utilize a dictionary and
have the response which is stored in HELP be a sequence of calls on that dictionary. Our dictionary can
contain a maximum of 2048 (text) words, while the
address field of the dictionary reference instruction
only allows for specifying one of 256 words. Therefore
to allow any word in the dictionary to be referenced, we
separate the instruction operand address into two
fields. One of these fields specifies one of four "mapping
registers," while the other field designates one of 64
words in a dictionary sub-block (see Figure 3). The
four mapping registers are given initial values at
the start of the text output phase, and instructions
exist to change their contents when necessary. Experience with several HELP programs has indicated
that the map change instructions account for only 3
percent of the total number of instructions in the text,
so this device significantly reduces the size of the transformed text.
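The address arithmetic behind the mapping registers can be sketched as follows. The field widths come from the text and Figure 4 (a choice among four registers, a 6-bit address within a 64-word sub-block, and a register holding the high order 5 bits of the 11-bit dictionary address); the function name `dictionary_address` and the packing of the two fields into one integer are assumptions for illustration.

```python
SUB_BLOCK_BITS = 6   # 64 words per dictionary sub-block
MAP_COUNT = 4        # four mapping registers

def dictionary_address(maps, operand):
    """Expand a 2-bit map choice plus 6-bit offset into an 11-bit address."""
    choice = (operand >> SUB_BLOCK_BITS) & (MAP_COUNT - 1)
    offset = operand & ((1 << SUB_BLOCK_BITS) - 1)
    # Each register holds the high order 5 bits, i.e. a sub-block number 0..31.
    return (maps[choice] << SUB_BLOCK_BITS) | offset

if __name__ == "__main__":
    maps = [0, 1, 2, 31]                      # hypothetical register contents
    print(dictionary_address(maps, 0b11000101))
    print(MAP_COUNT * 64, 32 * 64)            # words reachable at once, in total
```

At any moment the four registers expose 4 x 64 = 256 of the 32 x 64 = 2048 dictionary words; a map change instruction is needed only when a word outside those sub-blocks is referenced.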
In addition, we introduced the subroutine facility
to allow a body of text to be associated with two or
more responses, with virtually no increase in storage
space. Consequently, the internal structure of the text
which is output in response to a question is not a string
of characters, but a sequence of instructions in a simple
language (see Figure 4). As can be seen from this
figure, these instructions can be divided into several
classes (i.e., use word n from the dictionary, output a
single character, end the message, call another body of
text as a subroutine, etc.). We have found that transforming the text in this manner also considerably reduces the space needed to store the responses. The
SDS 930, for which this program was written, has a
user address space of 16K 24 bit words. The entire
HELP program resides in this space and can contain
the equivalent of from 30 to 40 printed pages of text,
enough to entirely describe the user interface of our time
sharing system.
ADDITIONAL STRUCTURE
Since natural languages contain synonyms, a method
of defining equivalences between words was built into
HELP. This facility allows us to equate not only one
word with another, but also sets of words with each
other. When we say equate, we mean that HELP will
respond with the same answer no matter which of the
words of a synonym equivalence class is used. For example, we might cause "run" and "execute" to be
synonyms, so that the two questions "how do I run a
program?" and "how do I execute my program?"
will elicit the same response. As another example,
suppose we want the key word "help" to cause an
explanation message to be generated. In addition, we
want any one of the words from the set: "exit, finished,
goodbye, stop" to cause HELP to return to the TSS
executive. As above, the KEY WORD LIST consisting of "help" would point to the desired message,
and the words "finished", "goodbye", and "stop"
would be equated to "exit". "exit" would, in turn,
indicate to HELP to stop its execution (see below).
TAG            MEANING
0ZZ XXX XXX    Dictionary entry preceded by a blank.
               ZZ = map to be used. Map contains high order 5
               bits of dictionary address, XXXXXX are low
               order 6 bits.
100 AXX XXX    Alpha character or multiple blanks.
               A=1 ==> precede by blank.
               X>5: X+33b=character.
               X<6: X=# of blanks.
101 AXX XXX    ASCII character or control.
               A=1 ==> precede by blank.
               0<=X<=15: X=character.
               16<=X<=22: X+10=character.
               23<=X<=27: X+36=character.
               X=28: Text subroutine return.
               X=29: End partial message.
               X=30: Suppress blank before next entry.
               X=31: CR, LF.
110 AXX XXX    Digit or common 2&3 letter words.
               A=1 ==> precede by blank.
               0<=X<=15: MOPTBL+X=WORD address.
               16<=X<=25: X=digit.
               26<=X<=31: MOPTBL+X-10=WORD address.
111 PQR XXX    Control.
               X=1: Text transfer. New address=(next entry
               (9 bits)).PQR
               X=2: Text subroutine call.
               X=4: Change map QR to field of next entry.
               X=6: Subroutine call on undefined text.
Figure 4-Format of 9-bit instructions for text storage
A problem has now been created. If a user asks the very
natural question "How do I stop HELP?" he will receive the answer about how to use HELP in addition to
terminating the program. Presumably this is not what
he wanted. The solution to this problem is to define
the KEY WORD LIST "exit help" to be a pseudo-synonym of the KEY WORD LIST "exit." Now, when
the same question is asked, "stop" will be reduced to
"exit," "exit help" will be reduced to "exit," "exit"
will terminate HELP, and no message will be generated.
In the same way, many multi-word KEY WORD
LISTS can be equated to one another, with the resulting
desired reduction.
The synonym facility is implemented by using the
same ANSWER LISTS structure described above.
In the case of a synonym, the terminal node of a
KEY WORD LIST path in the tree indicates that the
list in question is equated to a certain node in the tree,
and does not point to a message.
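The two kinds of reduction (word to word, and KEY WORD LIST to KEY WORD LIST) can be sketched in Python. This sketch uses plain dictionaries rather than terminal nodes of the ANSWER LISTS tree, and the names `WORD_SYNONYMS`, `LIST_SYNONYMS` and `reduce_key_words` are hypothetical; the equivalences themselves are the ones used as examples in this section.

```python
# Word-level equivalences from the examples in the text.
WORD_SYNONYMS = {
    "finished": "exit",
    "goodbye": "exit",
    "stop": "exit",
    "execute": "run",
}

# List-level "pseudo-synonym": the list "exit help" is equated to "exit".
LIST_SYNONYMS = {
    frozenset({"exit", "help"}): frozenset({"exit"}),
}

def reduce_key_words(words):
    """Replace synonyms by their representative, then reduce equated lists."""
    reduced = frozenset(WORD_SYNONYMS.get(w, w) for w in words)
    seen = set()
    # Follow list-level equivalences; `seen` guards against circular definitions.
    while reduced in LIST_SYNONYMS and reduced not in seen:
        seen.add(reduced)
        reduced = LIST_SYNONYMS[reduced]
    return reduced

if __name__ == "__main__":
    # KEY WORDS of "How do I stop HELP?"
    print(sorted(reduce_key_words({"stop", "help"})))
    # KEY WORDS of "how do I execute my program?"
    print(sorted(reduce_key_words({"execute", "program"})))
```

After reduction the question "How do I stop HELP?" presents only the list "exit", so the explanation message attached to "help" is not generated.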
There exists another piece of machinery in HELP
which has been quite useful. Not all of the responses which
can be pointed to by KEY WORD LISTS have
text associated with them. Some number of them (10
in particular) indicate to HELP to perform some
"special action." One special action is to terminate the
program. So, as above, saying "goodbye" to HELP
will cause the user to exit from HELP. Another special
action is to commence execution of another HELP
program, and have the ensuing questions directed to it,
rather than to the original program. This facility is
helpful for two reasons. First, the total amount of information about the entire system is too large for one
HELP data base. Even if the capacity of the data
base were expanded, problems of ambiguity would
arise owing to the context dependency of the questions.
Second, since different areas of the system are maintained by different people, it seemed advisable to have
the HELP programs also maintained separately, so
that one area could be modified without global repercussions. With this mechanism, any HELP program
can call any other HELP program. A user therefore has
immediate access to all information about our system,
with very few contextual problems arising.
As mentioned above, a question can elicit more than
one response if it contains more than one KEY WORD
LIST of the same length. If a user asks a question of
this type, pressing a break key (rubout, in our case)
during the output of any of the several messages will
stop just that message. Output will then continue with
the next message in the sequence. This feature has
proved to be quite convenient, especially in the case of
long, multi-part answers where the user only wants to
see one part of each.
COMPACTING DATA
As we have discussed earlier, an important criterion in
the design of HELP was to keep its size small. Some of the
methods used have already been presented (encoding the
text into a sequence of interpreted instructions and
mapping the dictionary references). There are also two
more which will now be shown. The convention we made
concerning the dictionary was that it would contain words
of only two or more alphabetic characters. Now, in the SDS
940, all of the alphabetics have an internal
representation of 32 or greater. Therefore, by subtracting
32 from the last letter of each word, we compacted the
words as densely as possible and were still able to know
the locations of the word boundaries. However, this scheme
by itself would be quite inefficient, since half of the
dictionary, on the average, would have to be counted
through to find a word. Accordingly, we have a DICTIONARY
ADDRESS TABLE of 64 entries, each of which points to the
start of a 32 (text) word block. To locate a word in the
dictionary, the high order 6 bits of the dictionary
address are used to select one of the entries in the
DICTIONARY ADDRESS TABLE. Starting from this location in
the dictionary, the nth word we encounter is the word we
want, where n is the low order 5 bits of the dictionary
address. In this manner, the dictionary is as compact as
possible, and the time to find a word is not astronomical.
The second method of reducing storage in HELP concerns the
KEY WORD hash table. The algorithm which we use to
construct a hash code from a word in the question
guarantees that all words of fewer than 6 letters will
transform uniquely (we transform the word into base 26
representation). Also, since the hash code is 24 bits
long, while the hash table has room for only 2^10 entries,
the probability that two arbitrary words will have the
same hash code is quite small. We therefore do not check
to make sure that a word whose hash code we find in the
table is, in fact, the word we want. We make the
assumption that if we find a word's hash code, we have
found the word. This obviously reduces storage, since we
do not have to retain the words themselves.
This last mechanism might seem strange, but we felt that
since this was only a question answering facility and not,
say, an assembler, we could exist with a small amount of
inaccuracy. Our experience has shown that errors due to
recognizing the wrong word almost never occur, and that
when they do, they only cause extra answers to be
generated (due to recognizing an undefined word as a KEY
WORD).

[Diagram: QAS, taking input from a TEXT FILE or the CONSOLE,
produces HELP; HELP converses with the CONSOLE and writes the
UNANSWERED QUESTIONS FILE.]
Figure 5-QAS and HELP

CREATING A HELP PROGRAM
We now describe the machinery by which a user may
create a question answering (HELP) program. Each
such program contains the code for execution and the
data base, unique to it, which indicates KEY WORDS,
text to be output, and the relationships between KEY
WORD LISTS and that text. The means of defining
these objects (i.e., creating a data base) is another
program, denoted QAS, for Question Answering
System. In QAS, a user can define KEY WORDS and
KEY WORD LISTS, can define responses to be typed
out, can associate these responses with KEY WORD
LISTS, and can create his particular HELP program.
He can also discover what objects have already been
defined, what synonym relationships exist, the size
of the various internal tables, etc. In other words, QAS
is designed to allow a person to interactively construct a
HELP data base, defining and redefining objects as he
sees fit (see Figure 5). A brief description of some of the
operations which can be performed in QAS is given
below.
1. Define a KEY WORD LIST and the named
response which is associated with it.
2. Define a named body of text.
3. Define a KEY WORD LIST equivalence class.
4. Define a KEY WORD LIST which will take one
of the "special actions" described above.
5. Edit a named response.
6. Ask for the HELP program associated with
QAS, to ask questions about QAS.
7. Investigate the size of various tables.
8. Redefine a KEY WORD LIST.
9. Given a word, determine which KEY WORD
LISTS it is a member of.
10. Given a word, determine what synonym relationships exist between it and any other words.
11. Given a list of words, determine which of its
subsets are KEY WORD LISTS.
12. Given the name of a message, determine which
KEY WORD LISTS point to it.
13. Take the input from a file, instead of from the
console.
14. Write the dictionary on a file.
15. Create the HELP program.
As indicated above, each body of text given to QAS must
have a name attached. The purpose of this name is to allow
for the subroutine facility described previously. The
inclusion in a body of text of the name of another body of
text will cause the second body of text to be inserted
during message output. This second body of text can, in
turn, "call" another, etc. An example of the input given
to QAS can be seen in Figure 6.
From Figure 6 we see that the structure of the input
presented to QAS is in the form of "commands", followed by
the arguments for the command. For example, the command
ANSWERS has as its arguments a KEY WORD LIST, a text name,
and a body of text. This command will, after receiving the
arguments, define the KEY WORD LIST, compile the body of
text into its internal form, associate the name given with
the body of text, and cause the newly-defined KEY WORD
LIST to point to this text. Another command, KILL KW LIST,
will erase the pointer from the given list to a response
or a synonym. This allows us to redefine a list if the
original definition was faulty.
Figure 6 also shows the power of the subroutine facility.
We can define the answers to specific questions by giving
the text to be output. We can then build up the responses
to the more general questions by utilizing calls on the
more specific text.

SYNONYMS.
INPUT READ
OUTPUT WRITE
NUMBER NO.
NUMBER NO
ANSWERS.
FILE
[FILE]
A FILE IS A COLLECTION OF DATA.
OPEN INPUT FILE
[OIF]
BRS 15 IS USED TO OPEN A FILE FOR INPUT.
OPEN OUTPUT FILE
[OOF]
BRS 16 IS USED TO OPEN A FILE FOR OUTPUT.
OPEN FILE
[OF]
[OIF] [OOF]
INPUT NUMBER
[INNUM]
BRS 36 IS USED TO INPUT A NUMBER.
Figure 6-Format of input to QAS

QAS has proved to be a very powerful tool for the creation
of a HELP program. Using it, a person can construct a
preliminary HELP for, say, our text editor in a few hours.
Then, after other users have tried it and asked questions
which the writer did not anticipate, QAS can be used again
to modify the data base. To facilitate this procedure, the
writer can tell QAS, during the creation of HELP, to have
HELP write on a file all the questions it cannot answer.
The writer can then investigate this file occasionally to
discover what he has overlooked.
It is also possible to have HELP write on a file all of
the questions which it receives. In this way, we can
determine how much use a particular HELP system is
getting, and how well it is doing. Our experience at
Project Genie is that our efforts in designing a simple
question answering facility have been successful. HELP
receives a great deal of use, generates useful answers,
and, using QAS, can easily be updated to allow for
modifications and oversights. Figure 7 gives an example of
a short session with HELP.

@HELP.
TERMINATE QUESTIONS WITH A '?'. THIS VERSION OF HELP WILL ANSWER
QUESTIONS ABOUT MATERIAL IN R-21. (11/'Zl/68) TO LEARN MORE ABOUT
THE USE OF HELP, TYPE "HELP?".
?HOW CAN I READ THE LENGTH OF A FILE?
USE BRS 143 WITH ATTRIBUTE NUMBER 2 TO READ THE LENGTH OF A FILE.
?TELL ME ABOUT BRS 143?
BRS 143: READ STATUS. THE CALLING SEQUENCE IS:
A: TABLE ADDRESS OR DATA (DEPENDS ON B0). IF A TABLE ADDRESS, A IS
   INCREMENTED TO POINT TO ONE PAST THE LAST WORD TRANSFERRED.
X: THINGY NUMBER
B: DECODES AS FOLLOWS
   BIT 0: 0 IF A HAS DATA, 1 IF A POINTS TO A TABLE
   BITS 1-11: 'TYPE' OF THINGY
   BITS 12-23: 'ATTRIBUTE'
'TYPE' IS 1 FOR A DRUM FILE, 2 FOR A JOB. FOR DETAILS SEE PAGE 9-2
IN R-21.
?WHAT IS A THINGY?
A "THINGY", AS USED IN BRS 143 AND 144, IS EITHER A FILE OR A JOB.
?HOW CAN I OUTPUT A STRING?
USE BRS 35 TO OUTPUT A STRING.
?MORE ON BRS 35?
BRS 35: OUTPUT STRING.
X=FILE NUMBER, AB=STRING POINTER.
?BRS 2 3?
BRS 2: CLOSE SPECIFIED FILE
A=FILE NUMBER. EXEC ONLY
NON-EXEC IS BRS 20.
BRS 3 DOES NOT EXIST
Figure 7-A session with HELP

CONCLUDING COMMENTS
The algorithm which we use to find KEY WORD LISTS from the
set of KEY WORDS in a question has one drawback. If the
number of KEY WORDS in the question is large, while the
length of the longest KEY WORD LIST in the set is small,
the searching time is very long. This is due to the fact
that, starting from the original set, we check every
subset. We have never experienced a problem of this sort,
since most of the questions presented to our HELP systems
have three or fewer KEY WORDS. However, it is a
possibility. To solve it, a test would have to be made
before starting to check the subsets of a given order, to
ensure that the number of these subsets is less than some
pre-determined maximum. If not, those subsets would not be
tested. The calculations involved are trivial, and could
be incorporated if necessary.
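The safeguard proposed in the concluding comments (checking, before the subsets of a given order are searched, that there are fewer of them than some pre-determined maximum) amounts to a binomial coefficient test. The following Python sketch shows the calculation; the function name `orders_to_search` and the sample limit of 50 are assumptions, and the paper does not say that this test was ever incorporated.

```python
from math import comb

def orders_to_search(n_key_words, max_subsets):
    """Yield subset orders, largest first, skipping any order whose
    number of subsets is not less than the pre-determined maximum."""
    for size in range(n_key_words, 0, -1):
        # A question with n KEY WORDS has C(n, size) subsets of this order.
        if comb(n_key_words, size) < max_subsets:
            yield size

if __name__ == "__main__":
    print(list(orders_to_search(3, 50)))    # typical question: nothing skipped
    print(list(orders_to_search(10, 50)))   # long question: middle orders skipped
```

For the usual question of three or fewer KEY WORDS no order is skipped, so the test would cost almost nothing in the common case.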
ACKNOWLEDGMENTS
The author would like to acknowledge the help of both
Bret Huggins and Butler Lampson. Mr. Lampson
designed the initial structures of the ANSWER LISTS
and the dictionary, and Mr. Huggins implemented
them. In addition, both contributed many hours of
their time to discussions during all phases of the
design of HELP and QAS.
REFERENCE
1 C S CARR
HELP-An on line computer information system
Project Genie Document No P-4 January 19 1966