The Subject Access Project: A comparison with PRECIS

Carolynn E. Bett

The Indexer Vol. 11 No. 3 April 1979

Following a summary of the aims and recommendations of Atherton's SAP Report is a critique of her discussion on PRECIS, interwoven with corrective explanations of how PRECIS works. In the SAP Report PRECIS is confused with a bibliographic file, with document analysis, with indexing policy and with the British MARC Verbal Feature Heading. PRECIS is none of the above, but rather an indexing method based on a system of grammar, with computer codes used to create a printed index. As well as clearing up errors in relation to PRECIS, the paper notes the display value of PRECIS and machine applications of PRECIS in response to Atherton's desiderata for a new indexing system.

This paper has two purposes: firstly, to commend to a wider readership the ideas on subject access presented by Pauline Atherton in Books Are For Use; Final Report of the Subject Access Project to the Council on Library Resources; secondly, to comment on some of the statements about PRECIS (PREserved Context Index System) which Atherton freely admits were not based on careful testing, as was the rest of her report.1

The aims of the research undertaken in the Subject Access Project (SAP) were to find out about:

(1) availability of suitable information in books to produce augmented subject descriptions;
(2) cost of inputting these subject descriptions in machine-readable form;
(3) cost of computer storage of a BOOKS data base (MARC-like records augmented with subject descriptions);
(4) cost of online searching of a BOOKS database;
(5) benefits derived from online searching of BOOKS.2

The research was considered a necessary preparation for adapting the conventional catalogue to the innovations of MARC (MAchine Readable Cataloguing), International Standard Bibliographic Description (ISBD) and online search services.
While many machine-readable databases are available for 'free-text searching' of journal literature, few are available for monograph literature. Moreover, the traditional subject descriptions for books do not capture the specific information contained in each book. Atherton prepared thorough statistics on the number of books with useful tables of contents and/or indexes, giving interesting breakdowns by subject. Rules were developed for selecting subject descriptions from the table of contents and the back of the book index to augment the MARC record. The thoroughness of the testing and documentation of findings is, indeed, impressive.

Many recommendations fire the imagination for further research into fuller exploitation of our mechanized bibliographic systems. It is hoped that the recommendation for further work on an objective method of in-depth, efficient indexing3 will be carried out. Scanning the table of contents and the back of the book index has traditionally provided quick and specific entry into monograph literature for reference librarians. Why, then, does it come as a revolutionary suggestion to adapt this approach to online searching? The recommendation that publishers improve the quality of tables of contents and indexes4 should be taken as seriously as the cataloguing in publication project has been, to bring more quickly and directly 'every book its reader'.5

However, what do we do until publishers assume this responsibility? Rather than following through the recommendation to put online 'a version of Roget's Thesaurus',6 let us consider an online thesaurus in which the synonyms are hidden and the hierarchical structure revealed, as an aid to searching. This could be achieved by having the user-selected term matched to a number address which holds all the synonyms. In a free-text system, all synonyms could be posted to search the text without the user even being aware that this was being done.
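The number-address idea can be illustrated with a minimal sketch in Python. It is not drawn from Atherton's report or from any existing system: the term list, the address 101, the sample documents and the function names `expand` and `search` are all hypothetical. Only the hidden-synonym half of the proposal is shown, not the display of hierarchy.

```python
# Hypothetical thesaurus: every term points to a numeric concept address,
# and every address holds the complete set of synonyms for that concept.
TERM_TO_ADDRESS = {
    "films": 101,
    "motion pictures": 101,
    "movies": 101,
}

ADDRESS_TO_TERMS = {}
for term, address in TERM_TO_ADDRESS.items():
    ADDRESS_TO_TERMS.setdefault(address, set()).add(term)

# Hypothetical free-text records, keyed by document number.
DOCUMENTS = {
    1: "A history of motion pictures in Canada",
    2: "Architectural features of cathedrals",
    3: "Movies and society",
}


def expand(term):
    """Return every synonym stored at the term's address (or the term alone)."""
    key = term.lower()
    address = TERM_TO_ADDRESS.get(key)
    if address is None:
        return {key}  # unknown term: search it as typed
    return ADDRESS_TO_TERMS[address]


def search(term, documents):
    """Post all synonyms against the text; the user only ever types one term."""
    wanted = expand(term)
    return [
        number
        for number, text in documents.items()
        if any(synonym in text.lower() for synonym in wanted)
    ]


print(search("films", DOCUMENTS))  # [1, 3]
```

The searcher asks for 'films' and retrieves documents that say 'motion pictures' or 'movies', without ever seeing the synonym list. The substring match is deliberately crude; a real free-text system would match on word boundaries.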
Another approach would be to control the indexing terms according to the thesaurus but still leave the searcher free to use his own language, with the computer translating and searching behind the scenes.7

PRECIS provides a system for controlling indexing terms while leaving the vocabulary open-ended to accommodate new concepts. On page 4, Atherton refers to PRECIS as a 'data base of the BNB (British National Bibliography)'. This is technically incorrect and leads to a great deal of confusion, evident in later discussions of PRECIS in the SAP. PRECIS was developed by the BNB, but PRECIS is not a database. Rather, it is an indexing method based on a system of grammar, approximating natural language grammar, which throws new light on to the structure of natural language. However, PRECIS is much more rigid than most natural languages. Its grammar verbalizes concepts so that their context can be expressed: e.g.

    TEACHERS. Secondary Schools
        Attitudes to students

PRECIS may be applied to any type or form of materials: books, research, A-V materials, artifacts, etc.

There are two PRECIS files: the subject file, which shows all the subject strings (i.e., sentences) together with the PRECIS reference indicator numbers; and the reference file, which shows thesaural relationships between terms. The bibliographic file at BNB is a MARC file, although not all bibliographical files indexed by PRECIS are in the MARC format.

In the British MARC format there are three fields pertaining directly to PRECIS: 690, 691 and 692, with a fourth field 083 that is derived from the PRECIS string. Field 690 contains the complete PRECIS string(s) with codes; field 691 contains the Subject Indicator Number(s) (SINs); and field 692 contains the Reference Indicator Number(s) (RINs). Field 083 contains the Verbal Feature Heading, which is derived from the PRECIS string and used only in the classified catalogue of the BNB.
Field 690 is held in CAN MARC 680 and is used for international sharing.8

Not all of us in Canada who are indexing with the PRECIS method are using the MARC format for our bibliographic data; however, we must all create a RIN file and SIN file (number files) which are matched to the PRECIS files to print subject indexes complete with cross references. Usually the indexes created by PRECIS programs are two-stage, with the index pointing to a classified catalogue, as in BNB, or pointing to a listing by document number, as in the Ontario Educational Research Information System (ONTERIS); however, a one-stage computer-produced book and microfiche index using short bibliographic citations has been produced for Aurora High School, Ontario.9

A strict grammar can be applied only when the indexer has acquired a notion of what the book is about. This process may be likened to rough notes taken in a lecture, which are later developed into a publishable paper. Experienced PRECIS indexers often go from the rough notes stage to the finished grammatical stage in the twinkling of an eye and write down only the finished strings (sentences). The point is that Atherton deals with methods of arriving at the rough notes, otherwise known as document analysis, whereas PRECIS deals with expressing the analysis in such a manner that the computer can print meaningful subject indexes. Those on the PRECIS team often mention that document analysis is the most fundamental problem, but as yet international guidelines for objective establishment of the subject of a document, although in progress, have not been published. It is to be hoped that the results of Atherton's Subject Access Project will profoundly influence the development of such guidelines.

In the section of the SAP report on comparison of the BOOKS system with PRECIS, PRECIS does not fare very well.
Let me stress that Atherton has acknowledged the cursory treatment given PRECIS and that those of us involved with PRECIS appreciate her recognition that PRECIS is to be acknowledged, studied and come to terms with, although the Library of Congress system is of consuming interest to North Americans.

Atherton recognizes that in BOOKS, the database built for SAP, the subject headings derived from the table of contents and back of the book indexes are like content notes.10 She has an indexing policy of providing up to 30 entries per book. The BNB indexing policy approximates that of the Library of Congress, namely, indexing is designed to summarize the subject of the book. The BNB, however, will index to a maximum of four strings, giving, in the permutations of the strings, many more access points than L.C.

The indexing policy for the Aurora PRECIS Project more nearly approximates Atherton's in its analytical approach to information contained between two covers. In the Aurora Project, the strings tend to be concise (i.e., have only a few terms compared with ONTERIS strings); therefore, it may take several strings to express the multi-concepts contained in each document. The indexing policy is guided by the needs of students and teachers as related to the curriculum. At ONTERIS, our policy and materials lead to the writing of few, but rather long, complex strings giving, like Aurora, many access points for each document.

In examples 6, 7, 8 & 9, Atherton reprints both the PRECIS entry and the string, thereby making PRECIS appear immensely complex with its crush of $s, Zs and 000s. The codes are, in fact, the indexer's instructions to the computer. The user is never exposed to these private conversations in the printed index, nor are they of any interest to one wanting to grasp how PRECIS works. 'Shunting' of the strings to show the various access points would have been more informative.
A PRECIS entry is designed to show context dependency in two directions and is therefore designed in a two-line format. The Lead, or entry term into the index, is in the top left position. Moving towards the right on the top line one reads increasingly broader terms called the Qualifier. The narrower context on the line below is the Display. A simple example of this format is:

    CATHEDRALS. Canada
        Architectural features

When the broadest term is in the lead, there is no qualifier. Thus, the shunted entries from figure V.6 on page 74 are as follows:

    GREAT BRITAIN
        Urban regions. Social planning —Conference proceedings

    URBAN REGIONS. Great Britain
        Social planning —Conference proceedings

    SOCIAL PLANNING. Urban regions. Great Britain
        —Conference proceedings

Figure V.2 would have given an interesting example of shunting. However, the entries as they appear in Atherton's report are not written in the correct PRECIS format; rather, they have been extracted from the BNB classified catalogue verbal feature heading (field 083 of the British MARC format), which is explained beginning p. 399 of the PRECIS manual by Derek Austin. Field 083 is never shunted; however, the shunted entries for the subject were found in the 1971 PRECIS index. PRECIS I was superseded by PRECIS II in 1974, with drastic changes being introduced at that time. While it might not have been possible for Atherton to have checked the latest edition of the SIN file to extract examples for her publication released in February 1978, examples should have been taken from PRECIS II indexes, as was Figure V.1. Similar errors occur in Figures 3, 5 and 6. It is unfortunate that Atherton accepted her information from a misinformed colleague, without verifying it herself.

PRECIS was developed initially in order to enlist the aid of the computer in printing subject indexes.
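The shunting of the Great Britain string shown earlier can be sketched in a few lines of Python. This is only an illustration of the simplest case, in which every term takes a turn as Lead. It ignores the role operators and manipulation codes of real PRECIS strings, and it is not the BNB program; the function name `shunt` and its parameters are hypothetical.

```python
EM_DASH = "\u2014"  # the dash that introduces a form term in the printed entry


def shunt(terms, form=None):
    """Build one two-line entry per term from a broad-to-narrow string.

    Each entry is a (top line, bottom line) pair:
      top    = LEAD followed by the Qualifier (broader terms, nearest first)
      bottom = the Display (narrower terms, in string order), plus the form
    """
    entries = []
    for position, lead in enumerate(terms):
        qualifier = terms[:position][::-1]  # broader context, read outwards
        display = terms[position + 1:]      # narrower context, read inwards
        top = ". ".join([lead.upper()] + qualifier)
        bottom = ". ".join(display)
        if form:
            bottom = f"{bottom} {EM_DASH}{form}".strip()
        entries.append((top, bottom))
    return entries


string = ["Great Britain", "Urban regions", "Social planning"]
for top, bottom in shunt(string, form="Conference proceedings"):
    print(top)
    print("    " + bottom)
```

As each term moves into the Lead, the terms before it in the string reverse into the Qualifier and the terms after it remain in the Display, so the context is preserved in both directions at every access point.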
In a sense, it is unfair to compare PRECIS, a contextual system, with a system that was designed specifically for computer searching using Boolean logic with single-concept terms. However, computer searching is the mode we are now working with and it is useful to see how the various systems can be adapted for online use. Ann Schabas is comparing PRECIS strings with Library of Congress Subject Headings (LCSH), and published results thus far indicate that PRECIS gives better performance than LCSH.11 The online mode gives the searcher flexibility to alter the search strategy as the search progresses; therefore, it seems likely that PRECIS would give even better performance online than in the batch mode used by Schabas.

At ONTERIS, searchers, using their own terms, with a thesaurus, can ask for a display of the PRECIS entries in which their terms appear. Upon seeing the terms in the context of the PRECIS entry, the searcher may choose the relevant abstracts for display and printing. However, because of the small size of the ONTERIS file and the detail of the sorting, usually one string points to only one document. This is definitely not true of the BNB file, where one string has been used to index many documents. In the larger file, a selection stage showing the search terms in the context of PRECIS strings should prevent many false drops.

Atherton is concerned with the online display of subject terms.12 In considering the example on page 69, the PRECIS entry is more informative and less dense to scan than the index terms (IT). The display value of PRECIS might well be considered, particularly since BOOKS weights terms in relation to however many pages in a book a term covers.14 PRECIS, however, gives weighting to terms through a precise and meaningful grammatical relationship. The weighting is implicit in the structure and does not have to be noted separately.
In Figure V.2, the verbal feature heading, derived from PRECIS, is closer to natural language and is, therefore, easier to read and absorb. The extra information in the ITs could have been captured in extra PRECIS strings, had the indexing policy so demanded.

In Figure V.9, it is claimed that the same classification number was searched in PRECIS as in BOOKS. However, the BOOKS number is PN1998 A2 P48, being interpreted:

    Motion pictures
    Biography
    Collective (Main entry Phillips)

Whereas the BNB number is PN1998.A3C8, being interpreted:

    Motion pictures
    Biography
    Individual (Main entry Cukor)13

Needless to say, a biography collection turned out rather more name entries than a biography on a single person. Doubtless, this slip was part of the cursory treatment given PRECIS; however, an error like this might cause someone to dismiss PRECIS unjustly.

PRECIS needs to be compared with BOOKS in terms of cost and time of input and retrieval, cost of storage, time and effort needed to train indexers, and relevance of output. However, in the comparison, the indexing policies should be comparable and the fullest power of the computer should be utilized.

NOTES

1 Atherton, Pauline. Books are for use; final report of the Subject Access Project to the Council on Library Resources. Syracuse, N.Y.: Syracuse University School of Information Studies, February 1978. Research Studies No. 4. p. 68.
2 Ibid., 3rd prelim. page.
3 Ibid., p. 87.
4 Ibid., p. 87.
5 Ibid., p. 129.
6 Ibid., p. 86.
7 Austin, Derek. PRECIS: A Manual of Concept Analysis and Subject Indexing. Council of the British National Bibliography, Ltd., 1974, p. 412.
8 Canadian MARC Communication Format: Monographs. 2d ed. Ottawa: National Library of Canada, 1974, Appendix H, p. 14.
9 Aurora High School Library. PRECIS Subject Index Catalogue. Toronto: University of Toronto Library Automation System, 1977.
10 Atherton, p. 83.
11 Schabas, Ann. Machine Searching of UK MARC. In Wellisch, Hans. Proceedings of the International PRECIS Workshop, University of Maryland, 1976, p. 149-153.
12 Atherton, p. 86.
13 Library of Congress Classification Schedule.
14 Atherton, p. 21.

Ms. Bett is developing an online thesaurus as a search tool for the databases at ONTERIS (Ontario Educational Research Information System), a research project funded by the Ontario Ministry of Education. ONTERIS produces a printed PRECIS index to volumes of detailed abstracts and also provides for online displays of PRECIS strings when the databases are queried by Boolean logic according to the conventions of ISIS programs.

The Society offers its congratulations to Derek Austin, to whom the American Library Association has awarded its Margaret Mann Citation for 1978 'in recognition of his significant contribution to the establishment of a new direction in subject analysis through the development of the PREserved Context Indexing System, his persistence in refining and perfecting its techniques, and his skill in defining, interpreting and sharing its concepts'.

A progress report on the PRECIS Translingual Project describes the switching procedure from the source language lexicon through a pivot file, which consists of coded addresses of all equivalents for a given concept, to the target language lexicon. Work is nearing completion on the translation of prepositions, the addition of articles and the provision of inflexions where appropriate.1

Nevertheless, the Library of Congress reports that it has studied the feasibility of adding PRECIS strings to its MARC cataloguing data and has concluded: 'In view of the fact that the addition of PRECIS strings to all current cataloguing would cost approximately $1,000,000 per year and that there has been no demand to do this, the Library of Congress will not seek money from Congress or from any other source to maintain two subject heading/indexing systems.'2

M.P.

1 British Library Bibliographic Services Division Newsletter (10), August 1978, 2-3.
2 Library of Congress information bulletin 37 (9), 3rd March 1978, 154.