Computers and the Humanities 33: 293–318, 1999.
© 1999 Kluwer Academic Publishers. Printed in the Netherlands.

Access to Pictorial Material: A Review of Current Research and Future Prospects

CORINNE JÖRGENSEN
School of Information and Library Studies, 534 Baldy Hall, University at Buffalo, Buffalo, NY 14260-1020, USA (E-mail: [email protected])

Abstract. Rapid expansion in the digitization of images and image collections has vastly increased the number of images available to scholars and researchers through electronic means. This research review will familiarize the reader with current research applicable to the development of image retrieval systems and provide additional material for exploring the topic further, both in print and online. The discussion will cover several broad areas, among them classification and indexing systems used for describing image collections and research initiatives into image access focusing on image attributes, users, queries, tasks, and cognitive aspects of searching. Prospects for the future of image access, including an outline of future research initiatives, are discussed. Further research in each of these areas will provide basic data which will inform and enrich image access system design and will hopefully provide a richer, more flexible, and satisfactory environment for searching for and discovering images. Harnessing the true power of the digital image environment will only be possible when image retrieval systems are coherently designed from principles derived from the fullest range of applicable disciplines, rather than from isolated or fragmented perspectives.

Key words: content-based retrieval, image databases, image indexing, image retrieval

1. Introduction

Rapid expansion in the digitization of images and image collections (and their associated indexing records) has vastly increased the number of images available to scholars and researchers through electronic means, and access to collections of images is a research focus of several of the Digital Library research initiatives underway at universities across the nation. Current interest in accessing visual materials is also spurred by the use of multimedia in education and entertainment and an accompanying commercial sector investment in profitable large-scale imagebases aimed towards the publishing and entertainment industries.1 Other large-scale imagebases include newswire collections and collections such as a patent and trademark imagebase containing over 600,000 images. These and other large-scale imagebases require new techniques and applications for retrieval and have spurred new initiatives aimed at improving access to collections of visual materials.

Current research is focusing on several broad areas: image indexing and classification; image users and image uses; and machine methods for image parsing. There are a number of theoretical and practical questions to be addressed within each research area. A lingering theoretical and philosophical question for image indexing and classification is the appropriateness of text-based retrieval for materials in cognitive modalities other than text. Classification questions include the range of image attributes that need to be represented and how to balance the representation of domain-specific images meeting specialized needs against the need for a generalized, domain-spanning indexing system for images (paralleling a similar problem in natural language processing). On a practical implementation level, there is the question of how to balance the access needs of users with time and money constraints on the classification process itself.
In the area of users and uses, there is a great need for research leading to a better understanding of the cognitive processes involved in searching for an image, as well as the fundamental question of the meaning of similarity in a visual perceptual environment. Although the utilization of computer methods for retrieval of fundamental image attributes such as color and texture is an active and well-funded area for research, it is not yet clear exactly what the appropriate role for these methods is in relation to user needs. An additional problem area is the ability to share image information among different organizations and across different computing platforms.

While the application of computerized methods for retrieval is fundamental across the various research initiatives, a concern is the lack of communication among researchers in these different areas. Discussions of indexing and classification systems and of user needs/uses remain generally in the library and information science (LIS) literatures, while research in machine methods is reported in conferences such as Photonics West and journals in computer science. With the exception of a few researchers and projects, neither community appears to be actively drawing upon other potentially useful literatures (such as cognitive science) to develop and specify systems for accessing images.

This research review will familiarize the reader with current research applicable to the development of image retrieval systems. The discussion will cover several broad areas, among them classification and indexing systems used for describing image collections; research initiatives into image access; and prospects for the future of image access, including an outline of future research initiatives.
Harnessing the true power of the digital image environment will only be possible when image retrieval systems are coherently designed from principles derived from the fullest range of applicable disciplines, rather than from isolated or fragmented perspectives. The following, it is hoped, will provide a foundation for further exploration and will be a touchstone for further discussion and collaboration among the diverse communities of researchers concerned with image access and retrieval.

2. Classification Systems

Providing access to collections of images has long been a problem for the library community, and through the years, librarians and archivists have dealt with this problem by creating classification systems for images, subject heading lists, thesauri, or other lists of index terms from which assignment of textual indexing terms takes place. These systems were based on expertise gained through years of working with print image collections and generally reflected personal knowledge, the subject requirements of collections, or the needs of specific groups of users. One of the earliest was developed with the goal of universality of subject coverage and places pictorial materials into three broad categories: history, art, and science (Simons and Tansey, 1970). Some of these classification systems focus on a specific area, which may be limited in scope by discipline, format, style or period, or other criteria of interest to a narrow research community. Others are designed to be applicable in more generalized collections of images but are frequently influenced by historical or art historical considerations such as subjects/themes or the iconographical interpretation of images.

A survey of these early guides reveals a variety of approaches, even within a single scheme. These classification systems offer few or no scope notes and are an eclectic mix of subject, iconography, time period, and various other access points.
The schemes range from the seemingly very simple, with only nine major headings (Evans, 1980), to more complex ones in which a heading such as “nature” (a single main heading in Evans) is broken into two main headings, depending on whether nature is “wild” or “domesticated” (Green, 1984). Often the impetus for codifying these schemes in print form was the realization that the knowledge used to organize extensive picture collections resided with one person in a particular setting and could not easily be transmitted to new personnel, should the need arise (Gilbert, 1973).

Hourihane (1989) provides a useful review of a number of classification systems for images and objects, especially for fine arts, and describes the wide variety found in approaches and structure. For instance, systems may employ subject headings or a thesaurus structure (LC Thesaurus for Graphic Materials, Princeton Index of Christian Art, Art and Architecture Thesaurus); may employ several levels of classification; and may be hierarchical or non-hierarchical within levels (A Subject Index for the Visual Arts, Yale Center for British Art). Natural language, numerical notation, codes, or a combination of several of these are used to describe images. Natural language-based systems may use either a controlled vocabulary or free-text indexing with detailed descriptions.

2.1. GENERAL SYSTEMS

In response to the needs of image indexers and users, several larger-scale initiatives have been undertaken recently to develop more generally applicable image indexing systems. The two best known and most widely used of these systems in the United States are the Art and Architecture Thesaurus and the Library of Congress Thesaurus for Graphic Materials.

2.1.1. Art and Architecture Thesaurus

Work on the Art and Architecture Thesaurus (AAT) began in 1979 and the first edition was published in 1990.
The AAT is a controlled vocabulary for describing and retrieving information on fine art, architecture, decorative art, and material culture. The thesaurus was inspired by the problems encountered by an architectural historian and its original intent was to create a “Universal Access System” for slides. From these beginnings, the project evolved into the idea of a broader access tool in the form of a hierarchically-structured thesaurus based on the collaboration of scholars in the field. The development of the AAT also relied on input from cataloguer surveys as well as expert opinion. It is used in collections ranging from a few hundred to over a million items, and organizations utilizing it include slide, drawing, and photograph collections, archives, museums, libraries, indexing services, architecture and design firms, and art and architecture dictionary and encyclopedia projects (AAT 21–22).

Work on the AAT also spanned the move from print to electronic catalogs, and computerized versions of the thesaurus have evolved from a flat ASCII data file with no retrieval software to single-user software running on a PC with database and word-processing functions (Art & Architecture Thesaurus: Authority Reference Tool, Version 2.1). A machine-readable version can also be imported into existing or developing applications, allowing its integration beyond a stand-alone tool for data entry and searching. It is also available as an online service through The Research Libraries Information Network (RLIN) and the Canadian Heritage Information Network (CHIN), as well as on the World Wide Web as the Art & Architecture Thesaurus Browser (http://www.gii.getty.edu/aat_browser/), which allows browsing and searching of terms by exact match or keyword match in scope notes or other fields and display of hierarchies.

Several interesting online research and demonstration projects are utilizing the Art & Architecture Thesaurus.
One demonstration project currently under development within the Getty Information Institute’s Technology Research & Development group is Arthur (ART media and text Hub and Retrieval System, available at http://www.aphip.getty.edu/arthur). Arthur uses the AMORE image system, developed by NEC USA, Inc., to index and search 30,000 images and associated text of several hundred selected Websites. Images can be retrieved by image similarity, contextual similarity (utilizing text in the web page near the image), or by keywords in the web pages. Additional access is provided through the use of the Getty Information Institute’s Vocabularies, which include the AAT, The Union List of Artist Names (ULAN), and The Getty Thesaurus of Geographic Names (TGN).

Two other examples of online projects utilizing the AAT are the GRASP project (http://arttic.com/GRASP/public/News03/News03/GRASP-Ontology.html), a system for the description of stolen or recovered art items, and The National Graphic Design Image Database at Cooper Union (http://ngda.cooper.edu/#info), under development at the Herb Lubalin Study Center of Design and Typography in the School of Art. The latter project aims to build a virtual visual encyclopedia through an electronic community of students and educators, and the site is intended to explore issues related to the history and theory of graphic design, especially the expansion of the vocabulary used to describe graphic design.

The AAT in its latest edition has evolved into a complex tool with seven broad facets (Associated Concepts, Physical Attributes, Styles and Periods, Agents, Activities, Materials, and Objects) encompassing thirty-three hierarchies.
As it had earlier been noted that increased time was needed to catalog slide collections using the first edition of the Art and Architecture Thesaurus (Besser and Snow, 1990, p. 226), a companion volume, the Guide to Indexing and Cataloging with the Art and Architecture Thesaurus, was published simultaneously with the second edition of the AAT by Oxford University Press. Soergel (1995) provides a thorough review of the AAT and describes it as a significant achievement, yet concludes that in order to reach its full potential, it needs a painstaking and thorough restructuring to create a more useful polyhierarchical network of concept relationships.

Rather than using a specialized classification system, librarians and picture archivists have sometimes tried to apply an existing general classification system, such as the Dewey Decimal System or the Library of Congress Subject Headings, to an image collection. These attempts have generally not been successful (Enser, 1993; Frost, 1996), leading to the conclusion that such systems provide a very sparse language for image indexing, particularly in image content description. This recognition stimulated further efforts to create generalized systems which will be adequate for images.

2.1.2. LCTGM

As Parker (1987, pp. v–viii) describes in the introduction to the first edition, the LCTGM was created in response to requests from catalogers for terms that could be used for image description. It “provides a substantial body of terms for subject indexing of pictures, particularly large general collections of historical images” and “offers catalogers a controlled vocabulary for describing a broad range of subjects, including activities, objects, and types of people, events, and places depicted in still pictures.” It does not aim to provide coverage of art historical and iconographical concepts but does supply terms for some abstract ideas.
The main source of terminology for the LCTGM is the Library of Congress Subject Headings, but terms are also drawn from the Art and Architecture Thesaurus, the Legislative Indexing Vocabulary, and all terms found in the thesaurus Descriptive Terms for Graphic Materials: Genre and Physical Characteristics Headings (GMGPC), to be used in cases where types and formats of graphic materials are the subjects of images. The second edition has added terminology and has separated out the genre and physical characteristics headings for the convenience of the user.

The LCTGM is strongly reflective of the LCSH, with almost 30% of its terms covering historical concerns such as industrial, political, and social development (Jörgensen, 1996). It contains some terms relating to abstract concepts, emotions, and thematic and “story” elements, as well as a large number of “object” terms. Greenberg (1993), in a comparison of the Art and Architecture Thesaurus with The Library of Congress Thesaurus for Graphic Materials, notes the difference between “terms” in a thesaurus, which represent a single concept and may be coordinated at the time of use, and “subject headings,” which name a subject and are precoordinated. The high degree of precoordination of the LCTGM constrains application of many of its terms, and while it serves its purpose of providing historical access to pictorial materials, especially those concerned with the history of the United States, it appears to lack utility for more general applicability to a wider variety of pictorial materials, in which a broader range of attributes may be important.

A research project utilizing an enhanced organization of LCTGM was conducted at the Institute for the Learning Sciences, Northwestern University (Gordon, 1998).
The Deja Vu Project created a browsing system using new links among LCTGM terms called “Expectation Packages,” which group index terms together in schema-like arrangements under five main categories of Events, Places, People, Things, and Miscellaneous, with the event being the locus of the organization. For instance, the components of an Expectation Package for “Going to a pawnshop to hock some valuable items” contain:

Events: Secondhand sales, Usury
Places: Pawnshops
People: Poor persons
Things: Clocks & watches, High-fidelity sound systems, Jewelry, Musical instruments
Misc: Debt, Poverty
(Gordon, 1998, p. 403)

The system was tested at two locations using the LCTGM to access visual materials: the Library of Congress Prints and Photographs Division, and the North Dakota Institute for Regional Studies in Fargo, ND. The research found three benefits to the system: improved access for subject content searches, bringing the thesaurus to the front of the search process, and making digital materials viewable from the search tool. Disadvantages included the limitations of a single thesaurus for access, the need for traditional query-based search tools, and the subjectivity of the Expectation Packages. Other controlled vocabularies would require Expectation Packages with different slots, and further research is needed to determine if in fact different Expectation Packages would be required for very specialized domains, such as medicine. Nevertheless, the system demonstrates that there may be utility in the application of cognitively-derived organizational structures to existing vocabularies.

2.2. “ONTOLOGIES” OR A CLASSIFICATION BY ANY OTHER NAME ...

Research in “Content-Based Retrieval” (discussed more fully below) concentrates on the application of machine methods to lower levels of attributes (e.g., color, texture) for indexing and retrieval of images.
Of interest to the discussion on classification systems is the fact that this community of researchers is recognizing that some form of textual classification is a useful adjunct to these online systems. These researchers frequently refer to a classification system as an “ontology” or “world model.” A formal ontology is populated with concepts, among which are established relations, typically of the types “is,” “has,” and “belongs to.” Domeshek et al. (1996) note that much work has been done in the area of ontology creation, but ontologies generally remain “difficult to build, to agree upon, and to specify in directly reusable fashion” and suggest semi-formal specifications of ontologies composed of both free text and formal representation.

While a number of content-based image retrieval systems describe the use of an “ontology,” those in the LIS community will recognize these ontologies as simple hierarchical classification systems. As an example, the creators of the WebSEEk image retrieval system found that their users primarily initiate subject-based queries and prefer to navigate within a clearly defined hierarchical semantic space. They created an ontology composed of more than 2000 classes using a multilevel hierarchy. Their system gathers candidate terms for the ontology by searching the meta-tags and URLs associated with online images; these terms are then verified with human assistance. However, a closer look at the ontology reveals problems with mixed hierarchical and semantic levels. For instance, the top level contains terms referring to subjects and topics (insects, music), format (photography, photographs), and genre forms (humor). Following links to lower levels (nature) finds the same problems (mixed subjects and formats such as rivers and videos). Systems with mixed organizational terms such as these often prove confusing and frustrating to the user.
Thus, the application of classificatory techniques in these systems is rather a “good news, bad news” scenario. While it is heartening that classification is being recognized as a value-added process in automated image retrieval systems, it is disheartening that many creators of these systems are not turning to the community with the most expertise and experience in the creation of classification systems. End-users will not continue to use systems which seem arbitrary and illogical, no matter how initially appealing their automated retrieval capabilities may be.

2.3. CLASSIFICATION BY COMPUTER

As mentioned above, there is a basic question as to whether text-based indexing is the most appropriate retrieval mechanism for materials in different modalities, such as images, video, and sound, and researchers from widely different disciplines have called for investigation into a more direct retrieval method than using text as image surrogates (Rorvig et al., 1988; Stam, 1989; Besser, 1990; Seloff, 1990; Svenonius, 1984). Computer scientists are actively developing “content-based” techniques for the direct retrieval of visual image content such as shape, texture, color, and spatial relationships. These techniques allow a user to bypass the use of text in searching by selecting an area of an image and requesting the system to “get more like this.” In addition to indexing individual images, the techniques are useful for partitioning very large sets of images.

2.3.1. Content-Based Retrieval

A technique that is being widely investigated is the use of color histograms to represent images. A color histogram is a representation of the color distribution in an image, and recent research has demonstrated the utility of these in characterizing images. However, color histograms do not necessarily produce unique object or scene identifiers, as a single color histogram can map back to many different objects or scenes.
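The non-uniqueness of color histograms can be illustrated with a small sketch. It is not drawn from any of the systems reviewed here: the quantization into four levels per channel, the function names, and the use of histogram intersection as the similarity measure are all assumptions made for illustration. Two "images" containing exactly the same pixels in a different spatial arrangement receive identical histograms, while an image with a different color distribution scores much lower.

```python
import random
from collections import Counter


def color_histogram(pixels, levels=4):
    """Normalized histogram over quantized (r, g, b) bins.

    pixels: list of (r, g, b) tuples with channel values 0-255.
    levels: number of quantization levels per channel (an assumed setting).
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(p, h2.get(bin_, 0.0)) for bin_, p in h1.items())


random.seed(0)
# A hypothetical 32 x 32 image, flattened to a list of pixels.
scene = [tuple(random.randrange(256) for _ in range(3)) for _ in range(1024)]

# Same pixels, scrambled layout: a visually different "scene".
shuffled = scene[:]
random.shuffle(shuffled)

# A darker image with a genuinely different color distribution.
dark = [(r // 2, g // 2, b // 2) for r, g, b in scene]

same_colors = histogram_intersection(color_histogram(scene), color_histogram(shuffled))
different_colors = histogram_intersection(color_histogram(scene), color_histogram(dark))
```

Because the histogram discards all positional information, `same_colors` is 1.0 even though the two layouts differ; this is the sense in which one histogram can map back to many scenes.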
A second level of analysis of co-occurrence of colors may be useful as well.

Texture is another basic visual feature used in content-based retrieval. Haralick (1979) characterizes texture as an “organized area phenomenon” with two basic components: basic units (primitives) out of which the texture is composed and spatial distribution of the primitives. While there have been efforts to define basic “textons” (Julesz, 1981), a number of elements have been proposed, and texture remains difficult to define. Most texture analysis models texture as a two-dimensional, gray-level variation (Ahuja, 1992).

Shape-similarity-based methods retrieve objects which match or are closely similar to a given shape or image. Such methods are familiar in automated manufacturing, where visual inspection of products is done by the computer. Object shape is determined by edge detection, region growing, or a combination of both. Edge detection is concerned with finding locations in which specific types of changes occur; region growing is concerned with finding areas where no changes occur and attempts to answer which pixels are of the same surface. No technique has yet been developed which can automatically and consistently identify high-contrast regions as easily as the human visual system. Therefore, the process of identifying and outlining objects in arbitrary, unconstrained images currently requires human intervention and guidance, with the computer assisting by providing refinement techniques and flexible user control.

An iconic or graphical search is a search in which direct input such as a shape drawn on the computer screen is used as the search key. In order to process graphical or iconic commands, the query image must be captured and then undergo the same types of representation and interpretation processes as the stored images before matching can take place.
Matching must be flexible enough to refer to attributes at different levels of detail and must accommodate a range of uncertainty. This process is exceedingly complex, and, in order to be successful, requires applications of artificial intelligence as well as sophisticated algorithms.

In addition to the need to derive useful, automatically extracted representations of image features, there is the need to determine measures of similarity among these representations once they have been created. Gupta et al. (1997, p. 38) explain this problem using texture as an example. They describe texture as composed of randomness, periodicity, and directionality, which can each be represented by a number. They continue:

Consider an image with 10 different regions, each with a different texture value. These texture values form 10 points in the randomness-periodicity-directionality coordinates. How similar is this image with another that has 10 other textured regions? Many distance functions are defined between point sets . . . How well do these distance functions portray the human sense of difference in appearance? We have seen no thorough investigation of the issue to date.

2.3.2. Current Systems

Two widely written about content-based systems are QBIC (Query by Image Content) and WebSEEk. QBIC, a software product originally developed by the Machine Vision Group at IBM’s Almaden Research Center, uses content-based retrieval of color, texture, shape, position, or combinations of these. Several methods have been used to input a search. Attributes such as color and texture may be chosen from visual palettes, while shapes can be drawn and regions can be specified with a mouse. An X-Windows version of QBIC was tested by the Art and Art History Department Slide Library at the University of California at Davis on a database of approximately 2000 images (Holt et al., 1997).
The user performs queries based on example images; a thumbnail image is displayed, and the system can search for other images with similar color, texture or overall layout. The user can also use graphical tools to specify arbitrary characteristics such as a color histogram: 20% of a specific shade of blue, 30% of a shade of green. The search will return results in the form of thumbnail images arranged in descending order of match to the user’s query. Text attributes such as the artist’s name or media can also be used to restrict the search. A web version of QBIC continues this research (http://libra.ucdavis.edu; http://wwwqbic.almaden.ibm.com/) and adds a new color layout function that allows searches for color in a specific location. Results from this system are reported as being quite variable. For example, searches for shapes in fine art images are problematic. The study’s initial conclusions were that while QBIC cannot yet replace conventional database tools for thematic searches, it can provide additional capabilities as a sorting tool, not as a searching tool.

WebSEEk (http://www.ctr.columbia.edu/webseek) is described as “a semiautomatic image search and cataloging engine whose objective is to provide a visual search gateway for collecting, analyzing, indexing and searching for the Web’s visual information” (Chang et al., 1997, p. 66). As described above, WebSEEk extracts key terms from URLs and HTML tags for indexing and combines these with some content-based retrieval of color features using binary color sets and color histograms. Content-based retrieval in this system is limited because of the vast number of images on the web. The current system had indexed (as of 1997) 650,000 images on the web and 10,000 video sequences. The creators of WebSEEk are also researching the issues involved in querying the web using different search engines.
MetaSEEk (http://www.ctr.columbia.edu/metaseek), a prototype meta-image search engine, allows browsing and searching of random images from several search engines: VisualSEEk/WebSEEk, QBIC, and Virage. Chang et al. (1997) discuss multiple difficulties revealed by the project. The challenges in such an endeavor become quickly apparent to an end-user. A quick search by the author for “fear” resulted in an image of an old-fashioned icebag; clicking upon this image in the “find more like this” mode retrieved a silhouette of a witch on a broomstick!

Two other software demonstrations which may be visited on the web are the Berkeley Digital Library Project (http://elib.cs.berkeley.edu/cypress) and the Excalibur Image Surfer (http://isurf.interpix.com). A website on “Visual Information Management” bringing together many links to research groups, publications, software demonstrations, and image sites is found at: http://rfv.insa-lyon.fr/~jolion/SESAME/T2/T2-7.html.

2.4. SUMMARY

There is currently great interest in research of this type, yet many of the content-based techniques that are evolving are the ones that are currently computationally possible. While shape, texture, and color all appear intuitively to be important in an image, there is as yet little understanding of whether these features on their own should be the focus of attention in retrieval, or whether these features are more important because of what they uniquely contribute to the holistic perception of the image. One researcher notes that text-based search techniques remain the “most direct, accurate, and efficient methods for finding ‘unconstrained’ images and video” and “challenges remain in applying the [above] content-based image search tools to meet real user needs” (italics added) (Chang et al., 1997).
In addition, while many advances have occurred in the areas of automatic shape, color, and texture recognition, practical, reliable and consistent implementation of these techniques for image retrieval remains far in the future. Romer (1995) notes that “The reality for this technology is that completely automatic content-based recognition is on a VERY distant horizon. It is much more likely that the cooperative efforts of text-based and content-based methods will yield the most interesting and useful results for representing image and motion content for a very long time to come.” Thus, the problems of image retrieval cannot be addressed by computational techniques alone; much basic research is still needed in more fundamental questions such as which features are most usefully represented using these techniques and the nature of similarity in a visual environment.

As can be seen from the above discussion, a variety of different classification systems for images have been created from a number of different disciplinary, theoretical, or functional perspectives. Classification systems may be created based upon the contents of a specific collection or for the users of a specific collection. The focus may be upon a single collection or the concern may be to provide links among related collections. Concerns for image users may include information beyond the visual content of the item, such as the circumstances of production, technique or medium, the history of the item, or meaningful links among related items. A number of sources may be used to gather text terms to apply to images, among them subject headings, thesauri, free-text description, or associated texts.

With the large-scale digitization of image collections and widespread availability of images in large databases (especially the World Wide Web), there is a realization that a better understanding of the process of searching for images is needed.
While there have been a large number of studies devoted to textual information seeking and several models of this process have been proposed (among them Ellis, 1987; Bates, 1989), there have been far fewer published studies on how people search for images. Current research efforts focusing on these and other questions related to image indexing and retrieval are discussed below.

3. Research

While there are many questions that need to be addressed, there are several basic areas which need a stronger empirical foundation from which to begin to answer these questions: the attributes needed for effective image indexing, the types of queries put to image collections and imagebases, the users of these collections, the tasks and image uses of the users, and the cognitive processes involved in searching for images. While answers in one area can inform decisions in another area, when trying to specify optimal methods for image retrieval it quickly becomes apparent that research and knowledge are needed in all of these areas.

It should also be noted that research in these areas can be time-consuming and difficult. The nature of these questions requires human-centered research and detailed observation of processes, and qualitative methods are often the most appropriate tool. Therefore, research of this type has been more limited than in other areas of information seeking. The review below will summarize some key studies and advances in understanding in these areas.

3.1. ATTRIBUTES

Questions concerning which image attributes are necessary for image retrieval fall into two groups, those focusing on the range and type of attributes necessary, and those concerned with the level of “granularity” needed in indexing.

3.1.1. Range and types of attributes

The data on unconstrained image description are sparse. Romer (1993) conducted research that contributed to the development of the Kodak Picture Exchange application for commercial photography.
Image search questions (in the words of the originator) were collected from both image owners (photo agencies) and image users (graphic artists, art directors). She found that search and review techniques focused on such attributes as compositional qualities; artistic techniques, genre, medium; subjective aspects; and spatial aspects.

Continuing research by O’Connor (1996) found a wide variety of concepts and levels of specificity in a small sample of image descriptions. He found that when picture descriptions were elicited using a form with three sections (Caption, Descriptive Words and Phrases, and Reactions), subjects would list non-object terms describing a metaphor, story, interior monologue, or emotional reaction, as well as observed objects. This suggests the possibility of functional as well as topical searches.

Jörgensen (1995) conducted exploratory research investigating attributes typically described by subjects in several types of tasks using pictorial images. The goal of the research was to describe as full a range of attributes as possible, and the numerical distributions and conceptual relationships among these attributes. Participants performed describing, sorting, and searching tasks, and content analysis of word and phrase data was used to define a number of image attributes falling into twelve higher level attribute classes. The data suggest that indexing of literal objects is of prime significance, as is indexing of the human form and other human characteristics. The concept of location of specific items within an image occurs frequently and may be useful in image indexing, although not necessarily only by inclusion of positional terms. Color is both typically and consistently described and appears to cue attention to certain attributes or areas as well as providing a holistic visual impact.
This research demonstrated that “Content/Story” and other abstract and affective attributes are also typically described, suggesting that image indexing may benefit by the addition of more subjective aspects of images than have traditionally been addressed by image classification systems. Analysis of a sorting task further supported the importance of such attributes as theme, setting, and other “story” elements such as relationships among people depicted in the image. The combination of data from several tasks suggests that the research data comprises a fairly complete range of pictorial attributes found in spontaneous image descriptions by non-specialist users.

Other research has evaluated attributes in terms of users’ search failures. In research conducted by Hibler et al. (1992), retrieval failures occurred mostly because of indexing omissions. The two most frequently omitted categories responsible for search failure were omission of what appeared to be a relatively minor detail (an item of clothing) and failure to index frequently occurring objects (walls, roof). The high specificity of the indexing language in this experiment increased precision but decreased recall, so that searches conducted on general terms (hats, landscapes, eating) were less successful, omitting many items considered relevant to searchers.

3.1.2. “Granularity” of attributes

An additional factor relating to an attribute is the issue of granularity: upon how many semantic or hierarchical levels should access to image attributes be provided? There are two issues here, one being the issue of what an image is “of” and “about” (an image is “of” a flag but is “about” patriotism), and the other being upon what level of specificity an index term should fall (e.g. fruit/apple/Red Delicious).
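The interaction between indexing specificity and recall noted above can be illustrated with a small sketch. It is not drawn from any of the systems reviewed: the broader-term table, the image identifiers, and the function names are all hypothetical, and the only element taken from the text is the fruit/apple/Red Delicious example. The sketch shows a search on a general term missing images indexed only at the most specific level, and recovering them when each index term is expanded to its broader terms.

```python
# Hypothetical broader-term table; only fruit/apple/Red Delicious comes from the text.
BROADER = {
    "Red Delicious": "apple",
    "Granny Smith": "apple",
    "apple": "fruit",
    "pear": "fruit",
}

# Hypothetical image index: each image carries only highly specific terms.
INDEX = {
    "img1": ["Red Delicious"],
    "img2": ["pear"],
    "img3": ["flag"],
}


def ancestors(term):
    """Return the chain of broader terms above `term`, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def search(query, index, expand=False):
    """Return the set of image ids whose terms match `query`.

    With expand=True, each index term also matches all of its broader terms.
    """
    hits = set()
    for image_id, terms in index.items():
        candidates = set(terms)
        if expand:
            for term in terms:
                candidates.update(ancestors(term))
        if query in candidates:
            hits.add(image_id)
    return hits


print(search("fruit", INDEX))                       # set(): the general term finds nothing
print(sorted(search("fruit", INDEX, expand=True)))  # ['img1', 'img2']
```

Whether such expansion is better done at indexing time or at search time is an open design question; the sketch expands at search time only because that keeps the stored index unchanged.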
The issue of what an image is “of” and “about” draws upon Panofsky’s (1962) discussion of iconography and iconology (item depicted and its symbolic or referential meaning) and has been discussed at length in the literature (most notably by Shatford, 1986, and Drabenstott, 1986). Studies of queries (discussed in 3.2) demonstrate that multiple levels of attributes are needed, ranging from the specific item named to the generic category of an item to the item’s “meaning.”

Research from cognitive psychology using images of individual objects as stimuli suggests that objects and colors are most frequently named on what is termed the “Basic Level” (Berlin and Kay, 1969; Rosch et al., 1976; Smith et al., 1978). This Basic Level is neither the most specific nor the most abstract level but is rather an intermediate level, such as “apple” in the above example. Research demonstrates that basic level concepts are categorized faster, are used almost exclusively in free-naming tasks, are learned sooner than other types of concepts, and are employed similarly across different cultures (Lassaline et al., 1992). Basic Level categories thus fulfill what is termed the “principle of cognitive economy” by being both informative and general. Jörgensen (1995) found that the majority of attributes (including both perceptual and interpretive) named in image describing tasks were named on the Basic Level; this may have implications for term choice in indexing images. She also found that freely generated terms describing both physical and more abstract aspects of images showed less variability than might have been expected, suggesting some constraints may exist on the process of communicating about visually perceived data.

3.2. QUERIES

Several studies focus on queries to image collections. Enser (1993) analyzed almost 3000 recorded requests contained in 1000 request forms submitted to a large European picture archive, the Hulton Deutsch Collection Limited.
Enser described users as representing their requests at a greater level of specificity than users of online catalogs, with the majority of requests (69%) falling into the “unique” category, a specific instance of a general category (“George III” as opposed to “kings”). Requests were refined by date (34%), location, action, event, or technical specification (the specification of image orientation or type of image). As a result of this research, Enser suggests (if the Hulton collection is characteristic of non-domain specific image banks) that a significant proportion of requests could be satisfied by automatic matching operations on picture captions. He notes that non-unique subjects (“purgatory”) need a more detailed indexing system, but concludes that subject indexing, because of its expense, is of low utility and that reliance upon experienced intermediaries will continue to be the norm.

Armitage and Enser (1997) extend this work by analyzing and categorizing additional user requests for still and moving images from seven libraries. This analysis forms the basis for a faceted framework for queries with four main categories (who, what, when, where) and three levels of abstraction for each category (specific, generic, and abstract). The authors note that the incidence of “unmediated” transactions is increasing and that some of the image libraries studied are engaged in the development of additional image delivery mechanisms and are in need of further information which could contribute to good interface design for image retrieval systems.

Keister (1994) reported on reconstructed query logs at the National Library of Medicine. While some of these requests are for specific items, in contrast to Enser’s work, one-third to one-half of users’ queries were “image construct queries,” in which images are constructed with words describing both abstract concepts as well as concrete image elements such as specific objects.
She concludes that, although abstract or emotional concepts may be used in end-user descriptions of images, the aesthetic and emotional needs of the user are highly subjective and not appropriate for the cataloger to consider. Rather, she suggests visual element cataloging supplemented by a visual surrogate.

The following example demonstrates the four basic levels of image queries emerging from these studies:

1. Requests for a specific item (The Picture of Rouen Cathedral painted by Monet).
2. Requests for a specific instance of a general category (Rouen Cathedral).
3. Requests for a general topical or subject category of images (cathedrals).
4. Requests for images communicating a particular abstract concept or affective response (pictures of cathedrals symbolizing the power of religion in the life of an ordinary person of the Middle Ages).

The need for both a generic classification and indexing to the greatest level of specificity is apparent. These queries can also be further specified by the addition of other facets relating to either the content or production of the image, such as time or accessibility. This suggests that incorporating semantic term hierarchies into image retrieval systems may be useful, but research needs to be done to determine whether such hierarchies should be created at the time of indexing, generated at the time of searching from a resident thesaurus, or used in a browsing mode as a tool of query refinement.

3.3. USERS

Once again, only a few studies describe users of image collections in any detail. Enser described the Hulton Deutsch Collection as receiving requests from book, magazine, and newspaper publishers, advertising and design companies, television and audiovisual companies, and “other” (3%).
Armitage and Enser’s later study spanned a number of collections having both a general user base (the National Film and Television Archive) and specialist audiences (“expert” users in such specific fields as natural history, town planning, engineering, art history, and medicine).

The National Library of Medicine collection is used primarily by picture and publication professionals (50%) and health professionals (33%), with the remainder divided among the museum and academic community and the general public. The images are used in books, television documentaries, movies, educational projects, or for reference. Keister describes further how requests vary among the different user groups. The museum or academic community often has precise citations to the images it desires. Health professionals ask for images in keeping with the NLM’s orientation and images can be accessed by appropriate topic, such as a particular disease. Picture professionals (still picture researchers, TV, film, or media personnel), on the other hand, think visually and use art and/or graphics jargon describing specific image features desired (action shot, horizontal, color).

In contrast to these groups, art historians need access to a different set of image attributes (Panofsky, 1962; Drabenstott, 1986; Stam and Giral, 1988). Bakewell (1988) describes one important need of this group: access to “visual traditions” and particular visual facets (e.g., occurrences of blue cloaks in eighteenth-century German painting) by means of a variety of indices. This report makes conclusions that sound familiar to the realm of textual searching needs: information needs of scholars are dynamic; the breadth, amount and quality of information sought as well as the manner in which the literature is browsed varies according to the stage of research; and access to different types of resources as well as different types of searching strategies is appropriate.

3.4. TASKS / USES

Users come to image collections with image requests which relate to specific tasks. For instance, the museum, academic, and art historical communities have tasks which are related to research questions. Image content is often used as “evidence” in the proposing of an hypothesis or the construction of an argument. For instance, a social historian may need images of urban intersections to support a line of research concerned with the effects of urban infrastructure on social relations. In this case, such qualities as composition, technique, lighting, or perspective are irrelevant, as may be the particular location depicted. In contrast, these image qualities are important to other communities such as art historians or perhaps journalists. Other users such as publishers or art directors have different tasks; they may want an image which will have a particular emotional effect or may wish to use an image as part of a package which will communicate a specific message. Thus, the nature of the task will play a large part in determining which specific image attributes may be requested.

While such factors appear intuitive, given the time and money constraints which generally affect the indexing process, there is a need for research describing in more detail the relationships among tasks and attributes. Research describing both the relationships among different types of tasks and specific attributes and delineating the general types of image use tasks is needed. For instance, in her research, Jörgensen (1995) concluded that the strongest factor affecting the image descriptions (besides the actual visual content of the image) was the nature of the specific task being performed. The three describing tasks and the sorting task produced two markedly different distributions of data, and these differing results suggest that different sets of attributes assume importance based on type of task.
Fidel (1997) conducted a small exploratory study with 100 requests from a stock photo agency. Using Jörgensen’s attribute classes, her analysis demonstrated that the distribution of attributes in these requests closely matched the distribution of attributes found in Jörgensen’s sorting task results, further supporting the importance of task in the process of searching for images.

The author of the current paper suggests three major categories of images (which are not mutually exclusive) to provide a framework for further taxonomy building:

• Data images (images in which raw data is captured and perhaps processed for visual clarity).
• Informative images (images to which human intelligence has been applied to organize the visual material for communication of information).
• Expressional images (pictorial, photo-realistic, abstract, subject to multiple interpretations).

These major types reflect the original motivation (research, communication, expression) and process (attribute capture, attribute organization, attribute creation) in the production of an image. This is not to say that these categories of use are mutually exclusive. Indeed, an image can be created for one purpose yet be used effectively for a completely different purpose, and one image can potentially fulfill all three functions. Therefore, each of these categories may require different approaches in terms of image access, and a single image may require multiple treatments. However, some notion of the various motivations and functionalities of an image can serve as an additional indexing device or perhaps as a partitioning device in a very large collection of images.

Korf Vidal (1995, 81–82) presents a similar model derived from her analysis of image clusters produced by a spontaneous sorting task of images all depicting the same subject matter (the Brooklyn Bridge), but created in a wide range of media.
The data revealed several distinct image clusters, which she suggests are organized according to a communicative continuum: images about the person who made the image (expressions/expressive); images about the culture in which the image was made (active messages); and images about the objects or persons depicted in the image (passive evidence).

Fidel (1997) also describes the different uses of images and presents a continuum of use between two poles, the “Data Pole,” at which images (cartographic, medical, chemical structure) are used as sources of information, and the “Objects Pole,” at which images are needed as “objects” for some task, such as a magazine cover, an advertisement, or a picture in a book. She further describes image and search characteristics for each pole. For instance, at the “Data Pole,” users are looking for the smallest set which can provide the information needed, whereas at the “Objects Pole” browsing sets are needed. The author calls for additional research to further define and validate these characteristics.

There is, in fact, evidence that users themselves recognize and describe images in terms which suggest such categorization. For instance, both Jörgensen and Korf Vidal found examples of such user recognition and description of the purpose of an image in their research:

Korf Vidal: artistic, clever use of bridge as icon, commemorative, could be used for greeting cards, documentary, ephemeral uses, “framed art,” take-offs for profit (Korf Vidal, 1995, 74).

Jörgensen: Advertisement, Big Postcard, Cartoon, Comics, Erotic Art, Postcards, War Footage, Children’s Art (Jörgensen, 1995).

3.5. SEARCHES

Roddy (1991, 48), speaking of the issue of anticipating a searcher’s needs, comments: “One of the great failures of image access at present is its inability to provide reliable information on what might be called a typical session.” Studies focusing on visual search techniques and retrieval in multimedia systems have produced results which are difficult to interpret as the system itself often is a confounding factor (e.g. Dunlop, 1991). Nevertheless, research describing searching and the image attributes relevant for the process is needed for determining adequate representational structures for images.

Rorvig et al. (1988) conducted some early work on human image searching with the NASA Visual Thesaurus, which provided a retrieval interface based on the broad hypothesis that images can be somewhat robustly substituted for text and that access to images will reduce ambiguity in both term assignment and searching. Experimental results indicated that most users, while initially enamored with the ability to search using surrogate images as input, reverted to text-based retrieval functions as they became more expert. This suggests that while such a strategy may be useful with newer users, a simple one-to-one visual/verbal replacement strategy in searching may not be adequate for all searching.

Two studies describe searching strategies in pictorial databases. Batley (1988) studied searching behavior of library staff, university students, and school children, and identified four visual information search strategies: Seeking, Focused Exploring, Open Exploring, and Wandering. She proposes that, given a flexible visual retrieval environment, users will adopt familiar search strategies and will engage in a range of search activities from the non-exploratory to the unstructured exploratory, in much the same way as in a traditional library environment.
Hastings (1994) conducted an exploratory investigation of the search behavior of eight art historians (with specialties in Caribbean art) in both a manual and computerized environment. She found that searches became more complex in the computerized environment, and identified three types of search styles: Browse-Searcher, Subject Searcher, and Text-Searcher. Browse-Searchers created their own categories for the images and used more complex images, Subject Searchers imposed a preconceived classification scheme on the images and used textual information to aid in identification of objects and activities, and Text-Searchers worked primarily from textual information.

Romer (1993) discovered five search and review patterns in her research and enumerates several visual thinking processes that were observed with professional photo editors. She found that visual thinking is stimulated by images, and that people often start to look for images by using images in either a random or directed search. A second observation was that images already selected provide the basis to continue a search. People desire to be able to use selected images to submit a request such as “Get me more like the ones I just found.” These patterns contributed to the construction of data records that provide access to two different “points of view,” that of editorial and advertising users.

The search patterns discussed above broadly represent the two ends of a continuum; on one end is a focused, specific type of searching and on the other is a looser, exploratory, browsing type of searching. Fidel relates these search behaviors to the intended use of the image. While many information retrieval system designers have tried to accommodate several types of searches by providing flexible system manipulation, the question remains as to whether this need could also be addressed through new or expanded approaches to indexing or other forms of knowledge representation for images.
3.6. SUMMARY

These studies indicate that, just as in text searching, different types of queries, users, tasks, and searching strategies necessitate flexible retrieval mechanisms which can accommodate a variety of requests. Decisions that must be made in the selection of terms to describe images include whether to limit indexing to visual appearance or to incorporate interpretive and symbolic meanings and whether to index multiple hierarchical levels. Classification has traditionally been concerned with the first three of the four query levels described in 3.2, and numerous reasons are given for not addressing the fourth-level facets, which seem to be situated more within the individual user’s own interpretations. However, those working with providing access to image collections know that the fourth level is a key one and some researchers have argued for addressing both the “of” and “about” of an image. Beyond this broad categorization, image queries can be modified by a number of other more specific considerations, such as date, location, activity, event, image orientation, perspective or type of representation. The following chart (Jörgensen, 1995) summarizes some relationships among types of image queries and associated attributes as revealed in the literature reviewed.
ART HISTORICAL SEARCH (context of production)
    Text-based Information
    creator title size material type nationality time period technique genre

VISUAL CONTENT SEARCH (named object)
    Specific Object or Generic Object
    color size location texture shape orientation

TOPICAL SEARCH (named person, place)
    Specific Instance or Generic Category
    time location event or activity

EVENT SEARCH (named event)
    Specific Event or General Event Type
    time setting activity

AFFECTIVE SEARCH (named response)
    Emotion or Atmosphere
    expression color lighting composition

CONCEPTUAL SEARCH (named concept)
    Abstract Symbolic Thematic Political Social Interpretive State (Stable or Change)
    any attribute

The question of what is needed for image retrieval is inherently multidisciplinary, and effective solutions to system design will draw upon concepts and data from a number of fields, including classification research, information retrieval, computer science, and cognitive science. Those interested in these problems will also find relevant discussion in the fields of aesthetics, cognitive science, linguistics, and philosophy, although these are beyond the limits of the present discussion. Other reviews of the image indexing problem can be found in Rasmussen (1997) and Lancaster (1998).

4. Research Agenda

Current research has focused on needs of specific user groups, searching behaviors, and image attributes. While much more research in each of these three areas is needed, a broader research agenda should also include investigation of the interplay and complementarity among these research efforts. Several researchers (Sledge, 1995) have proposed indexing of attributes to facilitate searching from particular points of view. Practical system implementations could match unique user group needs with pertinent attribute set indexing, and searching behaviors could also be accommodated within a flexible interface design.
Other areas for a broad-based research agenda include integration and selected application of current text-based tools for image indexing, new representational techniques for attribute sets, specification of image taxonomies, and research into the nature of similarity in a visual environment.

4.1. USES OF CURRENT TOOLS

As can be seen from the current review, there already exist a variety of subject heading lists and indexing guides for images, as well as more generalized classification systems for textual materials. Rather than lament the shortcomings of these systems when applied to visual indexing, it may be more productive to launch a major research effort aimed towards a thorough review of these systems with the goal of specifying which parts of these systems are most useful within specific indexing contexts. This would need to be done within a conceptual framework which would be informed by previous and continuing research on image attributes, points of view, and user needs.

There are several tantalizing aspects to this research, even though it remains in what some consider the perhaps “inappropriate” textual world. The appeal of using portions of existing general systems such as LC or Dewey is that one search could bring up both textual and visual materials on the same topic. Another intriguing idea is the building of some form of distributed indexing mechanism for images within such a conceptualized framework. Those who have knowledge and expertise in a system such as the AAT could provide terms which are appropriate to the visual item and to the point of view of their users, while others could contribute subject or topical access terms from LCTGM or Dewey. Experts in graphics could provide access to compositional or production aspects while scholars could provide interpretive analysis from realms as diverse as art history, cultural studies, and postmodernism.
Such distributed indexing is currently taking place for other forms of media, such as movies, using the Internet. One interesting example of this distributed indexing approach for images is currently taking place at the Fine Arts Museums of San Francisco’s website (http://www.thinker.org/), which allows visitors to add indexing terms to images.

4.2. NEW REPRESENTATIONAL TECHNIQUES

Among the problems of image indexing is the hierarchical nature of many of the classification tools, which limits the number and types of relationships which can be expressed in their vocabulary structure. The AAT, in an attempt to address these problems, has evolved from a subject heading list to a hierarchical and faceted thesaurus. Other representational techniques which can represent complex image attribute relationships are needed, and schemas and semantic networks are two which are currently being investigated.

Romer (1995) suggests the restructuring of hierarchical resources into semantic networks, which are structures that represent knowledge in an interconnected manner. Within a semantic network, it is possible to assign several relationships between terms with differing weights to provide a clear notion of the semantic strength between terms. Many of the limitations surrounding hierarchical and faceted thesauri may be eliminated by the use of a network structure. Semantic networks could provide the missing link between visual elements and their associated meanings. An example of this type of relationship is the use of “masks” to identify visual elements of race, gender, and class (Okon, 1995).
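As a rough illustration of the weighted relationships described above, the following sketch stores terms as nodes and typed, weighted links between them. It is only a minimal sketch and not Romer’s design: the class name, the relation labels, and the numeric weights are all assumptions made for the example, and the flag/patriotism pair is borrowed from the “of”/“about” discussion in 3.1.2.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy semantic network: terms linked by typed, weighted, symmetric relations."""

    def __init__(self):
        # term -> {other_term: (relation_label, weight)}
        self.edges = defaultdict(dict)

    def relate(self, a, b, relation, weight):
        """Link two terms in both directions with a relation label and a weight."""
        self.edges[a][b] = (relation, weight)
        self.edges[b][a] = (relation, weight)

    def related(self, term, min_weight=0.0):
        """Return (term, relation, weight) tuples for neighbours, strongest first."""
        neighbours = self.edges.get(term, {})
        found = [
            (other, relation, weight)
            for other, (relation, weight) in neighbours.items()
            if weight >= min_weight
        ]
        return sorted(found, key=lambda item: item[2], reverse=True)


# Hypothetical terms, relation labels, and weights, chosen only for illustration.
net = SemanticNetwork()
net.relate("flag", "patriotism", "symbolizes", 0.8)
net.relate("flag", "banner", "similar to", 0.6)
net.relate("flag", "pole", "attached to", 0.3)

for other, relation, weight in net.related("flag", min_weight=0.5):
    print(f"flag --{relation} ({weight})--> {other}")
```

Unlike a strict hierarchy, nothing here forces a term to have a single parent, so a visual element such as a flag can point both to similar objects and to an associated meaning; how weights would actually be assigned is left open.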
For generalized retrieval, a fruitful approach may be to implement indexing attributes through the use of a schema-based template (such as the Expectation Packages discussed earlier), drawing upon a variety of controlled vocabularies and providing some predefined “Points of View” which target specific attribute sets of interest to particular groups of users.

4.3. IMAGE TYPOLOGIES

Just as textual materials can represent not only data but a wide range of human thought and creativity, so too can images represent a variety of perceptual content, from raw data, organized information, or creative illustration to “art.” And just as text may represent more than one type of communication at a time (for instance, poetry can still communicate information), so too can images be both creative and informative. At present, in the wake of the uniformity and lack of context which keyword searching has imposed on vastly dissimilar types of texts, we are only just beginning to understand that different texts and domains may have different search requirements (Albrechtsen, 1993), and that we still have a very imperfect model for textual searching. Thus, in addition to a sound knowledge of the full range of image attributes and an understanding of user search behaviors in a visual environment, the development of image typologies and definition of their content and function could provide heuristics for indexing particular types of images and guide image indexing in a more generalized retrieval context as well.

Some very interesting research has been performed in this area already. Lohse et al. (1994) have developed a structural classification of visual representations (in contrast to a functional taxonomy, which focuses on the intended use and purpose of the graphic material).
Eleven categories of visual representations emerged from the research: graphs, tables, graphical tables, time charts, networks, structure diagrams, process diagrams, maps, cartograms, icons, and pictures. These categories were evaluated on ten scales representing such attributes of the images as spatial-nonspatial, attractive-unattractive, numeric-nonnumeric, and concrete-abstract. In this and other research, there is evidence suggesting that photo-realistic pictures communicate the least amount of “information.” This suggests that this type of image is not usefully grouped with the rest of the image types, and supports the notion of “informative images” discussed above. Other work towards developing image typologies was discussed during a panel presentation at the latest conference of the American Society of Information Science.2 There needs to be more research aimed at developing image typologies concerned with structure and function, content and meaning.

4.4. DEFINING VISUAL SIMILARITY

An understanding of image similarity features is needed before the “get more images like this one” search can supply the searcher with accurate results, yet similarity in a visual environment still remains to be defined. Early theories of similarity relied upon weighted feature lists, but researchers (Medin and Wattenmaker, 1987; Lockhead, 1992) have noted that physical features do not predict classification performance. Jörgensen (1995), in a sorting task, provided data that demonstrate that similarity among images cannot be represented solely by perceptual attributes, but must take into account interpretive attributes as well. As discussed above, however, research also suggests that there is some measure of consistency in naming objects and items which appear in images.
These results prompt two lines of thought: (1) that definition of similarity in a visual environment must not be narrowly constrained; and (2) that the process of communicating about visual images within a particular cultural context may provide some constraints on naming image attributes. If this were so, it would enable a certain amount of “bravado” on the part of image catalogers in attaching abstract or affective terms such as “mysterious” or “gloomy” to an image. It would also suggest that assigning free-text terms or some form of distributed indexing methods may prove useful.

Another larger question, which has barely been asked in the community of researchers investigating image retrieval, is the function of the process of searching for an image. If the goal of the process is to stimulate and enhance creativity rather than to conduct a precise and efficient search, then methods for browsing or perhaps for revealing unexpected connections or human-centered perspectives would in fact be more appropriate. Research scientists tend to forget that, in the humanities, the process is often at least as important as the product, and the process itself produces new knowledge and understanding.

5. Conclusion

Further research in each of the areas discussed above will provide basic data which will inform and enrich image access system design and will hopefully provide a richer, more flexible, and satisfactory environment for searching for and discovering images. However, such research projects require the knowledge and theories of a variety of disciplines. What is also needed are mechanisms to ensure that communication among researchers in different disciplines takes place, in order to provide the broad perspective needed both to define and to solve the problems associated with accessing visual materials.

Notes

1 For examples, see Corbis Images: http://www.corbisimages.com/; PNI’s PictureQuest: http://www.pniltd.com/picturequest.html.
2 ASIS Panel discussion, “Theory and Practice in the Organization of Image and Other Visuo-Spatial Data: from Retrieval to Metadata,” Wednesday, October 28. Panel participants were Clifford Lynch, Abby Goodrum, Myke Gluck, and Corinne Jörgensen.

References

Ahuja, N. “Texture Analysis.” In Encyclopedia of Artificial Intelligence. Ed. S. C. Shapiro. New York: Wiley, 1992, pp. 1101–1115.
Albrechtsen, H. “Subject Analysis and Indexing: From Automated Indexing to Domain Analysis.” The Indexer 18 (1993), 219–224.
Armitage, L. H. and P. G. Enser. “Analysis of User Need in Image Archives.” Journal of Information Science 23(4) (1997), 287–299.
Bakewell, E. Object, Image, Inquiry: The Art Historian at Work. Santa Monica, Calif., AHIP, 1988.
Bates, M. J. “The Design of Browsing and Berrypicking Techniques for the Online Search Interface.” Online Review 13 (1989), 407–424.
Batley, S. Visual Information Retrieval: Browsing Strategies in Pictorial Databases, Ph.D., University of Aberdeen, 1988.
Berlin, B. and P. Kay. Basic Color Terms. Berkeley, University of California Press, 1969.
Besser, H. “Visual Access to Visual Images – The UC Berkeley Image Database Project.” Library Trends 38 (1990), 787–798.
Besser, H. and M. Snow. “Access to Diverse Collections in University Settings: The Berkeley Dilemma.” In Beyond the Book: Extending MARC for Subject Access. Eds. T. Petersen and P. Molholt. Boston: G. K. Hall, 1990, pp. 203–224.
Chang, S. F. and J. R. Smith. “Visual Information Retrieval from Large Distributed Online Repositories.” Communications of the ACM 40(12) (1997), 63–71.
Domeshek, E. and S. Kedar. Interactive Information Retrieval Systems with Minimalist Representation. AAAI-96: Thirteenth National Conference on Artificial Intelligence, Portland, OR, 1996.
Drabenstott, K. M. Subject Access to Visual Resources Collections: A Model for Computer Construction of Thematic Catalogs. New York: Greenwood Press, 1986.
Dunlop, M. D. Multimedia Information Retrieval, University of Glasgow, 1991.
Ellis, D. The Derivation of a Behavioural Model for Information Retrieval System Design, University of Sheffield, 1987.
Enser, P. G. B. “Query Analysis in a Visual Information Retrieval Context.” Journal of Document and Text Management 1(1) (1993), 25–52.
Evans, H. Picture Librarianship. New York: K. G. Saur, 1980.
Fidel, R. “The Image Retrieval Task: Implications for the Design and Evaluation of Image Databases.” The New Review of Hypermedia and Multimedia 3 (1997), 181–199.
Frost, O. “The University of Michigan School of Information Art Image Browser: Designing and Testing a Model for Image Retrieval.” Knowledge Organization and Change. Ed. R. Green. Frankfurt/Main, Indeks Verlag, 5 (1996), 182–188.
Gilbert, K. D. Picture Indexing for Local History Materials. Monroe NY: Library Research Associates, 1973.
Gordon, A. S. The Design of Knowledge-Rich Browsing Interfaces for Retrieval in Digital Libraries, Ph.D., Northwestern University, 1998.
Green, S. J. The Classification of Pictures and Slides. Denver CO: Little Books, 1984.
Greenberg, J. “Intellectual Control of Visual Archives: A Comparison between the Art and Architecture Thesaurus and Library of Congress Thesaurus for Graphic Materials.” Cataloging & Classification Quarterly 16(1) (1993), 85–117.
Gupta, A. and S. Santini. “In Search of Information in Visual Media.” Communications of the ACM 40(12) (1997), 34–42.
Haralick, R. M. “Statistical and Structural Approaches to Texture.” Proceedings IEEE 67 (1979), 786–804.
Hastings, S. K. An Exploratory Study of Intellectual Access to Digitized Art Images, Ph.D., The Florida State University, 1994.
Hibler, J. N. D. and C. H. Leung. Image Storage and Retrieval Systems. Proceedings - SPIE, the International Society for Optical Engineering, San Jose, CA, 1992.
Holt, B. and K. Weiss. Proceedings of the 60th ASIS Annual Meeting. ASIS ’97, Washington, D.C., Information Today, Inc., 1997.
Hourihane, C. “A Selective Survey of Systems of Subject Classification.” Computers and the History of Art. Eds. W. Vaughan, A. Hamber and J. Miles. London: Mansell Publishing Limited, 1989, pp. 117–129.
Jörgensen, C. Image Attributes: An Investigation, Ph.D., Syracuse University, 1995.
Jörgensen, C. Proceedings of the 6th ASIS SIG/CR Classification Research Workshop. Classification Research Workshop, Chicago IL, American Society for Information Science Special Interest Group/Classification Research, 1995.
Jörgensen, C. “The Applicability of Existing Classification Systems to Image Attributes: A Selected Review.” Knowledge Organization and Change. Ed. R. Green. Frankfurt/Main, Indeks Verlag, 5 (1996), 189–197.
Julesz, B. “Textons, the Elements of Texture Perception and Their Interactions.” Nature 290 (1981), 91–97.
Keister, L. H. “User Types and Queries: Impact on Image Access Systems.” Challenges in Indexing Electronic Text and Images. Eds. R. Fidel, T. B. Hahn, E. M. Rasmussen and P. J. Smith. Medford NJ, Learned Information, Inc., 1994.
Korf Vidal, N. Experimental Image Taxonomy: An Inquiry into Spontaneous Image Organization, Master’s Thesis, Cornell University, 1995.
Lancaster, F. W. “Indexing Multimedia Sources.” Indexing and Abstracting in Theory and Practice. Champaign IL, University of Illinois Graduate School of Library and Information Science, 1998, pp. 206–221.
Lassaline, M. E. and E. J. Wisniewski. “Basic Levels in Artificial and Natural Categories: Are All Basic Levels Created Equal?” Percepts, Concepts, and Categories: The Representation and Processing of Information. Ed. B. Burns. New York: North-Holland, 93 (1992).
Lockhead, G. R. “On Identifying Things.” Percepts, Concepts, and Categories: The Representation and Processing of Information. Ed. B. Burns. New York: North-Holland, 93 (1992), 109–143.
Lohse, G. L. and K. Biolsi. “A Classification of Visual Representations.” Communications of the ACM 37(12) (1994), 36–49.
Medin, D. L. and W. D. Wattenmaker. “Category Cohesiveness, Theories, and Cognitive Archeology.” Concepts and Conceptual Development: Ecological and Intellectual Factors in Categorization. Ed. U. Neisser. New York: Cambridge University Press, 1987, pp. 25–62.
Okon, C. “IBM’s Image Recognition Tech for Databases at Work: QBIC.” Advanced Imaging, 1995, pp. 63–65.
Panofsky, E. Studies in Iconology. New York: Harper & Row, 1962.
Parker, E. B. LC Thesaurus for Graphic Materials: Topical Terms for Subject Access. Washington, D.C.: Library of Congress, 1987.
Rasmussen, E. M. “Indexing Images.” Annual Review of Information Science and Technology (ARIST). Ed. M. E. Williams. Medford NJ: Information Today, Inc., 32 (1997), 169–196.
Roddy, K. “Subject Access to Visual Resources: What the 90s Might Portend.” Library Hi Tech 9(1) (1991), 45–49.
Romer, D. Getty Information Institute Online Conference on Digitizing Technologies, 1995.
Romer, D. M. A Keyword is Worth 1,000 Images. Rochester, NY: Kodak, Inc., 1993.
Rorvig, M. E. and C. H. Turner. The NASA Image Collection Visual Thesaurus. Proceedings. American Society for Information Science 17th Mid-Year Meeting, Ann Arbor, MI, 1988.
Rosch, E. and C. B. Mervis. “Basic Objects in Natural Categories.” Cognitive Psychology 8 (1976), 382–439.
Seloff, G. A. “Automated Access to the NASA-JSC Image Archive.” Library Trends 38(4) (1990).
Shatford, S. “Analyzing the Subject of a Picture: A Theoretical Approach.” Cataloging & Classification Quarterly 6(3) (1986), 39–62.
Simons, W. and L. C. Tansey. A Slide Classification System for the Organization and Automatic Indexing of Interdisciplinary Collections of Slides and Pictures. Santa Cruz CA, University of California, 1970.
Sledge, J. Points of View. Multimedia Computing and Museums. Philadelphia, Archives & Museum Informatics, 1995, pp. 335–346.
Smith, E. E. and G. J. Balzano. “Nominal, Perceptual, and Semantic Codes in Picture Categorization.” Semantic Factors in Cognition. Eds. J. W. Cotton and R. L. Klatzky. Hillsdale, NJ: Lawrence Erlbaum Associates, 1978, pp. 137–168.
Soergel, D. “The Art and Architecture Thesaurus (AAT): A Critical Appraisal.” Visual Resources 10 (1995), 369–400.
Stam, D. C. “The Quest for a Code, or a Brief History of the Computerized Cataloging of Art Object.” Art Documentation 8 (1989), 7–15.
Stam, D. C. and A. Giral. “Linking Art Objects and Art Information.” Library Trends 37(2) (1988), 117–264.
Svenonius, E. Thesauri. Automatic Processing of Art History Data and Documents. Eds. L. Corti and M. Schmitt. Los Angeles: The J. Paul Getty Trust, 1984, pp. 33–48.