DIAGRAMMATIC COMMUNICATION: A TAXONOMIC OVERVIEW¹

N. Hari Narayanan²

Technical Report CSE97-06
December 2, 1997

Visual Information, Intelligence & Interaction Research Group
Department of Computer Science & Engineering
Auburn University
Alabama 36849-5347 USA
http://www.eng.auburn.edu/department/cse/research/vi3rg/vi3rg.html

¹ Appears in B. Kokinov (Ed.), Perspectives on Cognitive Science, Volume 3, New Bulgarian University Press, Sofia, Bulgaria, pp. 91-122.
² Author to whom correspondence should be addressed. Email: [email protected]

Abstract

This chapter discusses research on the role of diagrammatic representations in facilitating communication. Diagrammatic communication is an overarching theme that encompasses not only diagrammatic representation but also diagrammatic reasoning and diagrammatic interaction between humans and computers. Interdisciplinary research on this topic has addressed issues of human diagram comprehension and diagrammatic reasoning from historical and psychological perspectives, as well as issues of diagram parsing, inference from diagrams and human-computer interaction via diagrammatic interfaces from theoretical and computational perspectives. Recently there has been a resurgence of interest in this topic in cognitive science. My aim in this chapter is to provide a window on various strands of recent research. The main contribution of this chapter is a taxonomic description of the multifaceted research on diagrammatic communication, categorized broadly as theoretical, psychological, and computational. This taxonomy is meant to serve as an introduction to those with an interest in the topic and who may be contemplating doing serious work in the area. In the course of describing a sample of current work in the field, open issues and fruitful directions for future research are also identified.

1 Introduction

Evidence for humankind’s use of diagrams to represent and communicate information dates back to the cave drawings of prehistoric times. In fact, visual thinking (Arnheim, 1969; West, 1991) is considered by many to be a central aspect of intelligence. Diagrammatic representations predate the development of spoken and written language (Tversky, 1995). Eventually symbolic languages became the predominant tool for conveying information, but the use of diagrams for information representation is still widely prevalent in all areas of human intellectual endeavor. The role and utility of diagrams, and their advantages and disadvantages compared to textual representations, have fascinated scientists and philosophers from the time of Aristotle and Plato. Recently, researchers working in the interdisciplinary field of cognitive science have started looking at this issue from a variety of perspectives including computational and psychological. My aim in this chapter is to provide a window on these recent inquiries.

This chapter, arising from an introductory tutorial on diagrammatic representations³, is intended more for those with an interest in the topic and who may be contemplating doing serious work in the area, than for someone who is already familiar with the area and who may be looking for a complete treatment of the state-of-the-art. Consequently, the chapter provides introductory descriptions of a sample of various strands of current research on diagrammatic representation, reasoning and interaction. It is not a comprehensive survey, and therefore it is quite possible that some research literature may have been unintentionally omitted. But I hope the reader will come away with an appreciation of the multifaceted research into roles that diagrams play in human intellectual endeavors. The chapter is structured as follows.
I begin by circumscribing the terms diagrams and diagrammatic representations for the purposes of discussion here, since there is no commonly accepted technical definition. Cognitive science research on diagrammatic communication is then classified broadly as theoretical, psychological, or computational. After that, the discussion plunges into developing subcategories of this taxonomy and describing details of a sample of corresponding research efforts. In each case, besides an overview of research, a set of references for further exploration is also provided. The concluding section serves to summarize the main ideas and discuss open issues for future research.

³ A course presented at the Third International Summer School in Cognitive Science, New Bulgarian University, Sofia, Bulgaria, July 21 - August 3, 1996.

2 The What and Why of Diagrams

What are diagrams? In other words, what characteristics distinguish diagrams from other forms of representation? And why are they considered useful? One must address these questions before ruminating on the utility of diagrams as representational tools, inference aids and facilitators of human-computer interaction. A common, if somewhat circular, definition is that diagrams are pictorial representations of information. This is not entirely accurate since the word “pictorial” carries connotations of similarity. Pictures typically depict objects in ways similar to their actual appearance. However, diagrams can represent information in more abstract form than a photograph or a painting. Various kinds of graphs and charts are good examples of this abstract form of diagrammatic representation. I will use the terms “diagram” and “diagrammatic representation” interchangeably in this chapter since all diagrams presumably represent something or the other.

A first attempt at defining diagrammatic representation might be to say that it means “a representation, internal or external, in which topological, geometric or other visual properties are significant”. But this does not exclude English sentences from qualifying for diagrammatic status, since symbols of the English alphabet are distinguishable by virtue of their geometric properties. Similarly, photographs and video qualify as well even though we do not normally think of these as diagrams. I want to include maps, pictograms, different kinds of charts, engineering blueprints, architect’s sketches, etc. as examples of diagrams, while excluding text, photographs, video, etc.

[Figure 1. Diagrammatic Representations? Panels a, b, c and d; panel a shows the word CAT.]

So, what is a diagram? What is not a diagram? We may think that we intuitively understand this distinction. But careful thought will reveal the distinction between what is diagrammatic, and what is not, to be a slippery one. On the face of it, I am sure people will agree that Figure 1a contains a word representing an animal, and that it is not a diagram. Figure 1b depicts a diagram which is an icon representing a hazard warning, typically found on electrical appliances. Figure 1d is another diagram which is an icon representing a service station selling petrol or diesel fuel. To those who know Japanese or Chinese scripts, Figure 1c contains an alphabetic symbol representing the concept of a confined (jailed) person while for those who do not know these languages it may look like a diagram representing something. I propose that for a representation to qualify as diagrammatic, it has to satisfy both of the following properties:

1. It contains one or more elements with semantically significant spatio-visual⁴ properties or relations.
2. At least one element (or a spatio-visual property of an element or a spatio-visual relation between elements) of the representation is not visually similar to the corresponding object (or property of the object or relation between objects) of the represented.

The first requirement rules out text since spatio-visual properties of elements of text (the alphabetic symbols) serve only to differentiate them, but are not semantically significant - i.e., do not carry meaning. The second requirement rules out completely visually similar representations such as photographs and video. Under this definition, Figure 1a is not diagrammatic because it does not satisfy property 1, whereas Figures 1b, 1c and 1d are diagrammatic. One could furthermore argue that Figures 1c and 1d are more diagrammatic than Figure 1b. More about this later. I will formalize this definition in Section 4.1. Now one may define “diagrammatic reasoning” as the process of comprehending and making inferences from diagrammatic representations, and “diagrammatic interaction” as the process of using actions on diagrammatic representations for human-computer communication.

What makes diagrammatic representations fascinating as a subject of study (at least for me) is that they avoid the complexity, due to richness of detail, of representations that are completely visually similar to the represented (e.g., photographs), yet still manage to utilize spatio-visual properties of constituent components in very non-trivial ways to effectively encode and convey information, and aid reasoning. Researchers working in this area are motivated by a common interest in how information can be represented, communicated, and used by humans and computers in ways that take advantage of the spatio-visual properties of diagrams, and how to characterize the syntax and semantics of diagrammatic representations.

Sentential representations capture and convey information by virtue of the semantics of their constituent elements (e.g., words) and how these concatenate to form more complex structures (e.g., sentences). The meaning of such a representation is dependent upon the meanings and context of constituent elements, but not on their spatio-visual properties (such as location, color, font or point size). Diagrammatic representations, on the other hand, can not only be constructed out of meaningful and context-sensitive elements, but the spatio-visual properties and two-dimensional configurations of the constituent elements of a diagram can also be exploited to encode and convey information. Thus, by moving from sentential to diagrammatic representations, one can exploit additional dimensions for representing information. This is the root of representational leverage that diagrammatic representations provide; and the source of fascination that such representations hold. It also opens up an entirely new realm of challenging issues of comprehension and reasoning. How can these kinds of representations be constructed, used for communication, comprehended by both humans and machines, and reasoned with?

⁴ The term spatio-visual is meant to include all visible properties such as color, texture, topological properties, geometric properties, and other spatial properties.

Research in this area addresses several questions, such as:

• How can information be best represented using diagrams?
• How can diagrammatic representations be automatically generated?
• How do we comprehend and make inferences from diagrammatic representations?
• How can diagram understanding and diagrammatic reasoning be automated?
• How can diagrams be used for programming and interacting with computers?

One approach to answering these questions is to begin by investigating how we, humans, create and use such representations.
For example, one phenomenon that is quite commonplace in occurrence, but has proved somewhat elusive to scientific analysis, is mental imagery. It is not difficult for most of us to think of an occasion when we recalled an image in our mind in order to make an inference - imagining the living room while visiting a furniture store to see if the color and patterns of furniture will match the wallpaper is a typical example. What kinds of representations and reasoning processes is the mind using for this? Is the experienced immediacy of imagery a mere figment of one's imagination, or is there an underlying reality to it - a reality rooted in spatio-visual mental representations? There is a rich and colorful history of research by philosophers, psychologists and computer scientists on this. Besides mental imagery, such human-centered investigations may also include studying how scientists have created and used diagrams in the course of their investigations leading to major discoveries and inventions, or experimentally studying how people comprehend, reason with or create diagrams in the course of various problem solving activities.

With the advent and spread of computer graphics and multimedia, there has also been research within artificial intelligence on how computers can be made to understand, reason with and generate various kinds of diagrams, and within the field of human-computer interaction on how diagrammatic representations can help humans communicate more effectively with computers. Yet another approach has been to investigate the theoretical status and foundations of diagrammatic representations using tools of logic, algebra, information theory, etc. There is a rich body of literature on this too. I will discuss these various approaches in the rest of the chapter.

3 Which Diagrams?

One comes across all kinds of diagrams in all walks of life.
There are the simplistic diagrams that children draw, semantically rich sketches that architects draw, metrically accurate blueprints that engineers draw, and information laden graphs and charts that scientists draw, to name just a few. How can one make sense of, and bring some order to, this plethora of diagrammatic representations? This chapter is too short to even attempt to provide a comprehensive answer. Nevertheless, let us briefly consider some of the ways in which order can be imposed on diagram types. One way of classification is in terms of the spatial dimensions diagrams represent: 1D, 2D and 3D diagrams. Another way is to look at whether the representation is static - a fixed diagram, or dynamic - a changing diagram, often called an animation. A third way to categorize diagrams is based on the intellectual disciplines in which they are used: architectural diagrams, mechanical blueprints, weather maps, node-and-link diagrams of computer science, and so on. A fourth approach is to differentiate diagrams based on an abstract-concrete axis. A graph plotting the fluctuations in the Dow Jones Index over the last ten years will then fall on the abstract side, whereas the stick figure of a human drawn by a child will fall on the concrete side; but it is more abstract than the realistic sketch of a person drawn by an accomplished artist.

Various researchers have tried to characterize and categorize diagrams in various ways. In a theoretical approach, Engelhardt and colleagues (1996) propose six basic syntactic operations that use two-dimensional space in different ways to encode information, shown in the table below, which they claim are sufficient to characterize the syntactic structure of most diagrammatic representations. In other words, the internal structure of most diagrams can be deconstructed in terms of these operations.

Syntactic operation     Explanation
random arrangement      use space to separate different entities
pathing                 use space to encode topology, typically using connector symbols such as arrows
unordered slotting      dividing space into separate areas and assigning visual elements to these slots
ordered slotting        same as above, with the ordering of slots being semantically significant
sliding                 using metric properties of space to represent non-spatial information, e.g., graphs
spatial mapping         using spatio-visual properties to represent spatio-visual aspects

A complementary empirical approach to classification, this time based on the cognitive structure of diagrammatic representations, is discussed by Lohse and colleagues (1994). They used ratings of sixty sample diagrammatic representations, provided by sixteen subjects on ten rating scales, to arrive at the following classification:

Type                    Explanation
icons                   diagrams with a single intended interpretation, meant to stand as labels of things
graphs                  encode quantitative information using position and magnitude of geometric objects
tables                  two-dimensional arrangements of words, numbers, signs or their combinations
graphic tables          like tables, but use spatio-visual properties such as shading to convey additional information
network charts          use graphical entities to show relationships among components
time charts             use a spatio-visual property to encode temporal data
structure diagrams      spatio-visual properties of these representations express spatio-visual properties of represented objects
process diagrams        use graphical entities and their properties to express dynamic, continuous or temporal relationships and processes
maps                    representations of physical geography
cartograms              spatial maps that show quantitative data
pictures                realistic depictions of the represented

Is there a single taxonomy of diagrams that settles the questions of what is, and what is not, a diagram, what kinds there are, and allows one to unambiguously categorize any given
diagram? This is an open question. While the syntactic approach appears to be a powerful one for classifying diagrammatic representations, the claim about its wide applicability remains to be proven. The empirical approach begins to answer the question of whether and how we cognitively classify diagrammatic representations, but much more work with larger subject and sample populations needs to be done to derive a comprehensive taxonomy.

The interested reader may consult the following works that provide deeper forays into this issue. Gombrich (1968) discusses several psychological issues. Bertin (1981) describes graphics in terms of variables of the plane (location, texture, color, orientation, etc.) and considers ways in which information can be mapped to these variables. Goodman (1969) offers both a general account of representational symbol systems and a theoretical framework for analyzing diagrammatic representations in terms of syntactic and semantic criteria such as disjointness, differentiation, density and repleteness.

4 Scope of Research on Diagrammatic Representations

Diagrammatic languages abound in human endeavors, and in some disciplines enjoy prominence comparable to that of textual languages. Many research areas have developed their own diagrammatic languages: for instance, mathematicians use commutative diagrams, physicists use Feynman diagrams, and computer scientists describe data structures with boxes and arrows. Once one is familiar with such a language, it becomes an extremely efficient tool for communication. While diagrammatic languages are very useful for people, and to some degree are also used by computers, it is important to note that these do not replace textual languages. Not even comic strips get away completely without text. Indeed, what one finds is a spectrum spanning from pure text, to text illustrated with diagrams, to diagrams annotated with text, and to purely diagrammatic languages.
The broadest use of diagrammatic representations has been for communicating information, both among humans and between humans and computers. A graph of a function makes its characteristics much more explicit than a table of values of the same function even though informationally both are equivalent. Printed textual descriptions are frequently illustrated with pictures that serve to exemplify the ideas contained in the text, to provide different representations of the same information, or to complement what the text describes. In these cases, diagrammatic views of information allow the viewer to discover relations and characteristics that are often hidden in a textual representation. The three books by Tufte (1983; 1990; 1997) provide an excellent treatment of how to effectively use diagrammatic representations to communicate information. Communicating an idea using a diagrammatic representation requires not only representing the idea in a diagram by the communicator, but also comprehending the meaning of the diagram by the receiver. In many cases, when the meaning is not explicit or obvious, this requires reasoning on the part of the recipient. When the communication is between a human and a computer, it typically involves what is called human-computer interaction - i.e., explicit actions taken by the human on the diagrammatic representation displayed on the computer screen to communicate an operation to the computer, and a manipulation of the diagrammatic representation by the computer to communicate results of the operation. Diagrammatic communication thus requires generation, comprehension, reasoning and interaction. Three entities are involved in communication - the communicator, the recipient, and the diagrammatic representation. The communicator and the recipient (these roles switch during a discourse) may be cognitive or computational agents. 
Processes involved on the computational side are diagram parsing, diagram interpretation, program execution, and diagram generation or manipulation to convey results of execution. Processes involved on the cognitive side are diagram perception, comprehension, inference, and diagram generation or manipulation to convey results of inference. Figure 2 shows a model of diagrammatic communication between a computational and a cognitive agent. The utility of a diagrammatic representation in this model rests on two criteria: its computational tractability and cognitive effectiveness. A similar model applies to the case of human-human communication.

[Figure 2. Diagrammatic Communication. A Cognitive Agent (perception/comprehension, inference, creation/manipulation) and a Computational Agent (parsing/interpretation, execution, creation/manipulation) communicate through a Diagrammatic Representation.]

Research in this area can be broadly classified based on the communication model above. At the top level, research can be divided into three categories: (1) theoretical investigations of the nature of diagrammatic representations, (2) investigations of the cognitive processes - perception, comprehension, reasoning, generation and manipulation, and (3) investigations of the computational processes - parsing, interpretation, compilation, execution, generation, and manipulation. Thus, one can begin to taxonomize research on diagrammatic communication⁵ as shown in Figure 3. The following subsections describe each category in more detail.

[Figure 3. Primary Levels of a Taxonomy for Diagrammatic Communication Research. Diagrammatic Communication Research branches into: Nature of diagrammatic representations; Psychological aspects of diagrammatic communication; Computational aspects of diagrammatic communication.]

⁵ As should be clear by now, the term diagrammatic communication includes representation, reasoning and interaction.

Kulpa, in the only published survey of the field (1994), provides an introductory treatment of the field’s origins, rationale and basic ideas, along with an extensive set of references covering a number of related areas. This chapter complements that survey by emphasizing cognitive aspects and including more recent material. Moreover, it aims to provide, for the first time, a research taxonomy for the field. A well founded taxonomy can serve at least two purposes. It can provide a concise picture of the central issues of a field by facilitating the characterization of existing work, and spur new research by revealing areas that are sparsely covered and therefore ripe for research. My aim is to propose a preliminary taxonomy and open it to further discussion and elaboration. In fact, the most important feature of any taxonomy is that it will be extended and revised as the field progresses. Such a taxonomy is also useful for providing a shared vocabulary for discussions and relative comparisons of various research efforts in the area. All of these can contribute to developing a better understanding not only of where we are in terms of current research, but also of where we ought to be going.

4.1 Theoretical Research

Research on the fundamental nature of diagrammatic representations includes efforts to define and differentiate such representations from other representations, characterizing the syntax and semantics of diagrammatic representations - how these encode information using graphical primitives and their spatio-visual properties and the relations that hold between the representations and the represented, analyzing formal properties of specific diagrammatic representational systems, and formalizing human-computer interaction through diagrams. Figure 4 shows the corresponding secondary levels of the research taxonomy.

[Figure 4. Secondary Levels of a Taxonomy for Diagrammatic Communication Research. Nature of diagrammatic representations branches into: Defining and differentiating diagrammatic representations; Characterizing diagrammatic representations; Analyzing diagrammatic representational systems; Formalizing diagrammatic interactions.]

Defining and differentiating diagrammatic representations

The discussion in Sections 2 and 3 has already touched upon this issue. While all of us presumably have an intuitive idea of what a diagram is, precisely defining what is and what is not a diagram turns out to be difficult since there is a wide variety of two-dimensional representations in use that are considered to be diagrams. One may define diagrammatic representations by specifying what is not a diagrammatic representation: all representations on a two-dimensional medium (e.g., paper, cathode ray tube) that are neither true depictions of their referents (like video and photographs are), nor concatenations of abstract symbols belonging to some alphabet whose meanings derive solely from individual symbols or groups of symbols (like text is), may be called diagrammatic.

Other characterizations have been proposed. Russell (1923) captures a central distinction between diagrammatic and sentential representations thus: “There is a complication about language...namely that words which mean relations are not themselves relations...a map...is superior to language, since the fact that one place is to the west of another is represented by the fact that the corresponding place on the map is to the left of the other; that is to say, a relation is represented by a relation.” Sloman (1975) characterizes analogical representations, a kind of diagrammatic representation, as follows: “If R is an analogical representation of T, then there must be parts of R representing parts of T,...
and it must be possible to specify some sort of correspondence, possibly context-dependent, between properties or relations of parts of R and properties and relations of parts of T.” What is worthy of note in this definition is that it is precisely the nature of this correspondence that differentiates different kinds of diagrammatic representations. Another definition is proposed by Stenning and Lemon (1997): “A diagrammatic representation is a planar structure in which representing tokens are objects whose mutual spatial relations are directly interpreted as relations in the target structure.”

The working definition I will use for this chapter is a formalization of the one presented in Section 2. A pictorial representation R consists of a set of graphical primitives P, and a set of spatio-visual relations $V^n$ defined over one or more graphical primitives. Spatio-visual properties of individual primitives, such as position, orientation, size, shape, color, texture, etc., are considered as unary relations. Thus, $R = \{P, V^n\}$. The key aspect of any representation is what is being represented. It is typically a state S of the world consisting of a set of entities E, and a set of relations $R^n$ defined over one or more entities (again, attributes of individual entities are unary relations), i.e., $S = \{E, R^n\}$ and R represents S. R is a diagrammatic representation DR if

1. $\exists p \in P$ such that a spatio-visual relation in which p participates, $v_p \in V^n$, represents an entity or relationship in S,
2. $\exists p \in P$ or $\exists v_p \in V^n$ representing an entity or relationship in S, such that it is not visually analogous to its referent.

The creator of a diagrammatic representation has available different kinds of graphical primitives such as geometric elements, a variety of spatio-visual properties of individual primitives such as shape, color, texture, location, etc., and a number of spatio-visual relations such as adjacency, connectedness, etc., that can all be used to carry meaning.
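
The two conditions of the working definition can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Element` record, its field names and the three sample representations are hypothetical, and whether an element is "visually analogous" to its referent is supplied by hand as a judgment rather than computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    """A graphical primitive (member of P) or spatio-visual relation (member of V^n)."""
    name: str
    is_relation: bool           # True for a member of V^n, False for a member of P
    represents: Optional[str]   # what it stands for in S; None if it carries no meaning
    visually_analogous: bool    # hand-supplied judgment: does it look like its referent?


def is_diagrammatic(elements):
    """Apply conditions 1 and 2 of the working definition to a list of elements."""
    meaningful = [e for e in elements if e.represents is not None]
    # Condition 1: some spatio-visual relation (or property) carries meaning.
    condition_1 = any(e.is_relation for e in meaningful)
    # Condition 2: some meaningful element is not visually analogous to its referent.
    condition_2 = any(not e.visually_analogous for e in meaningful)
    return condition_1 and condition_2


# Text: letter shapes only distinguish symbols; they carry no meaning themselves.
text = [
    Element("word CAT", False, "a cat", False),
    Element("shapes of the letters", True, None, False),
]
# Photograph: everything meaningful looks like what it depicts.
photo = [
    Element("cat-shaped region", False, "a cat", True),
    Element("cat region left of sofa region", True, "cat is left of sofa", True),
]
# Map: abstract dots stand for cities; left-of stands for west-of.
road_map = [
    Element("dot A", False, "city A", False),
    Element("dot A left of dot B", True, "city A is west of city B", True),
]

print(is_diagrammatic(text), is_diagrammatic(photo), is_diagrammatic(road_map))
```

Under these hand-assigned judgments the text fails condition 1, the photograph fails condition 2, and the map satisfies both, which mirrors the informal discussion of Figure 1 in Section 2.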
There are a number of studies that have attempted to enumerate and classify visual representations (Bertin, 1983; Lohse et al, 1994), and prescribe systematic ways of encoding information using P and $V^n$ (Bertin, 1981; Engelhardt et al, 1996). Nevertheless, a comprehensive categorization of diagrammatic representations and enumeration of the various ways in which their spatio-visual properties and relations can be used to encode information is still lacking. This, I believe, is a very fruitful area for future research.

Characterizing Diagrammatic Representations

Diagrammatic representations represent states of the world. In other words, a state of the world, describable in terms of a set of entities and relations, can be mapped to a corresponding diagrammatic representation consisting of graphical primitives and spatio-visual relations. The domain, range and type of this mapping are important. It may be one-to-one, one-to-many, many-to-one or many-to-many. One-to-one $E \rightarrow P$ and $R^n \rightarrow V^n$ mappings are common. For example, algorithm animation systems that show the inner workings of algorithms using animated diagrams (Brown & Sedgewick, 1985) map data items to geometric primitives such as circles, size of data items to areas of the primitives, and relative positions of data items in a data structure to relative spatial locations of the corresponding graphical primitives. Other kinds of mappings may be difficult to comprehend without training unless based on established conventions. Venn diagrams represent a many-to-one mapping since a single circle represents a set of many elements in the world. Network and tree diagrams use an $R^n \rightarrow P$ mapping: the “connected” relation is represented by a graphical primitive, typically a line. Properties of different kinds of $S \rightarrow DR$ mapping are examined by Gurr (1997).
This work provides a very promising foundation for characterizing diagrammatic representations based on the different kinds of correspondences between parts and relations of the representation and the represented that Sloman (1975) alluded to.

When one is diagrammatically representing states of a world, there are two mappings of interest: world-to-representation and representation-to-world. Consider the first mapping in which states $S = \{E, R^n\}$ of the world are mapped to diagrammatic representations $DR = \{P, V^n\}$. This mapping $O$ is homomorphic iff

$\forall R \in R^n,\ \forall \langle e_1, \ldots, e_n \rangle \in E^n: \langle e_1, \ldots, e_n \rangle \in R$ iff $\langle O(e_1), \ldots, O(e_n) \rangle \in O(R)$, where each $O(e_i) \in P$ and $O(R) \in V^n$.

For example, consider the world of integers with the binary less-than relation defined over them. We map this world to diagrammatic representations in which integers are represented by themselves and the binary less-than relation is mapped to arrows. Then $S = \{\{1,2,3\}, \{\{(1,2),(2,3),(1,3)\}\}\}$ will be mapped to $DR = \{\{1,2,3,\rightarrow\}, q\}$ with the constraint that any two integers will be connected by an arrow going from the smaller to the larger, as shown in Figure 5. This is a homomorphic mapping.

The mapping $O$ is one-to-one if

$(\forall e_1, e_2 \in E: (O(e_1) = p_1 \text{ and } O(e_2) = p_1) \Rightarrow e_1 = e_2)$ and $(\forall r_1, r_2 \in R^n: (O(r_1) = v_1 \text{ and } O(r_2) = v_1) \Rightarrow r_1 = r_2)$.

In other words, distinct entities and relations of the world are not mapped to the same graphical primitive or relation; that is, every graphical primitive or relation in the DR represents at most one entity or relation in the world. The mapping of integers above is one-to-one.

The mapping $O$ is onto if

$(\forall p \in P,\ \exists e \in E: O(e) = p)$ and $(\forall v \in V^n,\ \exists r \in R^n: O(r) = v)$.

In other words, if every graphical primitive and visual relation in the DR represents an entity or relation in the world, the mapping is onto. The mapping in the previous example is onto. The mapping $O$ is isomorphic if it is homomorphic, one-to-one, and onto. The previous example therefore illustrates an isomorphic world-to-representation mapping.
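
The definitions of homomorphic, one-to-one, onto and isomorphic mappings can be checked mechanically on the integer example. The following Python sketch is one possible encoding, not the chapter's own formalism: the function names and the tuple-and-dictionary layout are hypothetical, and for simplicity the arrows are modeled as a binary visual relation named "arrow" between numeral glyphs rather than as a separate graphical primitive.

```python
from itertools import product


def is_homomorphic(world, diagram, map_entity, map_relation):
    """A tuple is in a world relation iff its image is in the mapped visual relation."""
    entities, relations = world
    _primitives, visual_relations = diagram
    for name, (arity, tuples) in relations.items():
        _, image_tuples = visual_relations[map_relation[name]]
        for combo in product(entities, repeat=arity):
            image = tuple(map_entity[e] for e in combo)
            if (combo in tuples) != (image in image_tuples):
                return False
    return True


def is_one_to_one(mapping):
    """No two distinct keys share the same image."""
    return len(set(mapping.values())) == len(mapping)


def is_onto(mapping, targets):
    """Every target is the image of some key."""
    return set(targets) <= set(mapping.values())


def is_isomorphic(world, diagram, map_entity, map_relation):
    """Homomorphic, one-to-one and onto, for both entities and relations."""
    primitives, visual_relations = diagram
    return (
        is_homomorphic(world, diagram, map_entity, map_relation)
        and is_one_to_one(map_entity)
        and is_one_to_one(map_relation)
        and is_onto(map_entity, primitives)
        and is_onto(map_relation, visual_relations)
    )


# World: the integers 1, 2, 3 with the binary less-than relation.
world = ({1, 2, 3}, {"less_than": (2, {(1, 2), (2, 3), (1, 3)})})
# Assumed diagram encoding: numeral glyphs plus an "arrow" relation between them.
diagram = (
    {"1", "2", "3"},
    {"arrow": (2, {("1", "2"), ("2", "3"), ("1", "3")})},
)
map_entity = {1: "1", 2: "2", 3: "3"}
map_relation = {"less_than": "arrow"}

print(is_isomorphic(world, diagram, map_entity, map_relation))  # True
```

Removing one arrow from the diagram breaks the homomorphism condition, and mapping two integers to the same glyph breaks the one-to-one condition, so either change makes `is_isomorphic` return `False`.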
These same notions apply to the reverse mapping. That is, a representation-to-world mapping may be homomorphic, one-to-one, onto, or isomorphic. The representation-to-world mapping of the DR in Figure 5a is also isomorphic. Gurr (1997) calls a DR lucid if the corresponding $S \to DR$ mapping is one-to-one, sound if this mapping is onto, laconic if the corresponding $DR \to S$ mapping is one-to-one, and complete if this mapping is onto.

Figure 5. An isomorphic DR

A different approach is proposed by Wang and colleagues (Wang, Lee & Zeevat, 1995). They propose a theoretical construct called a signature morphism, which is a formally derived one-to-one mapping between the signature of a DR and a state of the world. A signature consists of a set of types with a partial order, a set of functions including instances of types and their attributes, and a set of predicates that specifies relations between instances. The signature morphism is an $\{S \to DR:\ \text{one-to-one } E \to P \text{ and } R^n \to V^n\}$ mapping.

Another interesting characterization of DRs can be derived from the theoretical framework provided by Goodman (1969). He defines a character class as an equivalence class of inscriptions. Compound characters are permitted, so it is possible to view one DR as a character belonging to a class. A compliance class is an equivalence class of entities in the world whose members are denoted by members of some character class. It may be said that a character class represents its compliance class. A language then is a set of character classes and their associated compliance classes. The following five properties are required of languages that are notational systems:

1. Any inscription belonging to the language must belong to at most one character class.
2. There must be some finite difference between inscriptions belonging to different character classes.
3. All inscriptions of a character denote the same compliance class.
4. No two different character classes may denote the same compliance class.
5. There must be some finite difference between different compliance classes.

The DR in Figure 5 is a notation. Many diagrammatic representational systems in common use (e.g., many kinds of schematics) can be seen to be notational systems. Analog systems are non-notational languages that violate properties 2 and 5. Neither the character classes (syntax) nor the compliance classes (semantics) are finitely distinguishable. Analog systems are syntactically and semantically dense. Syntactic density implies that it is theoretically possible to find another character class between any two character classes, and semantic density implies that it is possible to find another compliance class between any two compliance classes. Maps drawn accurately to scale (and which therefore permit extrapolation) are an example of an analog diagrammatic representational system. Consider such a map in which a red dot indicates your current position. If this map is a character that denotes your current position in the world, and if a different placement of the dot on the map is another character denoting your position in the world in the past, then it is possible to find a third character in which the red dot is somewhere between the previous two locations and which denotes an intermediate position of yours in the world. The notions of notational and analog diagrammatic systems serve to characterize diagrammatic representations. The implications of such a characterization for diagrammatic communication (e.g., are analog systems easier to comprehend? are notational systems easier for computers to deal with? etc.) provide an excellent open area of research.

Sometimes diagrams represent dynamic processes in the world as well. These result in changes of state, i.e., creation, deletion or modification of entities ($\Delta E$) and of relationships ($\Delta R^n$), so that $\Delta S = \{\Delta E, \Delta R^n\}$.
When static diagrams are used to represent state changes, different disciplines use specific graphic symbols, agreed upon by convention, to represent the change. The use of arrows to depict motion is an example. Static diagrams are not the only means for representation of change. One can encode and convey information in dynamic diagrams as well. Dynamic diagrams, or animations, have been effectively used for a long time by cartoon movie makers to tell stories. With the availability of computer graphics techniques on personal computers, creating animations has become much more widely accessible. How can one characterize the semantics of such dynamic DRs? The dynamic syntax of a DR specifies how it may be transformed. Such transformations may include creating, deleting or modifying graphical primitives ($\Delta P$), and changing the attributes of graphical primitives and the spatio-visual relations between graphical primitives ($\Delta V^n$), in a DR. The conditions and constraints under which state transitions occur in the world that is being represented, and how these affect the objects, their attributes and relations, may be mapped to conditions and constraints under which a DR can be transformed to another and the nature of this transformation. In other words, $\Delta E$ and $\Delta R^n$ can be mapped to $\Delta P$ and $\Delta V^n$. When the world being represented is continuously changing, this mapping requires quantization of the continuous state changes so that a continuous change can be represented by a set of discrete DRs. An example is the time display employed by digital watches that simulate analog dials using liquid crystal displays so that the second hand has only 60 meaningful locations on the dial. The domain, range and type of this mapping are important. It may be one-to-one, one-to-many, many-to-one or many-to-many. The dynamic semantics of the DR is captured by this mapping.
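The quantization in the watch example can be made concrete with a short sketch. It assumes, purely for illustration, that world time is a continuous number of seconds and that the dial offers 60 equally spaced hand positions; the function names are hypothetical.

```python
import math


def second_hand_position(seconds, positions=60):
    """Quantize continuous elapsed seconds into one of `positions` dial slots."""
    fraction_of_minute = (seconds % 60) / 60
    return math.floor(fraction_of_minute * positions) % positions


def hand_angle_degrees(position, positions=60):
    """Angle of the displayed hand, measured clockwise from the 12 o'clock mark."""
    return position * 360 / positions


if __name__ == "__main__":
    # Many distinct world states (continuous times) share one discrete DR.
    for t in (12.0, 12.3, 12.9, 13.0):
        slot = second_hand_position(t)
        print(t, slot, hand_angle_degrees(slot))
```

Because every instant within the same second maps to the same slot, this particular world-to-representation mapping of dynamics is many-to-one.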
Typically this semantics is implicitly captured in the graphical procedures or rewrite rules employed by the system doing the animation. Repenning (1995) deviates from this practice by considering how to extend rewrite rules to explicitly capture dynamic semantics. The previously discussed theoretical notions seem to be applicable to this mapping of dynamics as well. But there is at present very little research on the dynamic syntax and semantics of DRs. While typically $\Delta E$ is mapped to $\Delta P$ and $\Delta R^n$ is mapped to $\Delta V^n$, other mappings are possible. There is no a priori reason for mapping static and dynamic aspects of the world to static and dynamic aspects of the DR respectively. The static syntax of a DR consists of the graphical primitives and spatio-visual relations. Its dynamic syntax specifies how DRs may be transformed. This describes the representations. The represented, the world, includes both states of the domain in terms of entities and relations, its static semantics, and state transitions, its dynamic semantics. The central issue for the construction of a DR then is how to represent the static and dynamic semantics of the world using the static and dynamic syntax of the DR. There are four possible mappings: (1) $S \to DR$, (2) $\Delta S \to \Delta DR$, (3) $S \to \Delta DR$, and (4) $\Delta S \to DR$. (1) and (2) are commonly occurring mappings. For example, the CARTOONIST program (Hübscher, 1997) represents a microworld consisting of moving balls and stationary walls by mapping balls and walls to circles and rectangles on a computer display, and mapping the motions of balls to the corresponding motions of circles on the display. One example of (3) is the use of blinking or flashing graphical objects in graphical user interfaces to attract the user’s attention to some state of the system. An example of (4) is the use of arrows, static graphical primitives, to denote world dynamics such as motions or forces.
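The four pairings of static and dynamic aspects listed above, and the finer breakdown by component sets and mapping type that the text turns to next, can be enumerated as a simple cross product. This is only an illustrative sketch; the labels are hypothetical names for the static and dynamic aspects of the world and of the DR, and for the two sets that describe each of them.

```python
from itertools import product

WORLD_ASPECTS = ["state", "state change"]        # static vs dynamic semantics
DR_ASPECTS = ["DR", "DR change"]                 # static vs dynamic syntax
WORLD_SETS = ["entities", "relations"]           # the two sets describing the world side
DR_SETS = ["primitives", "visual relations"]     # the two sets describing the DR side
MAPPING_TYPES = ["one-to-one", "one-to-many", "many-to-one", "many-to-many"]

# Four top-level pairings of world aspect and DR aspect.
top_level = list(product(WORLD_ASPECTS, DR_ASPECTS))

# Each pairing splits into four component mappings (world set to DR set).
component_level = list(product(top_level, WORLD_SETS, DR_SETS))

# Each component mapping may be of any of the four mapping types.
all_possibilities = list(product(component_level, MAPPING_TYPES))

if __name__ == "__main__":
    print(len(top_level), len(component_level), len(all_possibilities))
```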
Each of the above four possibilities includes four possible mappings, since $S$, $\Delta S$, $DR$, and $\Delta DR$ are describable by two sets of items each. Each of these mappings may in turn be one-to-one, one-to-many, many-to-one or many-to-many, providing a total of 64 possibilities. These analyses provide the beginnings of a theoretical foundation for characterizing the static and dynamic syntax and semantics of diagrammatic representations. To learn more about practical, psychological, and theoretical aspects of world-to-DR mappings, consult the following: Bertin (1981), Gombrich (1968), Marriott & Meyer (1997), Tufte (1983; 1990; 1997), and Wang (1995).

Analyzing diagrammatic representational systems

Most areas of human intellectual inquiry have developed their own diagrammatic notations with corresponding diagramming conventions. Schematic diagrams used in mechanical engineering, structural diagrams that civil engineers use, diagrammatic forms of the periodic system and structures of chemical elements, Feynman diagrams used in physics, node-and-link diagrams popular in computer science, and weather maps are but a few examples of the multitude of such diagrammatic representational systems in use. Scientists’ use of law encoding diagrams, a formal diagrammatic representational system useful for encoding laws or principles of a domain using diagrammatic structures, has been investigated (Cheng, 1996a), and this representational system has been used in instruction (Cheng, 1996b). Euler’s system of using circles to represent inclusion of elements in sets, and its role in syllogistic reasoning, is discussed in (Stenning & Oberlander, 1995). A similar notation, Venn diagrams, is thoroughly analyzed by Shin (1995). Systematic studies of such specific representational systems, aimed at deriving their formal properties and uncovering their cognitive benefits (their adoption by a community indicates that they must aid diagrammatic communication in some way or the other), are relatively few.
Consequently, this is another avenue for future research.

Formalizing diagrammatic interaction

Bottoni and colleagues (Bottoni et al., 1997) have begun to formalize the processes of humans interacting with a computer system through a graphical interface. Their theoretical characterization is based on the communication perspective that Figure 2 illustrates. They consider each state of the graphical user interface (appearing as an interactive diagram on the computer screen) as a component of a sentence of a visual language. Each visual sentence is formally specified as a 4-tuple: the image on the computer screen, a description of what the image means (i.e., a description of its programmatic implication for the underlying computer system), an interpretation function from image to description, and a materialization function in the reverse direction. Given this theoretical framework, visual sentences are characterized in terms of whether components of every image in every visual sentence can be interpreted in terms of programmatic components, whether every programmatic component has an associated image component visible on the display, and whether the user can interact with, and receive feedback from, every image component that is visible. This leads to a class hierarchy of visual languages for interaction.

4.2 Psychological Research

Psychological research on diagrammatic communication spans a wide variety of investigations. The cognitive processes involved in the use of DRs are those of perceiving the diagram, comprehending its meaning, making inferences about the state of affairs depicted by it, and manipulating the diagram or generating new ones to convey results of reasoning.
Recent research efforts have focused on developing cognitive models of diagram comprehension and diagrammatic reasoning through experimental investigations of how people understand various kinds of diagrams, sometimes in conjunction with other kinds of representations such as text, and make inferences or solve problems. There are also studies of how people generate and manipulate external diagrams in the course of problem solving activities. Another strand of research has looked at educational issues and implications of using diagrams in textbooks and instruction. Then there is historical research, primarily concerned with visual thinking and its reported role in the history of scientific discoveries. Issues of diagram perception fall within psychological and physiological research on visual perception, which is excluded from this chapter. Figure 6 shows the corresponding secondary levels of the research taxonomy.

Figure 6. Secondary Levels of a Taxonomy for Diagrammatic Communication Research (psychological aspects: mental imagery; comprehending and reasoning with diagrammatic representations; generating diagrammatic representations; educational uses of diagrammatic representations; historical investigations of diagram use)

Mental Imagery

Research on mental imagery provides one foundation for recent research into the psychological aspects of diagrammatic reasoning with both external and internal diagrams. However, since it is not directly related to diagrammatic communication, I will not discuss it further. There is an abundance of resources that the interested reader can tap to learn more about the history of research on mental imagery, particularly the famous imagery debate about whether analog or propositional mechanisms underlie the phenomena of imagery (Block, 1981; Cornoldi & McDaniel, 1991; Finke, 1990; Kosslyn, 1980; 1981; 1994; Paivio, 1971; Pylyshyn, 1981; Roskos-Iwoldsen, et al., 1993; Tye, 1991; Yuille, 1983).
Comprehending and reasoning with diagrammatic representations

How does one comprehend a diagram? Diagrams are typically combined with textual explanations, as most textbooks of scientific disciplines show. How does one understand such mixed-mode descriptions? Researchers have examined these fundamental questions in the context of mechanical device diagrams such as the one shown in Figure 7 (Hegarty, 1992).

Figure 7. A Pulley System

One model of diagrammatic comprehension and reasoning that has emerged from this research (Narayanan & Hegarty, 1997) postulates that diagram comprehension is a constructive process in which the individual attempts to use his or her prior knowledge of the domain, information presented in the diagram, and his or her reasoning skills to build a mental model of the situation or artifact described in the presented materials. It can be seen as an extension of models of text processing that view comprehension as the construction of a mental model of the referent of the text (e.g., van Dijk & Kintsch, 1983). According to this model, comprehending and reasoning with mechanical device diagrams, possibly with accompanying text, involves the following stages (not necessarily occurring in the given order): decomposition, recomposition, determination of activity propagation paths, and dynamization. Details of this model are shown in Figure 8 and explained below.
Figure 8. A Model of Diagrammatic Comprehension and Reasoning (stages: decomposition of the device's diagram; construction of a static mental model, by making representational connections between representations of visual elements and prior knowledge about components depicted by the elements and between representations of different components, and by making referential connections between verbal and visual elements in external displays with the same referent, between visual elements in external displays with the same referent, and between elements in external displays and internal representations of their referents; determination of causal activity propagation paths in the device; construction of a dynamic mental model, by mental animation of the static model and rule-based inference of component behaviors)

Decomposition. Decomposition involves parsing the diagram into its elementary units. Diagrams of mechanical devices are made up of elementary shapes, such as rectangles, circles and cylinders, which represent objects such as pistons, gears and tubes. The first step in comprehension is to parse the connected diagram into these elementary shapes, i.e., units that correspond to objects. This process is analogous to identifying discrete words and clauses in a continuous speech sound and probably relies largely on perceptual mechanisms of object recognition (Biederman, 1987; Marr & Nishihara, 1978).

Recomposition. This involves constructing a static mental model of the referent of the diagram (e.g., the device that the diagram represents) by making appropriate representational and referential connections in memory. There are two types of representational connections: connections to prior knowledge and connections to the representations of other machine components (Mayer & Sims, 1994), and there are referential connections among elements of the external and internal representations.

Connections to prior knowledge.
First, one must recognize the components, that is, make representational connections between the identified diagrammatic elements and prior knowledge about their real-world referents - a process analogous to lexical access in language comprehension. For example, one might represent that a rectangle stands for a piston. Prior knowledge can also provide additional information about components, such as what these are typically made of and if they are rigid or flexible. This information is valuable in making inferences about how components move and constrain each other’s behaviors.

Connections to the representation of other machine components. Second, one must internally represent the spatial relations (indicated in the diagram) between different device components by building connections between the internal representations of these components. In understanding how a device works, information about the spatial relations between mechanical components forms a basis for inferences about the motions of components, because these spatial relations influence how components affect and constrain each other’s motions. Knowledge of spatial relations also aids in guiding the reasoning process along the chain of causality in the device (Hegarty, 1992; Narayanan, Suwa & Motoda, 1994b).

Making referential connections. When diagrams are accompanied by text (as is usually the case) an additional stage in comprehension is that of resolving coreference between the two media, i.e., making referential links between a noun phrase in the text (e.g., "the piston") and the diagrammatic unit that depicts its referent (e.g., a rectangle) (Novak & Bulko, 1993). This step is crucial to constructing an integrated representation of the common referent of the text and diagram in memory as opposed to separate surface-level representations of the text and diagram.
Making referential connections is also a necessary process when viewers have to integrate information from multiple diagrams of the same device (e.g., two schematic diagrams showing two different cross sections of the same device) or to construct an internal 3-dimensional representation of the device from diagrams showing different perspective views. Another kind of referential connection associates elements of the external representations with the corresponding elements of one’s internal representations.

Determining the causal propagation of activities in the device. The previous stages help create a static understanding of the device that the diagram represents. However, if one is asked to predict how such a device operates, it triggers diagrammatic reasoning to infer the device’s dynamics and kinematics. It has been found that people tend to reason about a device’s operation, using its diagram, along the paths of causal propagation in the device (Hegarty, 1992; Narayanan, Suwa & Motoda, 1994b). Therefore, a stage of identifying the potential causal chains of events in the operation of the device seems necessary for successful diagrammatic reasoning.

Dynamization. This is a word I coined to denote the process of converting the static mental model constructed as a result of diagram comprehension into a dynamic one, i.e., incorporating inferences about the operation of the device into the existing mental model. This is accomplished by inferring and integrating the dynamic behaviors of individual components. This process involves both mental visualization of the component behaviors and rule-based inference.
Cognitive and computational models (Hegarty, 1992; Narayanan, Suwa & Motoda, 1994a; 1994b) suggest that this is an incremental process (as shown in Figure 9) in which the reasoner considers the components or subsystems individually, assesses the influences acting on each, infers the resulting behavior of each, and then proceeds to consider how this behavior affects the next component or subsystem in the causal chain.

Figure 9. Diagrammatic Reasoning about Device Behaviors (steps: select the most recent hypothesis about a component's behavior from working memory; retrieve prior knowledge, if available, about the component and its hypothesized behavior; scan the diagram to retrieve information about spatial relations that exist between this and other components; generate new hypotheses about which other components are affected by this component, by rule-based inference applied to prior knowledge and spatial relations between components or by internally visualizing the component behavior; add the new hypotheses to working memory)

Inferring the motion of each component can involve either rule-based reasoning processes that utilize prior knowledge or mental animation processes that detect component interactions from a mental visualization of component behaviors. In some cases, a rule of mechanical reasoning is available because it has been explicitly learned, or generalized from a series of trials in which simulation processes were used (Schwartz & Black, 1996a). An example of such a rule is that every other gear in a gear chain turns in the same direction. In other cases, no such rules are available, and inference of component behavior is based on a mental simulation of the component behaviors (Narayanan, Suwa & Motoda, 1995b; Schwartz & Black, 1996b).
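The incremental, rule-based part of this process can be illustrated with a toy program for a gear chain. This is a sketch under stated assumptions rather than an implementation of any of the cited models: the device is given as a hypothetical dictionary of which gears mesh with which, the only rule is that two meshed gears turn in opposite directions, and mental animation is not modelled at all.

```python
OPPOSITE = {"clockwise": "counterclockwise", "counterclockwise": "clockwise"}


def infer_gear_behaviors(meshes, start, start_direction):
    """Propagate turning directions along the causal chain of meshed gears.

    meshes: dict mapping each gear name to the gears it touches
            (a stand-in for spatial relations read off the diagram).
    """
    behaviors = {start: start_direction}
    working_memory = [(start, start_direction)]   # hypotheses, most recent last
    while working_memory:
        gear, direction = working_memory.pop()    # select the most recent hypothesis
        for neighbor in meshes.get(gear, []):     # "scan the diagram" for neighbors
            if neighbor not in behaviors:
                # Rule-based inference: meshed gears turn in opposite directions.
                behaviors[neighbor] = OPPOSITE[direction]
                working_memory.append((neighbor, behaviors[neighbor]))
    return behaviors


if __name__ == "__main__":
    chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(infer_gear_behaviors(chain, "A", "clockwise"))
```

In the four-gear chain, gears A and C receive the same direction, as do B and D, which is the "every other gear" rule emerging from repeated application of the local meshing rule.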
As a result, constructing a dynamic mental model of a device is best thought of as a hybrid reasoning process in which people can use either rule-based or imagery-based inference processes, depending on their knowledge of relevant rules of mechanical inference, and beliefs about the efficacy of the two types of reasoning processes in a particular situation (Schwartz & Hegarty, 1996).

One implication of this model is that prior knowledge is critical to accurate diagrammatic reasoning. Lowe (1994b) found that prior knowledge, such as the level of domain expertise and knowledge about diagramming conventions, determined the difference between expert and novice performance in a diagrammatic prediction task. It is not sufficient to have relevant knowledge; one must also be able to retrieve and apply it appropriately during diagrammatic reasoning. Narayanan and colleagues (Narayanan, Suwa & Motoda, 1994b) found that visual elements of a diagram help cue and retrieve relevant prior knowledge from long term memory during diagrammatic reasoning. Koedinger and Anderson (1990) propose that prior knowledge in the form of diagram patterns aids experts in solving geometry proof problems stated in terms of diagrams.

This model suggests that the spatio-visual primitives and relations in a good diagrammatic representation should be explicit and recognizable. There should not be elements that are not relevant - i.e., distracting details should be omitted. In addition, configurations of primitives in a diagram must be amenable to easy and unambiguous decomposition. The reverse mapping from $P$ and $V^n$ of the DR to $E$ and $R^n$ in the world should be obvious or easily learnable. Furthermore, this model indicates the potential cognitive utility of meaningful animations. Narayanan and Hegarty (1997) describe a project aimed at developing guidelines based on this model for the design of hypermedia information presentation systems in the domain of explaining how machines work.
This project takes research on diagrammatic reasoning into the domain of human computer interaction by proposing how one can apply what is known about diagrammatic reasoning to the design of interactive hypermedia information presentation systems.

The model above applies to concrete diagrams that depict components of the object being represented. So it is not applicable to comprehension of abstract diagrams. Graphs are a particularly abstract type of diagram, and it has been suggested that comprehending graphs is a serial and incremental process similar to text comprehension (Shah, 1995).

Generating diagrammatic representations

Most of us have drawn doodles - freehand sketches - during the course of thinking about some issue or solving a problem. For most designers, whether they be artists, architects or engineers, these sketches are not mere idle creations. They are external manifestations of the cognitive processes involved in synthesis, and play a significant role in the creative process. Research on this facet of diagrammatic reasoning, namely, how drawing aids the creative process, is so far mostly confined to studies of architects and graphic designers. From a study of twelve professional designers, Goel (1995) concludes that the syntactic and semantic density of freehand sketches helped designers in fluidly transforming elements of their sketches and correspondingly facilitated transformations of their design ideas. In other words, freehand sketching aids creative designing. Suwa and Tversky (1997) studied architects to find that they perceive and think about three kinds of information during design sketching: spatial relations among elements, emergent properties of elements and functional implications. Moreover, the design process itself proceeded in cycles involving two distinct kinds of cognitive activity: opportunistically shifting attention to a new design topic, or consecutively exploring a series of related issues.
Expert architects were better able to exploit sketches for the second activity, thereby developing more complex design ideas. Emergent properties of the sketches seem to have facilitated focus shifts whereas spatial relations encouraged exploring related ideas. Goldschmidt (1991) observes that “seeing as” and “seeing that” are two ways of inspecting sketches drawn during design. The former is a powerful means of visual thinking for the designer to gain access to related mental images that may potentially trigger new ideas in the design problem being solved. Akin and Lin (1995) observe from an experimental study that novel design decisions most often occurred when designers were engaged in three activities - drawing, thinking and examining. Do and Gross (1996; 1997) discuss how diagrams, drawing, and inspecting sketches function as vehicles for design reasoning in architecture. For a children’s developmental perspective on drawing, see Van Sommers (1984). Qin and Simon (1995) report a study in which diagrams drawn by subjects, along with other evidence, were used to investigate how they formed and used mental images in the course of trying to understand complex technical material on relativity. They found, similar to Akin and Lin, that forming images, watching images and reasoning were the three main activities, and that the kind of images formed had an influence on problem solving performance. The paucity of literature on the role of diagram generation in domains other than architecture clearly indicates that this is an issue that deserves future research attention.

Educational uses of diagrammatic representations

Are diagrams useful for instruction? Judging by the abundance of textbooks containing diagrams, it seems that they certainly are. Diagrams are very commonly used in technical domains to introduce students to key aspects. It is assumed that the selectivity and graphical presentation of diagrams will provide instructional benefits.
However, many recent studies, a sample of which I review here, indicate that diagrammatic representations can neither be a panacea nor replace text and other components of good instructional practice.

Weather maps are relatively simple diagrams that encode complex atmospheric processes. By examining a weather map, professional meteorologists are able to predict how it will evolve as a result of these processes. This is a sophisticated kind of diagrammatic reasoning due to the complexity of the processes represented by the diagrams. Lowe has been investigating how novice students make sense of weather maps compared to expert meteorologists (Lowe, 1989; 1993; 1994a; 1994b; 1996a). He discovered that students, lacking in domain-specific knowledge, interpreted these diagrams based on low level graphical characteristics alone and their interpretations were constrained by the diagram’s selectivity and were therefore spatially and temporally impoverished. Experts, on the other hand, used rich domain knowledge to elaborate the structures present in diagrams to construct interpretations incorporating the broader meteorological situation implied by the diagram. When dynamic diagrams (interactive animated weather maps) were employed to counter the problem, mixed results were obtained (Lowe, 1996b). It was found that the realistic appearance of dynamics shown by the animations resulted in novices’ inappropriate attributions of cause-effect relations to entities that were not so related.

Hyperproof (Barwise & Etchemendy, 1994) is a system that was designed to teach logic to students by allowing students to construct logic proofs assisted by the computer. Its representations consist of geometric diagrams and logic sentences. Its diagrams show configurations of three-dimensional geometric objects.
In studying how well it helps students learn logic in comparison with traditional teaching methods, Stenning and colleagues (Cox, Stenning & Oberlander, 1995; Stenning, Cox & Oberlander, 1995) found large individual differences between students in how they respond to teaching in the graphical and sentential modalities, and those differences extended to self-generated external representations produced during untutored problem solving. Students classified as visualizers performed better than students classified as verbalizers following a logic course taught using Hyperproof. Interestingly, visualizers did not show a preference for the graphical modality over verbalizers. But they were more adept at strategically switching between diagrammatic and sentential representations compared to verbalizers. Researchers in software visualization in general (Price, Baecker & Small, 1993), and algorithm animation in particular (Brown & Sedgewick, 1985), strive to create static and dynamic diagrams for displaying the structure and operation of algorithms and software for the benefit of humans. In the case of algorithm animations, the target audience is students. The implicit hope of early designers of algorithm animation systems had been that these would prove to be beneficial instructional systems that make it easy for students to learn what was otherwise conceptually complex material. This hope was based on the assumption that animated diagrams, by providing concrete visualizations of abstract mathematical operations, would automatically make those more comprehensible. Recent research on the cognitive effectiveness of these animations has, however, also unearthed mixed results (Byrne, Catrambone & Stasko, 1996; Lawrence, Badre & Stasko, 1994; Stasko, Badre & Lewis, 1993). The researchers concluded, as in the case of weather maps, that students’ prior knowledge was an important determinant of the benefits they derived from an animation. 
Too little of it hampered a student’s ability to understand the algorithm-to-diagram mapping and use it to comprehend the operation of the algorithm; too much of it made the animation redundant and uninteresting to a student. They conjecture that for novice students comprehensive motivational instruction needs to accompany the animation, animation design needs to be tied closely to instructional goals, and that allowing students to interact with or even build their own animations may prove to be more effective than passive viewing. Petre, Blackwell and Green (1997) raise a number of related open questions regarding the cognitive aspects and benefits of software visualization.

In summary, the question of whether diagrams are useful for instruction may be refined to questions such as: Are students able to more easily understand material from diagrams? Are diagrammatic representations equally useful to all students? Do animated diagrams make conceptually complex material easy to grasp? In each case the answer appears not to be an unequivocal yes. Instead it is a qualified yes: provided the students have sufficient domain-specific knowledge of the subject and of the subject-to-diagram mapping to interpret diagrams correctly, provided the students’ cognitive style is visually oriented, and provided the students already have enough understanding of the basics to exploit information presented by the animation. These studies covered both static and dynamic diagrams, as well as pure diagrams and mixed diagrammatic and sentential representations.
The general conclusion ought to be that unless a student is equipped to carry out appropriate interpretation and comprehension processes when given a diagram that represents information about a scientific or technical domain, the diagram’s potential as an instructional resource may be in question regardless of whether it is static or dynamic, or used by itself or in conjunction with other kinds of representations. A number of additional open questions regarding diagrammatic representations in education are raised by Brna, Cox and Good (1997).

Historical investigations of diagram use

Historical and anecdotal evidence provides plenty of pointers to many famous scientists, including Galileo and Tesla, having relied upon mental imagery and external diagrammatic representations on their way to major discoveries and inventions (Nersessian, 1995; West, 1991). Cheng (1996a) proposes that a formal diagrammatic representational system called law encoding diagrams characterizes many diagrams constructed by scientists. The historical records left by some of these scientists have been extensively studied. Nersessian (1997) has investigated intermediate representations, including diagrams, that Maxwell employed. The work of Faraday has been analyzed by Gooding (1996) and others (Gooding & James, 1985). Because one has to rely almost exclusively on historical records such as scientists’ notes, this remains a difficult subarea that has seen relatively little research.
4.3 Computational Research

Research on diagram-related computational processes may be classified into four areas: work in computer graphics that pertains to the creation and manipulation of various kinds of two- and three-dimensional diagrams and animations; the extensive research on parsing, interpreting, compiling and executing visual programming languages; work in human-computer interaction and direct manipulation interfaces that deals with diagrammatic (or iconic) interfaces; and work in artificial intelligence on systems that can understand and reason with various kinds of diagrammatic representations. This provides us with another part of the taxonomy, as shown in Figure 10. Computer graphics and animation is an independent and vibrant research area in its own right, so I exclude it from this chapter.

[Figure 10. Secondary Levels of a Taxonomy for Diagrammatic Communication Research. The category "Computational aspects" branches into: generating, manipulating and animating diagrams; visual programming languages; diagrammatic interaction and interfaces; intelligent diagrammatic reasoning systems.]

Visual programming languages

Diagrams have always been used in computer science to represent data structures, the flow of control in programs, and the flow of data through program components. A variety of diagrammatic representational systems have been proposed for these purposes. Of these diagram types, flowcharts are perhaps the best known example; but other kinds of diagrams such as state transition diagrams, data flow diagrams, Petri net diagrams, etc., can also be easily found in most computer science textbooks. These diagrams have generally been used to improve human-human communication, for example, in instruction. Researchers investigating visual programming languages want to take such diagrammatic representations one step further, to serve as visual expressions with which one communicates with and develops programs for the computer.
In other words, the aim is to develop human-computer diagrammatic communication in service of the programming activity. Since this is a subarea that has been independently developing for more than a decade, I will only provide pointers here. The interested reader may start with two recent comprehensive surveys of the field (Marriott, Meyer & Wittenberg, 1997; Narayanan & Hübscher, 1997) and move on to a two-volume tutorial (Glinert, 1990) for details. Two excellent sources of current information are a series of annual conferences devoted to the topic (IEEE, serial) and a journal (JVLC, serial).

Diagrammatic interaction and interfaces

Using diagrammatic representations to facilitate interaction and communication has been a predominant theme in the research on human-computer interaction (HCI) and graphical user interfaces (GUIs). The paradigm of representing states of a program or a computer (e.g., its file structure) graphically using diagrammatic representations called icons, and of allowing the user to communicate operations to the computer and the computer to communicate results of those operations to the user by manipulating these icons, originated with a program called Pygmalion (Smith, 1977). The evolution of this idea into the notion of direct manipulation (Shneiderman, 1983) has revolutionized the development of user-friendly interfaces. Direct manipulation allows users to execute actions by directly interacting with visually displayed objects instead of having to describe the action. For instance, marking a file for deletion in a graphical user interface can often be done by dragging the file’s icon into an icon depicting a trash can. This executes the command (move <file> trash-folder). With direct manipulation users can “depict” the operations they desire; without it they would have to describe the command in text to an interpreter that translates and executes it.
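The trash-can example can be made concrete with a small sketch. The Python fragment below is illustrative only: the names `Icon`, `Desktop`, `move` and `drop` are hypothetical and do not correspond to any real windowing toolkit or to Pygmalion. It shows how a drag-and-drop gesture might be translated into the same command that a user would otherwise have to type.

```python
from dataclasses import dataclass, field


@dataclass
class Icon:
    """A diagrammatic stand-in for a file or folder (hypothetical type)."""
    name: str
    kind: str  # "file" or "folder"


@dataclass
class Desktop:
    """Toy model of an iconic interface; not a real GUI toolkit."""
    # Maps a folder name to the names of the files it currently holds.
    folders: dict = field(default_factory=dict)
    # Records the textual commands that the gestures stood for.
    log: list = field(default_factory=list)

    def move(self, file_name: str, target: str) -> None:
        """The textual command that a drag gesture stands for."""
        for contents in self.folders.values():
            if file_name in contents:
                contents.remove(file_name)
        self.folders.setdefault(target, []).append(file_name)
        self.log.append(f"(move {file_name} {target})")

    def drop(self, dragged: Icon, onto: Icon) -> None:
        """Translate the gesture 'drag one icon onto another' into a command."""
        if dragged.kind == "file" and onto.kind == "folder":
            self.move(dragged.name, onto.name)


desktop = Desktop(folders={"home": ["report.txt"], "trash-folder": []})
desktop.drop(Icon("report.txt", "file"), Icon("trash-folder", "folder"))
print(desktop.log)
print(desktop.folders)
```

In this sketch the user never sees the string recorded in `log`; the gesture on the two icons is the whole of the interaction, which is the point of direct manipulation.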
The cognitive benefits of direct manipulation arise from the elimination of the need to move between two vastly different representations: the natural diagrammatic representation (for example, the metaphorical desktop employed these days by almost all PCs) which permits interactive gestures and the underlying textual program inside the computer which specifies the objects and behaviors visible on the screen. Such interfaces are useful and usable because they have the WYSIWYG property. WYSIWYG is an abbreviation of “what you see is what you get” - a common phrase used to describe many graphical user interfaces. This phrase arose from the fact that many GUIs are designed to show exactly what the user can expect from an action. Printing a document will result in a paper version that looks exactly like how the document appeared on the screen of the word processor. To print a file one might click on its icon - a diagrammatic representation - and drag it into a visible printer icon on the computer screen - another diagrammatic representation. These kinds of graphical interfaces exemplify the intuitive appeal diagrammatic representations hold: these explicitly depict what is being represented. Unlike textual labels that bear no visual resemblance to the object being described, icons look like the real thing and so icon manipulation is much more intuitive. In fact, an intuitively compelling reason for the widely prevalent use of diagrams for representing and conveying information in all areas is the WYSIWYG property. Shneiderman (1997), Mullet and Sano (1995), and Hix and Hartson (1993) are a few of the many excellent books available on HCI and GUI. Increasing popularity of interactive graphical simulations that combine notions of diagrammatic reasoning and direct manipulation is an emerging trend in HCI. These have become an important tool in education (CACM, 1996). Many user friendly tools for creating such simulations are becoming available. 
Some examples are KidSim™ from Apple Computer (Smith, Cypher & Spohrer, 1994), StarLogo from MIT Media Laboratory (Resnick, 1996), and Agentsheets from the University of Colorado (Repenning & Sumner, 1995). Such simulations employ direct manipulation techniques to diagrammatically represent (on the computer display) the model to be simulated, and allow the user to directly interact with it and produce animations depicting time-varying processes such as predator-prey relationships in an ecosystem. Since users can directly manipulate objects in the simulation without having to access their programmatic representations, the need to move between two vastly different representations - the natural dynamic visual representations of processes being simulated and the underlying textual program of the simulation - is eliminated.

Intelligent diagrammatic reasoning systems

Since this subarea is well covered by many publications (e.g., Glasgow, Narayanan & Chandrasekaran, 1995; Kulpa, 1994; Narayanan, 1992), I will discuss only a small sample of systems here to provide a flavor of the work. To prove theorems in elementary Euclidean geometry, Gelernter’s geometry theorem proving machine (Gelernter, 1963; Gelernter, Hansen & Loveland, 1963) used a backward reasoning strategy of working from the goal to be proved toward the premises and axioms. During this process the program used geometry diagrams in two ways. The first was to prune the search space by rejecting any subgoal that was not true in the diagram. The second was to shorten inference by assuming facts that are obviously true in the diagram. Thus, the geometry theorem proving machine used diagrams as a resource for constraining symbolic reasoning. Another way in which diagrams influence symbolic reasoning was discovered by Koedinger and Anderson (1990).
They found that human experts are able to recognize patterns, which they call diagram configurations, in the diagrams of geometry proof problems, and these patterns cued relevant problem solving knowledge from memory and thereby reduced search. The geometry theorem proving machine was one of the first computer programs that used diagrams intelligently to aid problem solving. Not surprisingly, geometry remains the most popular domain of application for intelligent diagrammatic reasoning systems (Lindsay, in press; McDougal & Hammond, 1993; Kim, 1989). Another popular domain has been qualitative spatial reasoning (Forbus, 1995). The earliest system in this domain was WHISPER (Funt, 1980), which could solve problems of motion and stability of variously shaped blocks in a two-dimensional world. This system is interesting in that, unlike most systems that directly process externally provided diagrammatic representations, it contains an artificial polar retina which can scan the input diagram, focus its attention on various parts of the diagram, read off information, and engage in visualization by manipulating the diagram. Though the system’s domain was somewhat simplistic, it served to explicate the potential power of diagrammatic reasoning. Narayanan and Chandrasekaran (1991) describe a system called DR that takes a different approach to the same problem, one more congruent with the literature on mental imagery: it activates a network of diagrammatic schemas called visual cases to find and apply matching cases. REDRAW (Tessler, Iwasaki & Law, 1995) is a system that takes as input diagrams that civil engineers typically draw to depict frame structures under load. The system uses the diagram to extract constraints that are then applied to symbolic knowledge to derive how the frame will deform. This knowledge is applied to the diagram to make suitable modifications.
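To make the earlier description of the geometry theorem proving machine more concrete, the following sketch illustrates the idea of using a diagram to prune subgoals during backward reasoning. It is not Gelernter’s program: the coordinates, the `diagram` dictionary, the tuple encoding of subgoals, and the helper names are all assumptions made for illustration. A candidate subgoal (here, that two segments are equal in length) is rejected if it is false in one concrete drawing that satisfies the premises, since a statement that fails in such a drawing cannot follow from those premises.

```python
import math

# A concrete drawing of an isosceles triangle ABC with AB = AC, where
# M is the midpoint of BC. The coordinates are hypothetical.
diagram = {
    "A": (0.0, 4.0),
    "B": (-3.0, 0.0),
    "C": (3.0, 0.0),
    "M": (0.0, 0.0),
}


def length(diagram, p, q):
    """Euclidean length of segment PQ as drawn in the diagram."""
    (x1, y1), (x2, y2) = diagram[p], diagram[q]
    return math.hypot(x2 - x1, y2 - y1)


def holds_in_diagram(diagram, subgoal, tol=1e-9):
    """Check a subgoal of the form ('equal', (P, Q), (R, S)) numerically."""
    kind, seg1, seg2 = subgoal
    if kind != "equal":
        raise ValueError(f"unsupported subgoal type: {kind}")
    return math.isclose(
        length(diagram, *seg1), length(diagram, *seg2), abs_tol=tol
    )


def prune(diagram, subgoals):
    """Keep only the subgoals that are true in the diagram."""
    return [g for g in subgoals if holds_in_diagram(diagram, g)]


candidates = [
    ("equal", ("A", "B"), ("A", "C")),  # true: the two equal sides
    ("equal", ("B", "M"), ("M", "C")),  # true: M is the midpoint of BC
    ("equal", ("A", "B"), ("B", "C")),  # false in this drawing (5 vs 6)
    ("equal", ("A", "M"), ("B", "M")),  # false in this drawing (4 vs 3)
]

surviving = prune(diagram, candidates)
print(surviving)
```

Only the surviving subgoals would be handed on to the symbolic prover. The diagram constrains the search but does not itself constitute a proof, which matches the role of the diagram as a resource for constraining symbolic reasoning.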
Yip (1991) reports on a program called KAM for the qualitative analysis of nonlinear systems, which incorporates in its repertoire a capability to reason about spatial and geometric aspects of the phase space diagrams of such systems. BEATRIX (Novak & Bulko, 1993) is the only computer program reported in the literature that can process multimodal input - diagram and text. It accepts physics problems stated in the form of a diagram accompanied by a textual explanation, as is typically found in physics textbooks. It then co-parses the text and the diagram, resolving coreferences along the way, and constructs a unified internal representation of the problem in a form suitable for another computer program that then solves it. One of the most interesting recent diagrammatic systems is the Electronic Cocktail Napkin (Do, 1995; Gross, 1995; Gross, 1996), intended as “electronic paper” for the architect or designer. Its input devices are a digitizing tablet and a cordless pen. It can recognize geometric elements of simple sketches users draw on the tablet. It can also be trained to recognize idiosyncratic personal symbols. The system can perform visual database searches, recognize and automatically maintain spatial relations between elements in a sketch (thereby relieving the designer of some tedium), and link a designer’s sketches to a simulation environment. Several systems that employ techniques from computer graphics and computer vision as well as artificial intelligence to “understand” technical drawings in the engineering domain have been reported (Dori, 1992; Joseph & Pridmore, 1992; Vaxiviere & Tombre, 1992). Anderson and McCartney (1995; 1997) address diagrammatic reasoning from a novel perspective, that of reasoning with and learning from multiple diagrams.
Using a simple representation for diagrams (a two-dimensional array of integers representing gray scale values of pixels) and compositions of simple diagrammatic operators, they are able to develop systems that operate in a variety of domains such as game playing, music notation, and weather prediction.

5 Conclusion

In this section I highlight open issues for future research revealed by the research taxonomy presented earlier, discuss special characteristics of diagrammatic representations that enable these to support and enhance the cognitive abilities of people, and list a number of information resources on the topic for the motivated reader to follow. Though the taxonomy proposed in this chapter is a preliminary one containing only two levels of categories, it already reveals potential avenues for future research. While quite a bit of work on characterizing the syntax and semantics of static diagrams has been done, this remains an open area of increasing importance for dynamic diagrams, given the current explosion of interest in multimedia and animation. Only very few, from among the variety of diagrammatic representational systems in formal and informal use in various disciplines, have been analyzed in depth. It is clear that this is a large uncharted territory. Another area that deserves significant future effort is analyzing the formal properties of diagrammatic interfaces and human-computer interaction through them, with a view to providing a strong theoretical foundation for the design and evaluation of such interfaces. Again, there is very little current work on this issue. In terms of psychological research, diagrammatic comprehension and reasoning have been studied so far only for a few specific representational systems such as cross-sectional schematics and graphs.
As in the case of formal analyses of such representational systems, there is a gap in our knowledge of cognitive processes involved in human diagrammatic communication using the variety of diagrammatic representational systems of different disciplines. A combination of formal and empirical investigations is required to answer questions such as whether there are general cognitive and computational processes involved in diagrammatic communication applicable across multiple representational systems. Another fascinating open topic is the role of diagram generation and manipulation in creative thinking and problem solving. Intuitively it appears that diagrams ought to aid learning. Indeed, it is hard to find textbooks in any discipline that are devoid of diagrams. On the other hand, as indicated in an earlier section, many recent studies paint, at best, a somewhat mixed picture. Are our intuitions about the cognitive benefits of static and dynamic diagrams wrong? Or is it that there is more to be learned about how to design good diagrams and animations to help novice students? A lot more research is needed before these questions can be definitively answered. Some help in this might come from studies by historians of science on how scientists have in the past used diagrams during the course of their investigations. While there are plenty of anecdotes about how diagrams and imagery might have played significant roles in critical insights of these people (e.g., a dream about a snake eating its own tail prompting Kekulé’s discovery of the structure of the benzene ring), extensive studies are confined to only a few scientists. This is another open territory. As far as computational aspects of diagrammatic communication are concerned, one problem is the lack of communication and cross-fertilization among the areas of computer graphics, visual programming languages and human-computer interaction, each of which has progressed relatively independently of the others.
Also, considering that more than thirty years have passed since Gelernter’s pioneering work was published, there is still relatively little research on intelligent diagrammatic reasoning and communication systems. The following characteristics of diagrammatic representations appear to be primarily responsible for why these afford effective comprehension and reasoning:
• explicit representation of information via visually perceivable aspects,
• spatially localized organization of related information,
• visual cueing of relevant prior knowledge,
• facilitation of mental animation, and
• reduction of complexity through constraining and guiding reasoning.
The explicitness of diagrammatic representations, especially when the spatio-visual relations used to encode information are visually analogous to the information being represented, facilitates comprehension. The availability of information to be “read out” from diagrams and other kinds of pictures is probably the reason behind the saying “a picture is worth ten thousand words”. This point is also illustrated by the mental exercise of trying to translate a complex diagram, such as an artist’s sketch of a natural scene, into a set of sentences. It becomes immediately clear that information loss will occur in this depiction-to-description translation. Once information has been discarded or lost in this translation, the reverse translation cannot be done uniquely. Diagrammatic representations help reduce search for information because they permit related information to be spatially organized in proximity. Spatio-visual properties such as color or texture not only convey information but also draw the reasoner’s attention to relevant objects and properties. Larkin and Simon (1987) provide an analysis of how spatial adjacency and connectedness can be used to reduce the complexity of reasoning in solving physics problems when the problem descriptions are accompanied by diagrams.
Scientific visualizations are a good example of how spatial organization together with spatio-visual properties such as color and density can be effectively used to convey considerable information in a concise manner. Besides encoding information, elements of a diagram act as cues that help retrieve relevant prior knowledge from memory. Thus, diagrammatic representations function not only as effective representations of information, but also as effective probes into memory that aid the reasoner in retrieving and applying relevant prior knowledge to the task at hand. This effect has been experimentally observed in both expert geometry problem solving and naive mechanical problem solving. When diagrams analogously represent entities of the world and their properties, one is able to transform the diagram mentally (mental animation) or externally (computer animation or sketching) in order to reason about dynamic processes in the world that change the represented entities and properties. This enables one to make inferences about the evolution of the represented world based on evolution of the diagram under constraints mirroring those that apply to the processes in the world. This has been illustrated by research on diagrammatic reasoning about geometry and about spatial behaviors of mechanical devices. A series of experiments on how people reason about mechanical devices from cross-sectional schematic diagrams have shown that diagrams guide the reasoning process along the lines of causal propagation in the operation of the device. People use an incremental reasoning strategy of predicting behaviors of individual components and propagating these to other components by exploiting spatial cues of adjacency and connectedness explicit in the diagram during mental animation. 
This strategy works most of the time because diagrams organize components spatially with component depictions reflecting their spatial organization; thus reasoning for inferring events in the operation of the device is constrained by the structure of the diagram. More generally, when the diagram of a problem organizes its elements in a way that corresponds to the internal structure of the problem, one can follow this structure to find a path to the solution. Even when given a descriptive representation of certain kinds of problems, it has been observed that people tend to mentally image diagrammatic representations in order to solve them (Huttenlocher, 1968). Thus for diagrammatic representations, the explicit availability of information, spatially localized organization of information, visual cueing of relevant prior knowledge, and support for constrained mental animation together serve to reduce the complexity of searching (in memory or in the diagram) for relevant information, and the complexity of reasoning. To see this for yourself, consider Figure 11 below. It shows two angles x and y formed by two parallel lines a and b and an intersecting line c. Given this description we know that the two angles are equal. Suppose you are now asked to answer yes or no to the following questions: If line c is moved up some distance so that it still intersects lines a and b, will x and y be equal? If line c is rotated about an end point so that it still intersects lines a and b, will x and y be equal? We can answer these quickly because of our ability to mentally simulate transformations of diagrams. Starting with an informationally equivalent sentential description of the situation in Figure 11 the same inferences can be made, but not as easily and directly.

[Figure 11. A Mental Animation Problem. Two parallel lines a and b are crossed by a line c, forming the angles x and y.]

Diagrams facilitate situated reasoning because these make serendipitous inferences, inferences by recognition, prediction by mental visualization, and cueing prior knowledge possible. As a first step toward developing a general cognitive theory of effective diagrammatic representations, Cheng (1996c) has proposed twelve ways in which diagrams can support human problem solving: showing spatial structure, capturing physical relations, showing physical assembly, delineating elements, displaying values, depicting states, depicting state spaces, encoding temporal aspects, abstracting process information, capturing laws, doing computations, and sequencing computations. All these nice properties, of course, do not imply that diagrams are the best representation for all tasks. Diagrams can mislead as well as lead. Investigating the significance of all these aspects in various contexts of diagrammatic communication - in different disciplines and using various kinds of diagrammatic representational systems - and developing guidelines for the design of “good” diagrams that exploit these properties are excellent avenues for future research. It must however be noted that the human visual system is quite sophisticated and attuned to the perceptual modality. For such information processors diagrammatic representations can be more efficient than sentential ones, but the opposite situation holds for processors that are attuned to propositional representations, such as computers. Thus, informationally equivalent representations can have different computational complexity depending on the operations performed on them and the nature of the underlying information processing architecture that performs the operations (Larkin & Simon, 1987). This probably explains why it has proved much harder to develop intelligent diagrammatic systems than intelligent symbolic systems.

Additional resources.
Though interest in diagrams goes back a long way, the relatively recent convergence of the disciplines of psychology, philosophy, linguistics and artificial intelligence under the umbrella of cognitive science, and society’s increasing reliance on multimedia information, have provided the necessary impetus for a resurgence of interest in the topic. There are several resources that the serious reader can now follow to learn more about recent research. Several books and monographs have been published (Allwein & Barwise, 1996; Glasgow, Narayanan & Chandrasekaran, 1995; Hammer, 1995; Shin, 1995). There is a world wide web site devoted to the topic (http://uhavax.hartford.edu/Diagrams) from which one can also access an electronic discussion list. One journal special issue (Narayanan, 1993) has been published, and five workshops exclusively devoted to the topic have been held during the last five years; proceedings of two are available in print (Damski & Narayanan, 1996; Narayanan, 1992) and those of another are accessible through the web (Blackwell, 1997). Besides, a number of conferences (e.g., Annual Conferences of the Cognitive Science Society, International Joint Conferences on Artificial Intelligence, AAAI National Conferences on Artificial Intelligence, IEEE Annual Symposia on Visual Languages) contain papers and tracks on diagrammatic communication.

Acknowledgments. I would like to thank Boicho Kokinov for his invitation to present a tutorial course on diagrammatic reasoning at the Third International Summer School in Cognitive Science, and the faculty and students of the Cognitive Science Department, New Bulgarian University for organizing and running the summer school. Financial support for attending the summer school was provided by the Open Society Foundation, Sofia; thanks go to Maria Popova.
The preparation of this chapter was supported in part by grants from the Office of Naval Research (contract N0001496-11187) and the National Science Foundation (contract CDA-9616513).

REFERENCES

Akin, O. & Lin, C. (1995). Design protocol data and novel design decisions. Design Studies, 16(2), pp. 211-236.
Allwein, G. & Barwise, J. (Eds.) (1996). Logical Reasoning with Diagrams, New York: Oxford University Press.
Anderson, M. & McCartney, R. (1995). Inter-diagrammatic reasoning. Proc. 14th International Joint Conference on Artificial Intelligence (IJCAI-95), Mountain View, CA: Morgan Kaufmann.
Anderson, M. & McCartney, R. (1997). Learning from diagrams. Machine Graphics and Vision, in press.
Arnheim, R. (1969). Visual Thinking, Berkeley, CA: University of California Press.
Barwise, J. & Etchemendy, J. (1994). Hyperproof, Cambridge, England: Cambridge University Press.
Bertin, J. (1981). Graphics and Graphic Information Processing, English translation by W. J. Berg and P. Scott, Berlin: Walter de Gruyter.
Bertin, J. (1983). Semiology of Graphics: Diagrams, Networks, Maps, English translation by W. J. Berg, Madison, WI: University of Wisconsin Press.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, pp. 115-147.
Blackwell, A. F. (1997). Proceedings of the Thinking with Diagrams Workshop, Portsmouth, England, URL http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Block, N. (1981). Imagery, Cambridge, MA: MIT Press.
Bottoni, P., Costabile, M. F., Levialdi, S. & Mussio, P. (1997). Specification of visual languages as means for interaction. In K. Marriott and B. Meyer (Eds.), Visual Language Theory, Berlin: Springer-Verlag.
Brna, P., Cox, R. & Good, J. (1997). Learning to think and communicate with diagrams. Discussion paper prepared for the Thinking with Diagrams Workshop, Portsmouth, England, available at http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Brown, M. H.
& Sedgewick, R. (1985). Techniques for algorithm animation. IEEE Software, 2(1), pp. 28-38.
Byrne, M. D., Catrambone, R. & Stasko, J. T. (1996). Do algorithm animations aid learning? Technical Report GIT-GVU-96-18, GVU Center, Georgia Institute of Technology, Atlanta, GA.
CACM, (1996). Special section on educational technology, Communications of the ACM, 39(4).
Cheng, P. C.-H. (1996a). Scientific discovery with law encoding diagrams. Creativity Research Journal, 9(2/3), pp. 145-162.
Cheng, P. C.-H. (1996b). Law encoding diagrams for instructional systems. Journal of Artificial Intelligence in Education, 7(1), pp. 33-74.
Cheng, P. C.-H. (1996c). Functional roles for the cognitive analysis of diagrams in problem solving. Proc. 18th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Lawrence Erlbaum, pp. 207-212.
Cornoldi, C. & McDaniel, M. A. (Eds.) (1991). Imagery and Cognition, New York: Springer-Verlag.
Cox, R., Stenning, K. & Oberlander, J. (1995). The effect of graphical and sentential logic teaching on spontaneous external representation. Cognitive Studies: Bulletin of the Japanese Cognitive Science Society, 2(4), pp. 56-75.
Damski, J. & Narayanan, N. H. (Eds.) (1996). Proceedings of the AID’96 Workshop on Visual Representation, Reasoning and Interaction in Design, Key Center for Design Computing, University of Sydney, Sydney, Australia.
Do, E. Y-L. (1995). What is in a diagram that a computer should understand. The Global Design Studio: Proc. 6th International Conference on Computer Aided Architectural Design Futures, Singapore: National University of Singapore, pp. 469-482.
Do, E. Y-L. & Gross, M. D. (1996). Drawing as a means to design reasoning. In N. H. Narayanan & J. Damski, (Eds.), Proc. AID’96 Workshop on Visual Representation, Reasoning and Interaction in Design, Key Center for Design Computing, University of Sydney.
Do, E. Y-L. & Gross, M. D. (1997). Thinking with diagrams in architectural design.
Discussion paper prepared for the Thinking with Diagrams Workshop, Portsmouth, England, available at http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Dori, D. (1992). Dimensioning analysis: Toward automatic understanding of engineering drawings. Communications of the ACM, 35(10), pp. 92-103.
Engelhardt, Y., Bruin, J., Janssen, T. & Scha, R. (1996). The visual grammar of information graphics. In N. H. Narayanan & J. Damski, (Eds.), Proc. AID’96 Workshop on Visual Representation, Reasoning and Interaction in Design, Key Center for Design Computing, University of Sydney.
Finke, R. A. (1990). Creative Imagery: Discoveries and Inventions in Visualization, Cambridge, MA: MIT Press.
Forbus, K. (1995). Qualitative spatial reasoning: Framework and frontiers. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press, pp. 183-204.
Funt, B. V. (1980). Problem solving with diagrammatic representations. Artificial Intelligence, 13, pp. 201-230.
Gelernter, H. (1963). Realization of a geometry theorem proving machine. In E. A. Feigenbaum & J. Feldman, (Eds.), Computers and Thought, New York: McGraw Hill, pp. 134-152.
Gelernter, H., Hansen, J. R. & Loveland, D. W. (1963). Empirical explorations of the geometry theorem proving machine. In E. A. Feigenbaum & J. Feldman, (Eds.), Computers and Thought, New York: McGraw Hill, pp. 153-163.
Glasgow, J., Narayanan, N. H. & Chandrasekaran, B. (Eds.) (1995). Diagrammatic Reasoning: Cognitive and Computational Perspectives. Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press.
Glinert, E. P. (1990). Visual Programming Environments, Vol. I: Paradigms and Systems, Vol. II: Applications and Issues, Los Alamitos, CA: IEEE Computer Society Press.
Goel, V. (1995). Sketches of Thought, Cambridge, MA: MIT Press.
Goldschmidt, G. (1991). The dialectics of sketching. Creativity Research Journal, 4(2), pp.
123-143.
Gombrich, E. H. (1968). Art and Illusion: A Study in the Psychology of Pictorial Representations, London: Phaidon.
Gooding, D. (1996). Scientific discovery as creative exploration: Faraday’s experiments. Creativity Research Journal, 9(2/3), pp. 189-205.
Gooding, D. & James, F. J. L. (Eds.) (1985). Faraday Rediscovered: Essays on the Life and Work of Michael Faraday, 1791-1867, London: Macmillan.
Goodman, N. (1969). Languages of Art: An Approach to a Theory of Symbols, London: Oxford University Press.
Gross, M. D. (1995). Indexing visual databases of designs with diagrams. In A. Koutamanis, H. Timmermans, & I. Vermeulen (Eds.), Visual Databases in Architecture, Aldershot, UK: Avebury, pp. 1-14.
Gross, M. (1996). The electronic cocktail napkin: Computer support for working with diagrams. Design Studies, 17(1), pp. 53-69.
Gurr, C. A. (1997). On the isomorphism (or otherwise) of representations. In K. Marriott and B. Meyer (Eds.), Visual Language Theory, Berlin: Springer-Verlag.
Hammer, E. (1995). Logic and Visual Information. Studies in Logic, Language & Computation, Palo Alto, CA: CSLI Publications, Stanford University.
Hegarty, M. (1992). Mental animation: Inferring motion from static displays of mechanical systems. Journal of Experimental Psychology: Learning, Memory & Cognition, 18(5), pp. 1084-1102.
Hix, D. & Hartson, H. R. (1993). Developing User Interfaces: Ensuring Usability Through Product & Process, New York: John Wiley & Sons, Inc.
Hübscher, R. (1997). Visual constraint rules. Journal of Visual Languages and Computing, in press.
Huttenlocher, J. (1968). Constructing spatial images: A strategy in reasoning. Psychological Review, 75(6), pp. 550-560.
IEEE. (serial). Proceedings of the IEEE Annual Symposium on Visual Languages, Los Alamitos, CA: IEEE Computer Society Press.
Joseph, S. H. & Pridmore, T. P. (1992). Knowledge-directed interpretation of mechanical engineering drawings. IEEE Trans. on Pattern Analysis and Machine Intelligence, 14(9), pp.
928-940. JVLC. (serial). Journal of Visual Languages and Computing, London: Academic Press. Kim, M. Y. (1989). Visual reasoning in geometry theorem proving. Proc. 11th International Joint Conference on Artificial Intelligence, Mountain View, CA: Morgan Kaufmann, pp. 16171622. Koedinger, K. R. & Anderson, J. R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14, pp. 511-550. Kosslyn, S. M. (1980). Image and Mind, Cambridge, MA: Harvard University Press. Kosslyn, S. M. (1981). The medium and the message in mental imagery: A Theory. Psychological Review, 88(1), pp. 46-66. Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate, Cambridge, MA: MIT Press. Kulpa, Z. (1994). Diagrammatic representation and reasoning. Machine Graphics and Vision, 3(1/2), pp. 77-103. Larkin, J. H. & Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words. Cognitive Science, 11, pp. 65-99. 30 Lawrence, A. W., Badre, A. M. & Stasko, J. T. (1994). Empirically evaluating the use of animations to teach algorithms. Proc. IEEE Symposium on Visual Languages, Los Alamitos, CA: IEEE Computer Society Press, pp. 48-54. Lindsay, R. K. (in press). Using diagrams to understand geometry. Computational Intelligence. Lohse, G. I., Biolsi, K., Walker, N. & Rueler, H. H. (1994). A classification of visual representations. Communications of the ACM, 37(12), pp. 36-49. Lowe, R. K. (1989). Search strategies and inference in the exploration of scientific diagrams. Educational Psychology, 9, pp. 27-44. Lowe, R. K. (1993). Constructing a mental representation from an abstract technical diagram. Learning and Instruction, 3, pp. 157-179. Lowe, R. K. (1994a). Selectivity in diagrams: Reading beyond the lines. Educational Psychology, 14, pp. 467-491. Lowe, R. K. (1994b). Diagram prediction and higher order structures in mental representation. Research in Science Education, 24, pp. 208-216. Lowe, R. K. (1996a). 
Background knowledge and the construction of a situational representation from a diagram. European Journal of Psychology of Education, 11, pp. 377-397. Lowe, R. K. (1996b). Interactive animated diagrams: What information is extracted? Proc. Using Complex Information Systems Symposium, University of Poitiers, Poitiers, France, pp. 4045. Marr, D. & Nishihara, H. K. (1978). Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society, Vol. B 200, pp. 269-294. Marriott, K. & Meyer, B. (Eds.) (1997). Visual Language Theory, Berlin: Springer-Verlag. Marriott, K. Meyer, B. & Wittenberg, K. (1997). A survey of visual language specification and recognition. In K. Marriott and B. Meyer (Eds.), Visual Language Theory, Berlin: SpringerVerlag. Mayer, R. E. & Sims, V. K. (1994). For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning. Journal of Educational Psychology, 86, pp. 389-401. McDougal, T. F. & Hammond, K. J. (1993). Representing and using procedural knowledge to build geometry proofs. Proc. 11th National Conference on Artificial Intelligence, AAAI’94, Palo Alto, CA: AAAI Press. Mullet, K & Sano, D. (1995). Designing Visual Interfaces. SunSoft Press, Englewood Cliffs, NJ: Prentice Hall PTR. Narayanan, N. H. (Ed.) (1992). Proc. AAAI Spring Symposium on Reasoning with Diagrammatic Representations, AAAI Technical Report SS-92-02, Menlo Park, CA: AAAI Press. Narayanan, N. H. (Ed.) (1993). Special issue on computational imagery. Computational Intelligence, 9(4). Narayanan, N. H. & Chandrasekaran, B. (1991). Reasoning visually about spatial interactions. Proc. 12th International Joint Conference on Artificial Intelligence, Mountain View, CA: Morgan Kaufmann, pp. 360-365. Narayanan, N. H. & Hübscher, R. (1997). Visual language theory: Toward a human-computer interaction perspective. In K. Marriott and B. Meyer (Eds.), Visual Language Theory, Berlin: Springer-Verlag. 
Narayanan, N. H. & Hegarty, M. (1997). On designing comprehensible interactive hypermedia manuals. Under review. Narayanan, N. H., Suwa, M. & Motoda, H. (1994a). How things appear to work: Predicting behaviors from device diagrams. Proc. 12th National Conference on Artificial Intelligence, Menlo Park, CA: AAAI Press, pp. 1161-1167. 31 Narayanan, N. H., Suwa, M. & Motoda, H. (1994b). A study of diagrammatic reasoning from verbal and gestural protocols. Proc. 16th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 652-657. Narayanan, N. H., Suwa, M. & Motoda, H. (1995a). Diagram-based problem solving: The case of an impossible problem. Proc. 17th Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 206-211. Narayanan, N. H., Suwa, M. & Motoda, H. (1995b). Behavior hypothesis from schematic diagrams. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press, pp. 501 -534. Nersessian, N. (1995). How do scientists think? Capturing the dynamics of conceptual change in science. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press, pp. 137-182. Nersessian, N. (1997). Abstraction via generic modeling in concept formation in science. In N . Cartwright and M. R. Jones (Eds.), Correcting the Model: Abstraction and Idealization in Science, Amsterdam: Editions Rodopi. Novak, G. S. & Bulko, W. C. (1993). Diagrams and text as computer input. Journal of Visual Languages and Computing, 4, pp. 161-175. Paivio, A. (1971). Imagery and Verbal Processes, New York: Holt, Rinehart and Winston. Petre, M., Blackwell, A. F. & Green, T. R. G. (1997). Cognitive questions in software visualization. In J. Stasko, J. Domingue, B. Price & M. 
Brown, (Eds.), Software Visualization: Programming as a Multi-Media Experience, Boston, MA: MIT Press, in press. Price, B. A., Baecker, R. M. & Small, I. S. (1993). A principled taxonomy of software visualization. Journal of Visual Languages and Computing, 4(3), pp. 211-266. Pylyshyn, Z. (1981). The imagery debate: analog media versus tacit knowledge. Psychological Review, 88(1), pp. 16-45. Qin, Y. & Simon, H. A. (1995). Imagery and mental models in problem solving. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press, pp. 403-434. Repenning, A. (1995). Bending the rules: Steps toward semantically enriched graphical rewrite rules. Proc. IEEE Symposium on Visual Languages, Los Alamitos, CA: IEEE Computer Society Press, pp. 226-233. Repenning, A. & Sumner, T. (1995). Agentsheets: A medium for creating domain-oriented visual languages. IEEE Computer, 28, pp. 17-25. Resnick, M. (1996). Beyond the centralized mindset. Journal of the Learning Sciences, 5(1), pp. 1-22. Roskos-Ewoldsen, B., Intons-Peterson, M. & Anderson, R. E. (Eds.) (1993). Imagery, Creativity and Discovery: A Cognitive Perspective, Amsterdam: North Holland. Russell, B. (1923). Vagueness. In J. Slater (Ed.), Essays on Language, Mind, and Matter 19191926, The Collected Papers of Bertrand Russell, London: Unwin Hyman, pp. 145-154. Schwartz, D. L. & Black, J. B. (1996a). Analog imagery in mental model reasoning: Depictive models. Cognitive Psychology, 30, pp. 154-219. Schwartz, D. L. & Black, J. B. (1996b). Shuttling between depictive models and rules: Induction and fallback. Cognitive Science, 20(4), pp. 457-498. Schwartz, D. L. & Hegarty, M. (1996). Coordinating multiple representations for reasoning about mechanical devices. In P. Olivier, (Ed.). 
Cognitive and Computational Models of Spatial 32 Representation, AAAI Spring Symposia Technical Report SS-96-03, Menlo Park, CA: AAAI Press. Shah, P. (1995). Cognitive Processes in Graph Comprehension. Unpublished Doctoral Dissertation, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA. Shin, S -J. (1995). The Logical Status of Diagrams, Cambridge, England: Cambridge University Press. Shneiderman, B. (1983). Direct manipulation: A step beyond programming languages. IEEE Computer, 16(8), pp. 57-69. Shneiderman, B. (1997). Designing the User Interface: Strategies for Effective Human-Computer Interaction, Second Edition, Reading, MA: Addison-Wesley. Sloman, A. (1975). Afterthoughts on analogical representations. Reprinted in R. J. Brachman and H. J. Levesque (Eds.), Readings in Knowledge Representation, San Mateo, CA: Morgan Kaufmann, 1985, pp. 432-439. Smith, D. C. (1977). Pygmalion: A Computer Program to Model and Simulate Creative Thought, Boston, MA: Birkhauser. Smith, D. C., Cypher, A. & Spohrer, J. (1994). Kidsim: Programming agents without a programming language. Communications of the ACM, 37, pp. 54-68. Stasko, J. T., Badre, A. M. & Lewis, C. (1993). Do algorithm animations assist learning? An empirical study and analysis. Proc. INTERCHI’93 Conference on Human Factors in Computing Systems, New York: ACM Press, pp. 61-66. Stenning, K., Cox, R. & Oberlander, J. (1995). Contrasting the cognitive effects of graphical and sentential logic teaching: Reasoning, representation and individual differences. Language and Cognitive Processes, 10(3/4), pp. 333-354. Stenning, K. & Lemon, O. (1997). Diagrams and human reasoning: aligning logical and psychological perspectives. Discussion paper prepared for the Thinking with Diagrams Workshop, Portsmouth, England, available at http://www.mrcapu.cam.ac.uk/personal/alan.blackwell/Workshop.html. Stenning, K. & Oberlander, J. (1995). 
A cognitive theory of graphical and linguistic reasoning: Logic and implementation. Cognitive Science, 19, pp. 97-140. Suwa, M. & Tversky, B. (1997). What do architects and students perceive in their design sketches? A protocol analysis. Design Studies, 18(3), in press. Tessler, S., Iwasaki, Y. & Law, K. (1995). Qualitative structural analysis using diagrammatic reasoning. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press, pp. 711-730. Tufte, E. R. (1983). The Visual Display of Quantitative Information, Graphics Press, Cheshire, CT. Tufte, E. R. (1990). Envisioning Information, Graphics Press, Cheshire, CT. Tufte, E. R. (1997). Visual Explanations, Graphics Press, Cheshire, CT. Tversky, B. (1995). Cognitive origins of graphic productions. In F. T. Marchese (Ed.), Understanding Images: Finding Meaning in Digital Imagery, Springer-Verlag, New York, pp. 29-53. Tye, M. (1991). The Imagery Debate, Cambridge, MA: MIT Press. van Dijk, T. A. & Kintsch, W. (1983). Strategies of Discourse Comprehension, New York: Academic Press. Van Sommers, P. (1984). Drawing and Cognition, Cambridge, England: Cambridge University Press. 33 Vaxiviere, P. & Tombre, K. (1992). Celesstin: CAD conversion of mechanical drawings. IEEE Computer, 25(7), pp. 46-54. Wang, D. (1995). Studies on the Formal Semantics of Pictures. Doctoral Dissertation, Institute for Logic, Language and Computation, University of Amsterdam. Wang, D. & Lee, J. R. & Zeevat, H. (1995). Reasoning with diagrammatic representations. In J . Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press, pp. 339-396. West, T. G. (1991). In the Mind’s Eye, Buffalo, NY: Prometheus Books. Yip, K. M. (1991). Understanding complex dynamics by visual and symbolic reasoning. 
Artificial Intelligence, 51, pp. 179-221. Yuille, J. C. (Ed.) (1983). Imagery, Memory and Cognition: Essays in Honor of Allan Paivio, Hillsdale, NJ: Lawrence Erlbaum. 34