DIAGRAMMATIC COMMUNICATION: A TAXONOMIC OVERVIEW 1

N. Hari Narayanan2
Technical Report CSE97-06
December 2, 1997
Visual Information, Intelligence & Interaction Research Group
Department of Computer Science & Engineering
Auburn University
Alabama 36849-5347
USA
http://www.eng.auburn.edu/department/cse/research/vi3rg/vi3rg.html
1 Appears in B. Kokinov (Ed.), Perspectives on Cognitive Science, Volume 3, New Bulgarian University
Press, Sofia, Bulgaria, pp. 91-122.
2 Author to whom correspondence should be addressed. Email: [email protected]
Abstract
This chapter discusses research on the role of diagrammatic representations in facilitating
communication. Diagrammatic communication is an overarching theme that encompasses not only
diagrammatic representation but also diagrammatic reasoning and diagrammatic interaction between
humans and computers. Interdisciplinary research on this topic has addressed issues of human
diagram comprehension and diagrammatic reasoning from historical and psychological
perspectives, as well as issues of diagram parsing, inference from diagrams and human-computer
interaction via diagrammatic interfaces from theoretical and computational perspectives. Recently
there has been a resurgence of interest in this topic in cognitive science. My aim in this chapter is to
provide a window on various strands of recent research. The main contribution of this chapter is a
taxonomic description of the multifaceted research on diagrammatic communication, categorized
broadly as theoretical, psychological, and computational. This taxonomy is meant to serve as an
introduction for those who have an interest in the topic and may be contemplating serious
work in the area. In the course of describing a sample of current work in the field, open issues and
fruitful directions for future research are also identified.
1 Introduction
Evidence for humankind’s use of diagrams to represent and communicate information dates back to
the cave drawings of prehistoric times. In fact, visual thinking (Arnheim, 1969; West, 1991) is
considered by many to be a central aspect of intelligence. Diagrammatic representations predate the
development of spoken and written language (Tversky, 1995). Eventually symbolic languages
became the predominant tool for conveying information, but the use of diagrams for information
representation is still widely prevalent in all areas of human intellectual endeavor. The role and
utility of diagrams, and their advantages and disadvantages compared to textual representations,
have fascinated scientists and philosophers from the time of Aristotle and Plato. Recently,
researchers working in the interdisciplinary field of cognitive science have started looking at this
issue from a variety of perspectives including computational and psychological. My aim in this
chapter is to provide a window on these recent inquiries. This chapter, arising from an introductory
tutorial on diagrammatic representations3, is intended more for readers who are interested in the topic
and may be contemplating serious work in the area than for those who are already
familiar with it and are looking for a complete treatment of the state of the art.
Consequently, the chapter provides introductory descriptions of a sample of various strands of
current research on diagrammatic representation, reasoning and interaction. It is not a
comprehensive survey, and therefore it is quite possible that some research literature may have
been unintentionally omitted. But I hope the reader will come away with an appreciation of the
multifaceted research into roles that diagrams play in human intellectual endeavors.
The chapter is structured as follows. I begin by circumscribing the terms diagrams and
diagrammatic representations for the purposes of discussion here, since there is no commonly
accepted technical definition. Cognitive science research on diagrammatic communication is then
classified broadly as theoretical, psychological, or computational. After that, the discussion
plunges into developing subcategories of this taxonomy and describing details of a sample of
corresponding research efforts. In each case, besides an overview of research, a set of references
for further exploration is also provided. The concluding section serves to summarize the main
ideas and discuss open issues for future research.
3 A course presented at the Third International Summer School in Cognitive Science, New Bulgarian University,
Sofia, Bulgaria, July 21 - August 3, 1996.
2 The What and Why of Diagrams
What are diagrams? In other words, what characteristics distinguish diagrams from other forms of
representation? And why are they considered useful? One must address these questions before
ruminating on the utility of diagrams as representational tools, inference aids and facilitators of
human-computer interaction.
A common, if somewhat circular, definition is that diagrams are pictorial representations of
information. This is not entirely accurate since the word “pictorial” carries connotations of
similarity. Pictures typically depict objects in ways similar to their actual appearance. However,
diagrams can represent information in more abstract form than a photograph or a painting. Various
kinds of graphs and charts are good examples of this abstract form of diagrammatic representation.
I will use the terms “diagram” and “diagrammatic representation” interchangeably in this chapter
since all diagrams presumably represent something or the other. A first attempt at defining
diagrammatic representation might be to say that it means “a representation, internal or external, in
which topological, geometric or other visual properties are significant”. But this does not exclude
English sentences as qualifying for the diagrammatic status since symbols of the English alphabet
are distinguishable by virtue of their geometric properties. Similarly, photographs and video
qualify as well even though we do not normally think of these as diagrams. I want to include
maps, pictograms, different kinds of charts, engineering blueprints, architect’s sketches, etc. as
examples of diagrams, while excluding text, photographs, video, etc.
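[Figure 1, panels a-d: (a) the word "CAT"; (b) a hazard-warning icon; (c) a Japanese/Chinese character; (d) a service-station icon]
Figure 1. Diagrammatic Representations?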
So, what is a diagram? What is not a diagram? We may think that we intuitively understand this
distinction. But careful thought will reveal the distinction between what is diagrammatic and
what is not to be a slippery one. On the face of it, I am sure people will agree that Figure 1a
contains a word representing an animal, and that it is not a diagram. Figure 1b depicts a diagram
which is an icon representing a hazard warning, typically found on electrical appliances. Figure 1d
is another diagram which is an icon representing a service station selling petrol or diesel fuel. To
those who know Japanese or Chinese scripts, Figure 1c contains a character
representing the concept of a confined (jailed) person while for those who do not know these
languages it may look like a diagram representing something.
I propose that for a representation to qualify as diagrammatic, it has to satisfy both of the following
properties:
1. It contains one or more elements with semantically significant spatio-visual4
properties or relations.
2. At least one element (or a spatio-visual property of an element or a spatio-visual
relation between elements) of the representation is not visually similar to the
corresponding object (or property of the object or relation between objects) of
the represented.
The first requirement rules out text since spatio-visual properties of elements of text (the alphabetic
symbols) serve only to differentiate them, but are not semantically significant - i.e., do not carry
meaning. The second requirement rules out completely visually similar representations such as
photographs and video. Under this definition, Figure 1a is not diagrammatic because it does not
satisfy 1, whereas Figures 1b, 1c and 1d are diagrammatic. One could furthermore argue that
Figures 1c and 1d are more diagrammatic than Figure 1b. More about this later. I will formalize
this definition in Section 4.1. Now one may define “diagrammatic reasoning” as the process of
comprehending and making inferences from diagrammatic representations, and “diagrammatic
interaction” as the process of using actions on diagrammatic representations for human-computer
communication.
What makes diagrammatic representations fascinating as a subject of study (at least for me) is that
they avoid the complexity, born of richness of detail, of representations that are
completely visually similar to the represented (e.g., photographs), yet
still manage to utilize spatio-visual properties of constituent components in very non-trivial ways to
effectively encode and convey information, and aid reasoning. Researchers working in this area are
motivated by a common interest in how information can be represented, communicated, and used
by humans and computers in ways that take advantage of the spatio-visual properties of diagrams,
and how to characterize the syntax and semantics of diagrammatic representations.
Sentential representations capture and convey information by virtue of the semantics of their
constituent elements (e.g., words) and how these concatenate to form more complex structures
(e.g., sentences). The meaning of such a representation is dependent upon the meanings and
context of constituent elements, but not on their spatio-visual properties (such as location, color,
font or point size). Diagrammatic representations, on the other hand, are not only constructed
out of meaningful and context-sensitive elements; the spatio-visual properties and two-dimensional
configurations of those elements can also be exploited to encode
and convey information. Thus, by moving from sentential to diagrammatic representations, one
can exploit additional dimensions for representing information. This is the root of the representational
leverage that diagrammatic representations provide, and the source of the fascination that such
representations hold. It also opens up an entirely new realm of challenging issues of
comprehension and reasoning. How can these kinds of representations be constructed, used for
communication, comprehended by both humans and machines, and reasoned with? Research in
this area addresses several questions, such as:
- How can information be best represented using diagrams?
- How can diagrammatic representations be automatically generated?
- How do we comprehend and make inferences from diagrammatic representations?
- How can diagram understanding and diagrammatic reasoning be automated?
- How can diagrams be used for programming and interacting with computers?
4 The term spatio-visual is meant to include all visible properties such as color, texture, topological properties,
geometric properties, and other spatial properties.
One approach to answering these questions is to begin by investigating how we, humans, create
and use such representations. For example, one phenomenon that is quite commonplace in
occurrence, but has proved somewhat elusive to scientific analysis, is mental imagery. It is not
difficult for most of us to think of an occasion when we recalled an image in our mind in order to
make an inference - imagining the living room while visiting a furniture store to see if the color and
patterns of furniture will match the wallpaper is a typical example. What kinds of representations
and reasoning processes is the mind using for this? Is the experienced immediacy of imagery a
mere figment of one's imagination, or is there an underlying reality to it - a reality rooted in
spatio-visual mental representations? There is a rich and colorful history of research by philosophers,
psychologists and computer scientists on this. Besides mental imagery, such human-centered
investigations may also include studying how scientists have created and used diagrams in the
course of their investigations leading to major discoveries and inventions, or experimentally
studying how people comprehend, reason with or create diagrams in the course of various problem
solving activities. With the advent and spread of computer graphics and multimedia, there has also
been research within artificial intelligence on how computers can be made to understand, reason
with and generate various kinds of diagrams, and within the field of human-computer interaction
on how diagrammatic representations can help humans communicate more effectively with
computers. Yet another approach has been to investigate the theoretical status
and foundations of diagrammatic representations using tools of logic, algebra, information theory,
etc. There is a rich body of literature on this too. I will discuss these various approaches in the rest
of the chapter.
3 Which Diagrams?
One comes across all kinds of diagrams in all walks of life. There are the simplistic diagrams that
children draw, semantically rich sketches that architects draw, metrically accurate blueprints that
engineers draw, and information laden graphs and charts that scientists draw, to name just a few.
How can one make sense of, and bring some order to, this plethora of diagrammatic
representations? This chapter is too short to even attempt to provide a comprehensive answer.
Nevertheless, let us briefly consider some of the ways in which order can be imposed on diagram
types.
One way of classification is in terms of the spatial dimensions diagrams represent: 1D, 2D and 3D
diagrams. Another way is to look at whether the representation is static - a fixed diagram, or
dynamic - a changing diagram, often called an animation. A third way to categorize diagrams is
based on the intellectual disciplines in which they are used: architectural diagrams, mechanical
blueprints, weather maps, node-and-link diagrams of computer science, and so on. A fourth
approach is to differentiate diagrams based on an abstract-concrete axis. A graph plotting the
fluctuations in the Dow Jones Index over the last ten years will then fall on the abstract side,
whereas the stick figure of a human drawn by a child will fall on the concrete side, although it is
still more abstract than the realistic sketch of a person drawn by an accomplished artist.
Various researchers have tried to characterize and categorize diagrams in various ways. In a
theoretical approach, Engelhardt and colleagues (1996) propose six basic syntactic operations that
use two-dimensional space in different ways to encode information, shown in the table below,
which they claim are sufficient to characterize the syntactic structure of most diagrammatic
representations. In other words, the internal structure of most diagrams can be de-constructed in
terms of these operations.
Syntactic operation   Explanation
random arrangement    use space to separate different entities
pathing               use space to encode topology, typically using connector symbols such as arrows
unordered slotting    dividing space into separate areas and assigning visual elements to these slots
ordered slotting      same as above, with the ordering of slots being semantically significant
sliding               using metric properties of space to represent non-spatial information, e.g., graphs
spatial mapping       using spatio-visual properties to represent spatio-visual aspects
A complementary empirical approach to classification, this time based on the cognitive structure of
diagrammatic representations, is discussed by Lohse and colleagues (1994). They used the ratings
provided by sixteen subjects, using ten rating scales, of sixty sample diagrammatic representations
to arrive at the following classification:
Type                  Explanation
icons                 diagrams with a single intended interpretation, meant to stand as labels of things
graphs                encode quantitative information using position and magnitude of geometric objects
tables                two dimensional arrangements of words, numbers, signs or their combinations
graphic tables        like tables, but use spatio-visual properties such as shading to convey additional information
network charts        use graphical entities to show relationships among components
time charts           use a spatio-visual property to encode temporal data
structure diagrams    spatio-visual properties of these representations express spatio-visual properties of represented objects
process diagrams      use graphical entities and their properties to express dynamic, continuous or temporal relationships and processes
maps                  representations of physical geography
cartograms            spatial maps that show quantitative data
pictures              realistic depictions of the represented
Is there a single taxonomy of diagrams that settles the questions of what is, and what is not, a
diagram, what kinds there are, and allows one to unambiguously categorize any given diagram?
This is an open question. While the syntactic approach appears to be a powerful one for classifying
diagrammatic representations, the claim about its wide applicability remains to be proven. The
empirical approach begins to answer the question of whether and how we cognitively classify
diagrammatic representations, but much more work with larger subject and sample populations
needs to be done to derive a comprehensive taxonomy.
The interested reader may consult the following works that provide deeper forays into this issue.
Gombrich (1968) discusses several psychological issues. Bertin (1981) describes graphics in
terms of variables of the plane (location, texture, color, orientation, etc.) and considers ways in
which information can be mapped to these variables. Goodman (1969) offers both a general
account of representational symbol systems and a theoretical framework for analyzing
diagrammatic representations in terms of syntactic and semantic criteria such as disjointness,
differentiation, density and repleteness.
4 Scope of Research on Diagrammatic Representations
Diagrammatic languages abound in human endeavors, and in some disciplines enjoy prominence
comparable to that of textual languages. Many research areas have developed their own
diagrammatic languages: for instance, mathematicians use commutative diagrams, physicists use
Feynman diagrams, and computer scientists describe data structures with boxes and arrows. Once
one is familiar with such a language, it becomes an extremely efficient tool for communication. While
diagrammatic languages are very useful for people, and to some degree are also used by
computers, it is important to note that these do not replace textual languages. Not even comic strips
get away completely without text. Indeed, what one finds is a spectrum spanning from pure text, to
text illustrated with diagrams, to diagrams annotated with text, and to purely diagrammatic
languages.
The broadest use of diagrammatic representations has been for communicating information, both
among humans and between humans and computers. A graph of a function makes its
characteristics much more explicit than a table of values of the same function even though
informationally both are equivalent. Printed textual descriptions are frequently illustrated with
pictures that serve to exemplify the ideas contained in the text, to provide different representations
of the same information, or to complement what the text describes. In these cases, diagrammatic
views of information allow the viewer to discover relations and characteristics that are often hidden
in a textual representation. The three books by Tufte (1983; 1990; 1997) provide an excellent
treatment of how to effectively use diagrammatic representations to communicate information.
Communicating an idea using a diagrammatic representation requires not only representing the idea
in a diagram by the communicator, but also comprehending the meaning of the diagram by the
receiver. In many cases, when the meaning is not explicit or obvious, this requires reasoning on
the part of the recipient. When the communication is between a human and a computer, it typically
involves what is called human-computer interaction - i.e., explicit actions taken by the human on
the diagrammatic representation displayed on the computer screen to communicate an operation to
the computer, and a manipulation of the diagrammatic representation by the computer to
communicate results of the operation. Diagrammatic communication thus requires generation,
comprehension, reasoning and interaction. Three entities are involved in communication - the
communicator, the recipient, and the diagrammatic representation. The communicator and the
recipient (these roles switch during a discourse) may be cognitive or computational agents.
Processes involved in the computational side are diagram parsing, diagram interpretation, program
execution, and diagram generation or manipulation to convey results of execution. Processes
involved in the cognitive side are diagram perception, comprehension, inference, and diagram
generation or manipulation to convey results of inference. Figure 2 shows a model of diagrammatic
communication between a computational and a cognitive agent. The utility of a diagrammatic
representation in this model rests on two criteria: its computational tractability and cognitive
effectiveness. A similar model applies to the case of human-human communication.
[Figure 2: a Cognitive Agent and a Computational Agent communicate through a shared Diagrammatic Representation. Cognitive side: perception/comprehension, inference, creation/manipulation. Computational side: parsing/interpretation, execution, creation/manipulation.]
Figure 2. Diagrammatic Communication
Research in this area can be broadly classified based on the communication model above. At the
top level, research can be divided into three categories: (1) theoretical investigations of the nature of
diagrammatic representations, (2) investigations of the cognitive processes - perception,
comprehension, reasoning, generation and manipulation, and (3) investigations of the
computational processes - parsing, interpretation, compilation, execution, generation, and
manipulation. Thus, one can begin to taxonomize research on diagrammatic communication5 as
shown in Figure 3. The following subsections describe each category in more detail.
[Figure 3: Diagrammatic Communication Research branches into (1) Nature of diagrammatic representations, (2) Psychological aspects of diagrammatic communication, (3) Computational aspects of diagrammatic communication]
Figure 3. Primary Levels of a Taxonomy for Diagrammatic Communication Research
5 As should be clear by now, the term diagrammatic communication includes representation, reasoning and
interaction.
Kulpa, in the only published survey of the field (1994), provides an introductory treatment of the
field’s origins, rationale and basic ideas, along with an extensive set of references covering a
number of related areas. This chapter complements that survey by emphasizing cognitive aspects
and including more recent material. Moreover, it aims to provide, for the first time, a research
taxonomy for the field. A well founded taxonomy can serve at least two purposes. It can provide a
concise picture of the central issues of a field by facilitating the characterization of existing work,
and spur new research by revealing areas that are sparsely covered and therefore ripe for research.
My aim is to propose a preliminary taxonomy and open it to further discussion and elaboration. In
fact, the most important feature of any taxonomy is that it will be extended and revised as the field
progresses. Such a taxonomy is also useful for providing a shared vocabulary for discussions
and relative comparisons of various research efforts in the area. All of these can contribute to
developing a better understanding not only of where we are in terms of current research, but also
of where we ought to be going.
4.1 Theoretical Research
Research on the fundamental nature of diagrammatic representations includes efforts to define such
representations and differentiate them from other representations; to characterize their syntax and
semantics, that is, how they encode information using graphical primitives and their spatio-visual
properties, and what relations hold between the representations and the represented; to analyze
formal properties of specific diagrammatic representational systems; and to formalize
human-computer interaction through diagrams. Figure 4 shows the corresponding
secondary levels of the research taxonomy.
[Figure 4: Nature of diagrammatic representations branches into (1) Defining and differentiating diagrammatic representations, (2) Characterizing diagrammatic representations, (3) Analyzing diagrammatic representational systems, (4) Formalizing diagrammatic interactions]
Figure 4. Secondary Levels of a Taxonomy for Diagrammatic Communication Research
Defining and differentiating diagrammatic representations
The discussion in Sections 2 and 3 has already touched upon this issue. While all of us presumably
have an intuitive idea of what a diagram is, precisely defining what is and what is not a diagram
turns out to be difficult since there is a wide variety of two-dimensional representations in use that
are considered to be diagrams. One may define diagrammatic representations by specifying what is
not a diagrammatic representation: all representations on a two-dimensional medium (e.g., paper,
cathode ray tube) that are neither true depictions of their referents (as video and photographs are)
nor concatenations of abstract symbols belonging to some alphabet whose meaning derives solely
from individual symbols or groups of symbols (as text is) may be called diagrammatic.
Other characterizations have been proposed. Russell (1923) captures a central distinction between
diagrammatic and sentential representations thus:
“There is a complication about language...namely that words which mean relations
are not themselves relations...a map...is superior to language, since the fact that
one place is to the west of another is represented by the fact that the corresponding
place on the map is to the left of the other; that is to say, a relation is represented by
a relation.”
Sloman (1975) characterizes analogical representations, a kind of diagrammatic representation, as
follows:
“If R is an analogical representation of T, then there must be parts of R representing
parts of T,... and it must be possible to specify some sort of correspondence,
possibly context-dependent, between properties or relations of parts of R and
properties and relations of parts of T.”
What is worthy of note in this definition is that it is precisely the nature of this correspondence that
differentiates different kinds of diagrammatic representations. Another definition is proposed by
Stenning and Lemon (1997):
“A diagrammatic representation is a planar structure in which representing tokens
are objects whose mutual spatial relations are directly interpreted as relations in the
target structure.”
The working definition I will use for this chapter is a formalization of the one presented in Section
2. A pictorial representation $R$ consists of a set of graphical primitives $P$, and a set of spatio-visual
relations $V^n$ defined over one or more graphical primitives. Spatio-visual properties of individual
primitives, such as position, orientation, size, shape, color, texture, etc., are considered as unary
relations. Thus, $R = \{P, V^n\}$. The key aspect of any representation is what is being represented. It
is typically a state $S$ of the world consisting of a set of entities $E$, and a set of relations $R^n$ defined
over one or more entities (again, attributes of individual entities are unary relations), i.e.,
$S = \{E, R^n\}$ and $R$ represents $S$. $R$ is a diagrammatic representation $DR$ if
1. $\exists\, p \in P$ such that a spatio-visual relation in which $p$ participates, $v_p \in V^n$,
represents an entity or relationship in $S$, and
2. $\exists\, p \in P$ or $v_p \in V^n$ representing an entity or relationship in $S$, such that it is not
visually analogous to its referent.
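The two conditions above can be sketched in code. What follows is only an illustrative Python sketch, not part of the formal definition: the class names `Primitive` and `VisualRelation`, the `referent` and `analogous` fields, and the three example encodings (a street map, the word "CAT", a photograph) are all assumptions introduced here, and deciding whether an element "looks like" its referent is left to whoever builds the encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Primitive:
    """A graphical primitive p in P (hypothetical encoding)."""
    name: str
    referent: Optional[str] = None  # what it stands for in S, if anything
    analogous: bool = True          # does it look like its referent?


@dataclass(frozen=True)
class VisualRelation:
    """A spatio-visual relation in V^n; unary relations are properties."""
    name: str
    participants: Tuple[str, ...]   # names of the primitives it relates
    referent: Optional[str] = None
    analogous: bool = True


def is_diagrammatic(primitives, relations):
    """Return True if both conditions of the working definition hold."""
    primitives, relations = list(primitives), list(relations)
    names = {p.name for p in primitives}
    # Condition 1: some relation in which a primitive participates
    # represents something in the world.
    cond1 = any(
        v.referent is not None and any(n in names for n in v.participants)
        for v in relations
    )
    # Condition 2: some meaning-carrying primitive or relation is not
    # visually analogous to what it represents.
    cond2 = any(
        x.referent is not None and not x.analogous
        for x in primitives + relations
    )
    return cond1 and cond2


# A map: dots do not look like cities, but left-of carries "west of".
street_map = (
    [Primitive("dot_a", "city A", analogous=False),
     Primitive("dot_b", "city B", analogous=False)],
    [VisualRelation("left_of", ("dot_a", "dot_b"), "west of")],
)
# Text: letter shapes only differentiate symbols and carry no meaning.
word = (
    [Primitive("C"), Primitive("A"), Primitive("T")],
    [VisualRelation("shape_of_C", ("C",))],
)
# A photograph: everything that carries meaning is visually analogous.
photo = (
    [Primitive("cat_region", "cat")],
    [VisualRelation("fur_color", ("cat_region",), "fur color")],
)

print(is_diagrammatic(*street_map), is_diagrammatic(*word), is_diagrammatic(*photo))
```

Under these assumed encodings only the map qualifies: the word fails the first condition and the photograph fails the second, mirroring the informal argument of Section 2.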
The creator of a diagrammatic representation has available different kinds of graphical primitives
such as geometric elements, a variety of spatio-visual properties of individual primitives such as
shape, color, texture, location, etc., and a number of spatio-visual relations such as adjacency,
connectedness, etc., that can all be used to carry meaning. There are a number of studies that have
attempted to enumerate and classify visual representations (Bertin, 1983; Lohse et al, 1994), and
prescribe systematic ways of encoding information using $P$ and $V^n$ (Bertin, 1981; Engelhardt et al,
1996). Nevertheless, a comprehensive categorization of diagrammatic representations and
enumeration of the various ways in which their spatio-visual properties and relations can be used to
encode information is still lacking. This, I believe, is a very fruitful area for future research.
Characterizing diagrammatic representations
Diagrammatic representations represent states of the world. In other words, a state of the world,
describable in terms of a set of entities and relations, can be mapped to a corresponding
diagrammatic representation consisting of graphical primitives and spatio-visual relations. The
domain, range and type of this mapping are important. It may be one-to-one, one-to-many, many-to-one
or many-to-many. One-to-one $E \to P$ and $R^n \to V^n$ mappings are common. For example,
algorithm animation systems that show the inner workings of algorithms using animated diagrams
(Brown & Sedgewick, 1985) map data items to geometric primitives such as circles, size of data
items to areas of the primitives, and relative positions of data items in a data structure to relative
spatial locations of the corresponding graphical primitives. Other kinds of mappings may be
difficult to comprehend without training unless based on established conventions. Venn diagrams
represent a many-to-one mapping since a single circle represents a set of many elements in the
world. Network and tree diagrams use an $R^n \to P$ mapping: the “connected” relation is represented
by a graphical primitive, typically a line.
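As a small illustration of the one-to-one mapping used in algorithm animation, the following Python sketch maps each data item of a list to one circle, its value to the circle's area, and its index to a horizontal position. It is not taken from any actual animation system: the `Circle` type, the `layout` function and the `spacing` and `area_per_unit` parameters are hypothetical, and the values are assumed to be non-negative.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    x: float
    y: float
    radius: float


def layout(items, spacing=10.0, area_per_unit=1.0):
    """Map each data item to exactly one circle.

    Index in the list -> x position (relative order is preserved).
    Value of the item -> area of the circle (assumes value >= 0).
    """
    return [
        Circle(x=i * spacing, y=0.0,
               radius=math.sqrt(area_per_unit * value / math.pi))
        for i, value in enumerate(items)
    ]


circles = layout([4, 1, 9])
for c in circles:
    print(c)
```

Because every item yields its own circle and the spatial order follows the list order, both the $E \to P$ and the $R^n \to V^n$ parts of this mapping are one-to-one in the sense described above.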
Properties of different kinds of $S \to DR$ mapping are examined by Gurr (1997). This work
provides a very promising foundation for characterizing diagrammatic representations based on the
different kinds of correspondences between parts and relations of the representation and the
represented that Sloman (1975) alluded to. When one is diagrammatically representing states of a
world, there are two mappings of interest: world-to-representation and representation-to-world.
Consider the first mapping in which states S = { E, Rn} of the world are mapped to diagrammatic
representations DR = {P, Vn}. This mapping O is homomorphic iff ™ RDRn ™ <e 1, ... ,en>DEn:
<e 1, ... ,en>DR iff <O(e 1)DP, ..., O(en)DP> D (O(R)DVn). For example, consider the world of
integers with the binary less-than relation defined over them. We map this world to diagrammatic
representations in which integers are represented by themselves and the binary less-than relation is
mapped to arrows. Then S = {{1,2,3}, {{(1,2),(2,3),(1,3)}}} will be mapped to DR = {{1,2,3,
A}, q} with the constraint that any two integers will be connected by an arrow going from the
smaller to the larger, as shown in Figure 5. This is a homomorphic mapping. The mapping O is
one-to-one if (™ e1,e2DE: (O(e1)=p1 and O(e2)=p1) ‰ e1= e2) and (™ r1,r2D Rn: (O(r1)=v1 and
O(r2)=v1) ‰ r1= r2). In other words, distinct entities and relations of the world are not mapped to
the same graphical primitive or relation, or every graphical primitive or relation in the DR
represents at most one entity or relation in the world. The mapping of integers above is one-to-one.
The mapping O is onto if (™ pDP, š eDE: O(e)=p) and (™ vD V n, š rD Rn: O(v)=r). In other
words, if every graphical primitive and visual relation in the DR represents an entity or relation in
the world, the mapping is onto. The mapping in the previous example is onto. The mapping O is
isomorphic if it is homomorphic, one-to-one, and onto. The previous example therefore illustrates
an isomorphic world-to-representation mapping. These same notions apply to the reverse
mapping. That is, a representation-to-world mapping may be homomorphic, one-to-one, onto, or
isomorphic. The representation-to-world mapping of the DR in Figure 5a is also isomorphic. Gurr
(1997) calls a DR lucid if the corresponding $S \to DR$ mapping is one-to-one, sound if this mapping
is onto, laconic if the corresponding $DR \to S$ mapping is one-to-one, and complete if this mapping
is onto.
Figure 5. An isomorphic DR (nodes 1, 2 and 3, joined by arrows from smaller to larger)
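For a finite world, the properties defined above can be checked mechanically. The following Python sketch is an illustration only and is not taken from Gurr (1997): the function names and the dictionary encoding of the mapping are assumptions, and for simplicity an arrow from p to q is recorded as the pair (p, q) in a single visual relation rather than as a separate graphical primitive.

```python
from itertools import product

def is_homomorphic(entities, relation, mapping, visual_relation):
    """A pair is in the world relation iff its image is in the visual relation."""
    return all(
        ((a, b) in relation) == ((mapping[a], mapping[b]) in visual_relation)
        for a, b in product(entities, repeat=2)
    )

def is_one_to_one(mapping):
    """Distinct entities are never mapped to the same primitive."""
    return len(set(mapping.values())) == len(mapping)

def is_onto(mapping, primitives):
    """Every primitive in the diagram represents some entity."""
    return set(primitives) <= set(mapping.values())

def is_isomorphic(entities, relation, mapping, primitives, visual_relation):
    """Homomorphic, one-to-one and onto, all at once."""
    return (is_homomorphic(entities, relation, mapping, visual_relation)
            and is_one_to_one(mapping)
            and is_onto(mapping, primitives))

# The integer example: integers represent themselves (hypothetical encoding).
entities = {1, 2, 3}
less_than = {(1, 2), (2, 3), (1, 3)}
mapping = {1: 1, 2: 2, 3: 3}
primitives = {1, 2, 3}
arrows = {(1, 2), (2, 3), (1, 3)}

print(is_isomorphic(entities, less_than, mapping, primitives, arrows))  # True
```

Collapsing two integers onto one primitive, for example `{1: 1, 2: 1, 3: 3}`, makes `is_one_to_one` fail, which corresponds to a DR that is not lucid in Gurr's terminology.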
A different approach is proposed by Wang and colleagues (Wang, Lee & Zeevat, 1995). They
propose a theoretical construct called a signature morphism, which is a formally derived one-to-one
mapping between the signature of a DR and a state of the world. A signature consists of a set of
types with a partial order, a set of functions including instances of types and their attributes, and a
set of predicates that specifies relations between instances. The signature morphism is an
$\{S \to DR:\ \text{one-to-one } E \to P \text{ and } R^n \to V^n\}$ mapping.
Another interesting characterization of DRs can be derived from the theoretical framework provided
by Goodman (1969). He defines a character class as an equivalence class of inscriptions.
Compound characters are permitted, so it is possible to view one DR as a character belonging to a
class. A compliance class is an equivalence class of entities in the world whose members are
denoted by members of some character class. It may be said that a character class represents its
compliance class. A language then is a set of character classes and their associated compliance
classes. The following five properties are required of languages that are notational systems:
1. For any inscription belonging to the language, it must belong to at most one character class.
2. There must be some finite difference between inscriptions belonging to different character
classes.
3. All inscriptions of a character denote the same compliance class.
4. No two different character classes may denote the same compliance class.
5. There must be some finite difference between different compliance classes.
The DR in Figure 5 is a notation. Many diagrammatic representational systems in common use
(e.g., many kinds of schematics) can be seen to be notational systems. Analog systems are
non-notational languages that violate properties 2 and 5. Neither the character classes (syntax) nor the
compliance classes (semantics) are finitely distinguishable. Analog systems are syntactically and
semantically dense. Syntactic density implies that it is theoretically possible to find another
character class between any two character classes, and semantic density implies that it is possible to
find another compliance class between any two compliance classes. Maps drawn accurately to
scale (and which therefore permit extrapolation) are an example of an analog diagrammatic
representational system. Consider such a map in which a red dot indicates your current position. If
this map is a character that denotes your current position in the world, and if a different placement
of the dot on the map is another character denoting your position in the world in the past, then it is
possible to find a third character in which the red dot is somewhere between the previous two
locations and which denotes an intermediate position of yours in the world. The notions of
notational and analog diagrammatic systems serve to characterize diagrammatic representations.
The implications of such a characterization for diagrammatic communication (e.g., are analog
systems easier to comprehend? are notational systems easier for computers to deal with? etc.)
provide an excellent open area of research.
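For a finite, discrete language, properties 1, 3 and 4 of the list above can be tested directly. The Python sketch below is a hypothetical illustration: the function name, the dictionary encoding and the schematic-symbol example are all assumptions introduced here. Properties 2 and 5 concern finite differentiation, hold trivially for finite discrete sets, and are therefore not checked.

```python
def check_notational(character_classes, denotes):
    """Check properties 1, 3 and 4 for a finite language.

    character_classes: dict mapping a class name to a non-empty set of inscriptions.
    denotes: dict mapping each inscription to a frozenset of world entities
             (its compliance class).
    """
    names = list(character_classes)
    # Property 1: no inscription belongs to more than one character class.
    disjoint = all(
        character_classes[a].isdisjoint(character_classes[b])
        for i, a in enumerate(names) for b in names[i + 1:]
    )
    # Property 3: all inscriptions of a character denote the same compliance class.
    unambiguous = all(
        len({denotes[ins] for ins in inscriptions}) == 1
        for inscriptions in character_classes.values()
    )
    # Property 4: no two character classes denote the same compliance class.
    class_denotations = [
        frozenset(denotes[ins] for ins in inscriptions)
        for inscriptions in character_classes.values()
    ]
    non_redundant = len(set(class_denotations)) == len(class_denotations)
    return {"disjoint": disjoint, "unambiguous": unambiguous,
            "non_redundant": non_redundant}

# Hypothetical schematic symbols: two drawings of a resistor, one of a capacitor.
character_classes = {
    "resistor": {"zigzag_1", "zigzag_2"},
    "capacitor": {"plates_1"},
}
denotes = {
    "zigzag_1": frozenset({"R1", "R2"}),
    "zigzag_2": frozenset({"R1", "R2"}),
    "plates_1": frozenset({"C1"}),
}
result = check_notational(character_classes, denotes)
print(result)
```

An analog system such as a scale map cannot be encoded this way at all, since its character classes and compliance classes are not finitely distinguishable.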
Sometimes diagrams represent dynamic processes in the world as well. These result in changes of
state, i.e., creation, deletion or modification of entities ($\Delta E$) and of relationships ($\Delta R^n$), i.e., $\Delta S = \{\Delta E, \Delta R^n\}$.
When static diagrams are used to represent state changes, different disciplines use
specific graphic symbols, agreed upon by convention, to represent the change. The use of arrows
to depict motion is an example. Static diagrams are not the only means for representation of
change. One can encode and convey information in dynamic diagrams as well. Dynamic diagrams,
or animations, have been effectively used for a long time by cartoon movie makers to tell stories.
With the availability of computer graphics techniques on personal computers, creating animations
has become much more widely accessible. How can one characterize the semantics of such
dynamic DRs?
The dynamic syntax of a DR specifies how it may be transformed. Such transformations may
include creating, deleting or modifying graphical primitives ($\Delta P$), and changing the attributes of
graphical primitives and the spatio-visual relations between graphical primitives ($\Delta V^n$), in
a DR. The conditions and constraints under which state transitions occur in the world that is being
represented, and how these affect the objects, their attributes and relations, may be mapped to
conditions and constraints under which a DR can be transformed to another and the nature of this
transformation. In other words, $\Delta E$ and $\Delta R^n$ can be mapped to $\Delta P$ and $\Delta V^n$. When the world
being represented is continuously changing, this mapping requires quantization of the continuous
state changes so that a continuous change can be represented by a set of discrete DRs. An example
is the time display employed by digital watches that simulate analog dials using liquid crystal
displays so that the second hand has only 60 meaningful locations on the dial.
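The quantization in the watch example can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the function name and the `positions` parameter are hypothetical, and time is taken to be a continuous number of seconds.

```python
def second_hand_position(seconds, positions=60):
    """Map a continuous time in seconds to one of `positions` discrete dial locations."""
    within_minute = seconds % 60                 # continuous state of the world
    return int(within_minute * positions // 60)  # discrete state of the DR

for t in (0.0, 0.4, 0.99, 1.0, 59.7, 61.5):
    print(t, second_hand_position(t))
```

Every instant between 0 and just under 1 second yields position 0, so this world-to-representation mapping of dynamics is many-to-one: a continuum of world states is represented by a single discrete DR.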
The domain, range and type of this mapping are important. It may be one-to-one, one-to-many,
many-to-one or many-to-many. The dynamic semantics of the DR is captured by this mapping.
Typically this semantics is implicitly captured in the graphical procedures or rewrite rules employed
by the system doing the animation. Repenning (1995) deviates from this practice by considering
how to extend rewrite rules to explicitly capture dynamic semantics. The previously discussed
theoretical notions seem to be applicable to this mapping of dynamics as well. But there is at
present very little research on the dynamic syntax and semantics of DRs.
While typically $\Delta E$ is mapped to $\Delta P$ and $\Delta R^n$ is mapped to $\Delta V^n$, other mappings are possible.
There is no a priori reason for mapping static and dynamic aspects of the world to static and
dynamic aspects of the DR respectively. The static syntax of a DR consists of the graphical
primitives and spatio-visual relations. Its dynamic syntax specifies how DRs may be transformed.
This describes the representations. The represented, the world, includes both states of the domain
in terms of entities and relations, its static semantics, and state transitions, its dynamic semantics.
The central issue for the construction of a DR then is how to represent the static and dynamic
semantics of the world using the static and dynamic syntax of the DR. There are four possible
mappings: (1) $S \to DR$, (2) $\Delta S \to \Delta DR$, (3) $S \to \Delta DR$, and (4) $\Delta S \to DR$. (1) and (2) are
commonly occurring mappings. For example, the CARTOONIST program (Hübscher, 1997)
represents a microworld consisting of moving balls and stationary walls by mapping balls and
walls to circles and rectangles on a computer display, and mapping the motions of balls to the
corresponding motions of circles on the display. One example of (3) is the use of blinking or
flashing graphical objects in graphical user interfaces to attract the user’s attention to some state of
the system. An example of (4) is the use of arrows, static graphical primitives, to denote world
dynamics such as motions or forces. Each of the above four possibilities includes 4 possible
mappings, since $S$, $\Delta S$, $DR$, and $\Delta DR$ are each describable by two sets of items. Each of these
mappings may in turn be one-to-one, one-to-many, many-to-one or many-to-many, providing a
total of $4 \times 4 \times 4 = 64$ possibilities. These analyses provide the beginnings of a theoretical foundation for
characterizing the static and dynamic syntax and semantics of diagrammatic representations. To
learn more about practical, psychological, and theoretical aspects of world $\to$ DR mappings, consult
the following: Bertin (1981), Gombrich (1968), Marriott & Meyer (1997), Tufte (1983; 1990;
1997), and Wang (1995).
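The count of 64 can be verified by enumeration. In the Python sketch below, which is only an illustration, the string labels are assumptions chosen for readability and Δ stands for the change operator: four pairings of static and dynamic aspects, two component sets on the world side, two on the diagram side, and four mapping types.

```python
from itertools import product

aspect_pairings = ["S -> DR", "ΔS -> ΔDR", "S -> ΔDR", "ΔS -> DR"]
world_components = ["entities", "relations"]             # E or R^n (or their changes)
diagram_components = ["primitives", "visual relations"]  # P or V^n (or their changes)
mapping_types = ["one-to-one", "one-to-many", "many-to-one", "many-to-many"]

possibilities = list(product(aspect_pairings, world_components,
                             diagram_components, mapping_types))
print(len(possibilities))  # 64
```

Each element of `possibilities` is one candidate design decision, for example mapping changes of relations in the world to static graphical primitives in a many-to-one fashion.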
Analyzing diagrammatic representational systems
Most areas of human intellectual inquiry have developed their own diagrammatic notations with
corresponding diagramming conventions. Schematic diagrams used in mechanical engineering,
structural diagrams that civil engineers use, diagrammatic forms of the periodic system and
structures of chemical elements, Feynman diagrams used in physics, node-and-link diagrams
popular in computer science, and weather maps are but a few examples of the multitude of such
diagrammatic representational systems in use. Scientists’ use of law encoding diagrams, a formal
diagrammatic representational system useful for encoding laws or principles of a domain using
diagrammatic structures, has been investigated (Cheng, 1996a), and this representational system
has been used in instruction (Cheng, 1996b). Euler’s system of using circles to represent inclusion
of elements in sets, and its role in syllogistic reasoning, is discussed in (Stenning & Oberlander,
1995). A similar notation, Venn diagrams, is thoroughly analyzed by Shin (1995). Systematic
studies of such specific representational systems, aimed at deriving their formal properties and
uncovering their cognitive benefits (their adoption by a community indicates that they must aid
diagrammatic communication in some way or the other), are relatively few. Consequently, this is
another avenue for future research.
Formalizing diagrammatic interaction
Bottoni and colleagues (Bottoni et al., 1997) have begun to formalize the processes of humans
interacting with a computer system through a graphical interface. Their theoretical characterization
is based on the communication perspective that Figure 2 illustrates. They consider each state of the
graphical user interface (appearing as an interactive diagram on the computer screen) as a
component of a sentence of a visual language. Each visual sentence is formally specified as a
4-tuple: the image on the computer screen, a description of what the image means (i.e., a description
of its programmatic implication for the underlying computer system), an interpretation function
from image to description, and a materialization function in the reverse direction. Given this
theoretical framework, visual sentences are characterized in terms of whether components of every
image in every visual sentence can be interpreted in terms of programmatic components, whether
every programmatic component has an associated image component visible on the display, and
whether the user can interact with, and receive feedback from, every image component that is
visible. This leads to a class hierarchy of visual languages for interaction.
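The 4-tuple and the first two of these characterizations can be sketched in Python as follows. This is a loose illustration, not the formalism of Bottoni et al. (1997): the class, field and function names, and the icon example, are all assumptions, and the third characterization (interaction and feedback) is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

@dataclass(frozen=True)
class VisualSentence:
    image: FrozenSet[str]                        # components visible on the screen
    description: FrozenSet[str]                  # programmatic components
    interpret: Callable[[str], Optional[str]]    # image component -> description
    materialize: Callable[[str], Optional[str]]  # description -> image component

def fully_interpretable(vs):
    """Every image component can be interpreted as a programmatic component."""
    return all(vs.interpret(c) in vs.description for c in vs.image)

def fully_visible(vs):
    """Every programmatic component has an image component on the display."""
    return all(vs.materialize(d) in vs.image for d in vs.description)

# Hypothetical interface state with one purely decorative component ("logo").
to_program = {"trash_icon": "delete_command", "folder_icon": "open_directory"}
to_screen = {v: k for k, v in to_program.items()}

sentence = VisualSentence(
    image=frozenset({"trash_icon", "folder_icon", "logo"}),
    description=frozenset(to_program.values()),
    interpret=to_program.get,
    materialize=to_screen.get,
)
print(fully_interpretable(sentence))  # False: "logo" has no programmatic meaning
print(fully_visible(sentence))        # True
```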
4.2 Psychological Research
Psychological research on diagrammatic communication spans a wide variety of investigations.
The cognitive processes involved in the use of DRs are those of perceiving the diagram,
comprehending its meaning, making inferences about the state of affairs depicted by it, and
manipulating the diagram or generating new ones to convey results of reasoning. Recent research
efforts have focused on developing cognitive models of diagram comprehension and diagrammatic
reasoning through experimental investigations of how people understand various kinds of
diagrams, sometimes in conjunction with other kinds of representations such as text, and make
inferences or solve problems. There are also studies of how people generate and manipulate
external diagrams in the course of problem solving activities. Another strand of research has
looked at educational issues and implications of using diagrams in textbooks and instruction. Then
there is historical research, primarily concerned with visual thinking and its reported role in the
history of scientific discoveries. Issues of diagram perception fall within psychological and
physiological research on visual perception, which is excluded from this chapter. Figure 6 shows
the corresponding secondary levels of the research taxonomy.
Psychological aspects
    Mental imagery
    Comprehending and reasoning with diagrammatic representations
    Generating diagrammatic representations
    Educational uses of diagrammatic representations
    Historical investigations of diagram use
Figure 6. Secondary Levels of a Taxonomy for Diagrammatic Communication Research
Mental Imagery
Research on mental imagery provides one foundation for recent research into the psychological
aspects of diagrammatic reasoning with both external and internal diagrams. However, since it is
not directly related to diagrammatic communication, I will not discuss it further. There is an
abundance of resources that the interested reader can tap to learn more about the history of research
on mental imagery, particularly the famous imagery debate about whether analog or propositional
mechanisms underlie the phenomena of imagery (Block, 1981; Cornoldi & McDaniel, 1991;
Finke, 1990; Kosslyn, 1980; 1981; 1994; Paivio, 1971; Pylyshyn, 1981; Roskos-Ewoldsen et al.,
1993; Tye, 1991; Yuille, 1983).
Comprehending and reasoning with diagrammatic representations
How does one comprehend a diagram? Diagrams are typically combined with textual explanations,
as most textbooks of scientific disciplines show. How does one understand such mixed-mode
descriptions? Researchers have examined these fundamental questions in the context of mechanical
device diagrams such as the one shown in Figure 7 (Hegarty, 1992).
Figure 7. A Pulley System
One model of diagrammatic comprehension and reasoning that has emerged from this research
(Narayanan & Hegarty, 1997) postulates that diagram comprehension is a constructive process in
which the individual attempts to use his or her prior knowledge of the domain, information
presented in the diagram, and his or her reasoning skills to build a mental model of the situation or
artifact described in the presented materials. It can be seen as an extension of models of text
processing that view comprehension as the construction of a mental model of the referent of the text
(e.g., van Dijk & Kintsch, 1983). According to this model, comprehending and reasoning with
mechanical device diagrams, possibly with accompanying text, involves the following stages (not
necessarily occurring in the given order): decomposition, recomposition, determination of activity
propagation paths, and dynamization. Details of this model are shown in Figure 8 and explained
below.
Decomposition of the Device's Diagram
Construction of a Static Mental Model
    Making Representational Connections
    - between representations of visual elements and prior knowledge about components
      depicted by the elements
    - between representations of different components
    Making Referential Connections
    - between verbal and visual elements in external displays with the same referent
    - between visual elements in external displays with the same referent
    - between elements in external displays and internal representations of their referents
Determination of Causal Activity Propagation Paths in the Device
Construction of a Dynamic Mental Model
    Mental Animation of Static Model
    Rule-based Inference of Component Behaviors
Figure 8. A Model of Diagrammatic Comprehension and Reasoning
Decomposition. Decomposition involves parsing the diagram into its elementary units. Diagrams
of mechanical devices are made up of elementary shapes, such as rectangles, circles and cylinders,
which represent objects such as pistons, gears and tubes. The first step in comprehension is to
parse the connected diagram into these elementary shapes, i.e., units that correspond to objects.
This process is analogous to identifying discrete words and clauses in a continuous speech sound
and probably relies largely on perceptual mechanisms of object recognition (Biederman, 1987; Marr
& Nishihara, 1978).
Recomposition. This involves constructing a static mental model of the referent of the diagram
(e.g., the device that the diagram represents) by making appropriate representational and referential
connections in memory. There are two types of representational connections: connections to prior
knowledge and connections to the representations of other machine components (Mayer & Sims,
1994), and there are referential connections among elements of the external and internal
representations.
Connections to prior knowledge. First, one must recognize the components, that is, make
representational connections between the identified diagrammatic elements and prior knowledge
about their real-world referents - a process analogous to lexical access in language comprehension.
For example, one might represent that a rectangle stands for a piston. Prior knowledge can also
provide additional information about components, such as what these are typically made of and if
they are rigid or flexible. This information is valuable in making inferences about how components
move and constrain each other’s behaviors.
Connections to the representation of other machine components. Second, one must internally
represent the spatial relations (indicated in the diagram) between different device components by
building connections between the internal representations of these components. In understanding
how a device works, information about the spatial relations between mechanical components forms
a basis for inferences about the motions of components, because these spatial relations influence
how components affect and constrain each other’s motions. Knowledge of spatial relations also aids
in guiding the reasoning process along the chain of causality in the device (Hegarty, 1992;
Narayanan, Suwa & Motoda, 1994b).
Making referential connections. When diagrams are accompanied by text (as is usually the case) an
additional stage in comprehension is that of resolving coreference between the two media, i.e.,
making referential links between a noun phrase in the text (e.g., "the piston") and the diagrammatic
unit that depicts its referent (e.g., a rectangle) (Novak & Bulko, 1993). This step is crucial to
constructing an integrated representation of the common referent of the text and diagram in memory
as opposed to separate surface-level representations of the text and diagram. Making referential
connections is also a necessary process when viewers have to integrate information from multiple
diagrams of the same device (e.g., two schematic diagrams showing two different cross sections of
the same device) or to construct an internal 3-dimensional representation of the device from
diagrams showing different perspective views. Another kind of referential connection associates
elements of the external representations with the corresponding elements of one’s internal
representations.
Determining the causal propagation of activities in the device. The previous stages
help create a static understanding of the device that the diagram represents. However, if one is
asked to predict how such a device operates, it triggers diagrammatic reasoning to infer the device’s
dynamics and kinematics. It has been found that people tend to reason about a device’s operation,
using its diagram, along the paths of causal propagation in the device (Hegarty, 1992; Narayanan,
Suwa & Motoda, 1994b). Therefore, a stage of identifying the potential causal chains of events in
the operation of the device seems necessary for successful diagrammatic reasoning.
Dynamization. This is a word I coined to denote the process of converting the static mental model
constructed as a result of diagram comprehension into a dynamic one, i.e., incorporating inferences
about the operation of the device into the existing mental model. This is accomplished by inferring
and integrating the dynamic behaviors of individual components. This process involves both
mental visualization of the component behaviors and rule-based inference. Cognitive and
computational models (Hegarty, 1992; Narayanan, Suwa & Motoda, 1994a; 1994b) suggest that
this is an incremental process (as shown in Figure 9) in which the reasoner considers the
components or subsystems individually, assesses the influences acting on each, infers the resulting
behavior of each, and then proceeds to consider how this behavior affects the next component or
subsystem in the causal chain.
1. Select the most recent hypothesis about a component's behavior from working memory.
2. Retrieve prior knowledge, if available, about the component and its hypothesized behavior.
3. Scan the diagram to retrieve information about spatial relations that exist between this and
other components.
4. Generate new hypotheses ABOUT which other components are affected by this component, BY
- rule-based inference applied to prior knowledge and spatial relations between components
- internally visualizing the component behavior
5. Add the new hypotheses to working memory.
Figure 9. Diagrammatic Reasoning about Device Behaviors
Inferring the motion of each component can involve either rule-based reasoning processes that
utilize prior knowledge or mental animation processes that detect component interactions from a
mental visualization of component behaviors. In some cases, a rule of mechanical reasoning is
available because it has been explicitly learned, or generalized from a series of trials in which
simulation processes were used (Schwartz & Black, 1996a). An example of such a rule is that every
other gear in a gear chain turns in the same direction. In other cases, no such rules are available,
and inference of component behavior is based on a mental simulation of the component behaviors
(Narayanan, Suwa & Motoda, 1995b; Schwartz & Black, 1996b). As a result, constructing a
dynamic mental model of a device is best thought of as a hybrid reasoning process in which people
can use either rule-based or imagery-based inference processes, depending on their knowledge of
relevant rules of mechanical inference, and beliefs about the efficacy of the two types of reasoning
processes in a particular situation (Schwartz & Hegarty, 1996).
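The contrast between stepwise inference along the causal chain and a learned shortcut rule can be illustrated with the gear example. The Python sketch below is not a cognitive model from the cited work: the function names are hypothetical, and the local rule that two meshed gears turn in opposite directions is an assumption supplied here to drive the stepwise inference.

```python
OPPOSITE = {"clockwise": "counterclockwise", "counterclockwise": "clockwise"}

def propagate_rotation(num_gears, first_direction="clockwise"):
    """Infer each gear's direction one component at a time along the chain."""
    if num_gears < 1:
        return []
    working_memory = [first_direction]            # hypothesis about the first gear
    for _ in range(num_gears - 1):
        latest = working_memory[-1]               # most recent hypothesis
        working_memory.append(OPPOSITE[latest])   # assumed rule: meshed gears oppose
    return working_memory

def shortcut_rule(index, first_direction="clockwise"):
    """Learned rule: every other gear turns in the same direction (index from 0)."""
    return first_direction if index % 2 == 0 else OPPOSITE[first_direction]

directions = propagate_rotation(5)
print(directions)
print(all(shortcut_rule(i) == d for i, d in enumerate(directions)))  # True
```

The stepwise version visits every gear, whereas the shortcut answers a question about any single gear directly, which is one way to picture why a reasoner who knows the rule need not simulate the whole chain.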
One implication of this model is that prior knowledge is critical to accurate diagrammatic reasoning.
Lowe (1994b) found that prior knowledge, such as the level of domain expertise and knowledge
about diagramming conventions, determined the difference between expert and novice performance
in a diagrammatic prediction task. Having relevant knowledge is not sufficient; one must also be
able to retrieve and apply it appropriately during diagrammatic reasoning. Narayanan and
colleagues (Narayanan, Suwa & Motoda, 1994b) found that visual elements of a diagram help cue
and retrieve relevant prior knowledge from long term memory during diagrammatic reasoning.
Koedinger and Anderson (1990) propose that prior knowledge in the form of diagram patterns aids
experts in solving geometry proof problems stated in terms of diagrams.
This model suggests that the spatio-visual primitives and relations in a good diagrammatic
representation should be explicit and recognizable. There should not be elements that are not
relevant - i.e., distracting details should be omitted. In addition, configurations of primitives in a
diagram must be amenable to easy and unambiguous decomposition. The reverse mapping from $P$
and $V^n$ of the DR to $E$ and $R^n$ in the world should be obvious or easily learnable. Furthermore, this
model indicates the potential cognitive utility of meaningful animations. Narayanan and Hegarty
(1997) describe a project aimed at developing guidelines based on this model for the design of
hypermedia information presentation systems in the domain of explaining how machines work.
This project takes research on diagrammatic reasoning into the domain of human-computer
interaction by proposing how one can apply what is known about diagrammatic reasoning to the
design of interactive hypermedia information presentation systems. The model above applies to
concrete diagrams that depict components of the object being represented, so it is not applicable to
the comprehension of abstract diagrams. Graphs are a particularly abstract type of diagram, and it has
been suggested that comprehending graphs is a serial and incremental process similar to text
comprehension (Shah, 1995).
Generating diagrammatic representations
Most of us have drawn doodles - freehand sketches - during the course of thinking about some
issue or solving a problem. For most designers, whether they be artists, architects or engineers,
these sketches are not mere idle creations. They are external manifestations of the cognitive
processes involved in synthesis, and play a significant role in the creative process. Research on
this facet of diagrammatic reasoning, namely, how drawing aids the creative process, is so far
mostly confined to studies of architects and graphic designers.
From a study of twelve professional designers, Goel (1995) concludes that the syntactic and
semantic density of freehand sketches helped designers in fluidly transforming elements of their
sketches and correspondingly facilitated transformations of their design ideas. In other words,
freehand sketching aids creative designing. Suwa and Tversky (1997) studied architects to find that
they perceive and think about three kinds of information during design sketching: spatial relations
among elements, emergent properties of elements and functional implications. Moreover, the
design process itself proceeded in cycles involving two distinct kinds of cognitive activity:
opportunistically shifting attention to a new design topic or consecutively exploring a series of
related issues. Expert architects were better able to exploit sketches for the second activity, thereby
developing more complex design ideas. Emergent properties of the sketches seem to have
facilitated focus shifts whereas spatial relations encouraged exploring related ideas. Goldschmidt
(1991) observes that “seeing as” and “seeing that” are two ways of inspecting sketches drawn
during design. The former is a powerful means of visual thinking for the designer to gain access
to related mental images that may potentially trigger new ideas in the design problem being solved.
Akin and Lin (1995) observe from an experimental study that novel design decisions most often
occurred when designers were engaged in three activities - drawing, thinking and examining. Do
and Gross (1996; 1997) discuss how diagrams, drawing, and inspecting sketches function as
vehicles for design reasoning in architecture.
For a children’s developmental perspective on drawing, see Van Sommers (1984). Qin and Simon
(1995) report a study in which diagrams drawn by subjects, along with other evidence, were used
to investigate how they formed and used mental images in the course of trying to understand
complex technical material on relativity. They found, similar to Akin and Lin, that forming images,
watching images and reasoning were the three main activities, and that the kind of images formed
had an influence on problem solving performance. The paucity of literature on the role of diagram
generation in domains other than architecture clearly indicates that this is an issue that deserves
future research attention.
Educational uses of diagrammatic representations
Are diagrams useful for instruction? Judging by the abundance of textbooks containing diagrams,
it seems that they certainly are. Diagrams are very commonly used in technical domains to
introduce students to key aspects. It is assumed that the selectivity and graphical presentation of
diagrams will provide instructional benefits. However, many recent studies, a sample of which I
review here, indicate that diagrammatic representations can neither be a panacea nor replace text
and other components of good instructional practice.
Weather maps are relatively simple diagrams that encode complex atmospheric processes. By
examining a weather map, professional meteorologists are able to predict how it will evolve as a
result of these processes. This is a sophisticated kind of diagrammatic reasoning due to the
complexity of the processes represented by the diagrams. Lowe has been investigating how novice
students make sense of weather maps compared to expert meteorologists (Lowe, 1989; 1993;
1994a; 1994b; 1996a). He discovered that students, lacking in domain-specific knowledge,
interpreted these diagrams based on low level graphical characteristics alone and their
interpretations were constrained by the diagram’s selectivity and were therefore spatially and
temporally impoverished. Experts, on the other hand, used rich domain knowledge to elaborate the
structures present in diagrams to construct interpretations incorporating the broader meteorological
situation implied by the diagram. When dynamic diagrams (interactive animated weather maps)
were employed to counter the problem, mixed results were obtained (Lowe, 1996b). It was found
that the realistic appearance of dynamics shown by the animations resulted in novices’
inappropriate attributions of cause-effect relations to entities that were not so related.
Hyperproof (Barwise & Etchemendy, 1994) is a system that was designed to teach logic to
students by allowing them to construct logic proofs assisted by the computer. Its
representations consist of geometric diagrams and logic sentences. Its diagrams show
configurations of three-dimensional geometric objects. In studying how well it helps students learn
logic in comparison with traditional teaching methods, Stenning and colleagues (Cox, Stenning &
Oberlander, 1995; Stenning, Cox & Oberlander, 1995) found large individual differences between
students in how they respond to teaching in the graphical and sentential modalities, and those
differences extended to self-generated external representations produced during untutored problem
solving. Students classified as visualizers performed better than students classified as verbalizers
following a logic course taught using Hyperproof. Interestingly, visualizers did not show a
preference for the graphical modality over verbalizers. But they were more adept at strategically
switching between diagrammatic and sentential representations compared to verbalizers.
Researchers in software visualization in general (Price, Baecker & Small, 1993), and algorithm
animation in particular (Brown & Sedgewick, 1985), strive to create static and dynamic diagrams
for displaying the structure and operation of algorithms and software for the benefit of humans. In
the case of algorithm animations, the target audience is students. The implicit hope of early
designers of algorithm animation systems had been that these would prove to be beneficial
instructional systems that make it easy for students to learn what was otherwise conceptually
complex material. This hope was based on the assumption that animated diagrams, by providing
concrete visualizations of abstract mathematical operations, would automatically make those more
comprehensible. Recent research on the cognitive effectiveness of these animations has, however,
also unearthed mixed results (Byrne, Catrambone & Stasko, 1996; Lawrence, Badre & Stasko,
1994; Stasko, Badre & Lewis, 1993). The researchers concluded, as in the case of weather maps,
that students’ prior knowledge was an important determinant of the benefits they derived from an
animation. Too little of it hampered a student’s ability to understand the algorithm-to-diagram
mapping and use it to comprehend the operation of the algorithm; too much of it made the
animation redundant and uninteresting to a student. They conjecture that for novice students
comprehensive motivational instruction needs to accompany the animation, animation design needs
to be tied closely to instructional goals, and that allowing students to interact with or even build
their own animations may prove to be more effective than passive viewing. Petre, Blackwell and
Green (1997) raise a number of related open questions regarding the cognitive aspects and benefits
of software visualization.
In summary, the question of whether diagrams are useful for instruction may be refined to
questions such as:
Are students able to more easily understand material from diagrams?
Are diagrammatic representations equally useful to all students?
Do animated diagrams make conceptually complex material easy to grasp?
In each case the answer appears not to be an unequivocal yes. Instead it is a qualified yes: provided
the students have sufficient domain-specific knowledge of the subject and of the subject-to-diagram
mapping to interpret diagrams correctly, provided the students’ cognitive style is visually oriented,
and provided the students already have enough understanding of the basics to exploit information
presented by the animation. These studies covered both static and dynamic diagrams, as well as
pure diagrams and mixed diagrammatic and sentential representations. The general conclusion
ought to be that unless a student is equipped to carry out appropriate interpretation and
comprehension processes when given a diagram that represents information about a scientific or
technical domain, the diagram’s potential as an instructional resource may be in question regardless
of whether it is static or dynamic, or used by itself or in conjunction with other kinds of
representations. A number of additional open questions regarding diagrammatic representations in
education are raised by Brna, Cox and Good (1997).
Historical investigations of diagram use
Historical and anecdotal evidence provide plenty of pointers to many famous scientists, including
Galileo and Tesla, having relied upon mental imagery and external diagrammatic representations on
their way to major discoveries and inventions (Nersessian, 1995; West, 1991). Cheng (1996a)
proposes that a formal diagrammatic representational system called law encoding diagrams
characterizes many diagrams constructed by scientists. The historical records left by some of these
scientists have been extensively studied. Nersessian (1997) has investigated intermediate
representations, including diagrams, that Maxwell employed. The work of Faraday has been
analyzed by Gooding (1996) and others (Gooding & James, 1985). Because one has to rely almost
exclusively on historical records such as scientists’ notes, this remains a difficult subarea that has
seen relatively little research.
4.3 Computational Research
Research on diagram-related computational processes may be classified in terms of work in
computer graphics that pertains to the creation and manipulation of various kinds of two- and
three-dimensional diagrams and animations, the extensive research on parsing, interpreting, compiling
and executing visual programming languages, work in human-computer interaction and direct
manipulation interfaces that deal with diagrammatic (or iconic) interfaces, and work in artificial
intelligence on systems that can understand and reason with various kinds of diagrammatic
representations. This provides us with another part of the taxonomy, as shown in Figure 10.
Computer graphics and animation is an independent and vibrant research area in its own right, so I
exclude it from this chapter.
[Figure 10: the "Computational aspects" node of the taxonomy, with four branches: generating,
manipulating and animating diagrams; visual programming languages; diagrammatic interaction and
interfaces; intelligent diagrammatic reasoning systems.]
Figure 10. Secondary Levels of a Taxonomy for Diagrammatic Communication Research
Visual programming languages
Diagrams have always been used in computer science to represent data structures, the flow of
control in programs, and the flow of data through program components. A variety of diagrammatic
representational systems have been proposed for these purposes. Of these diagram types,
flowcharts are perhaps the best known example, but other kinds of diagrams such as state
transition diagrams, data flow diagrams and Petri net diagrams can also be easily found in most
computer science textbooks. These diagrams have generally been used to improve human-human
communication, for example, in instruction. Researchers investigating visual programming
languages want to take such diagrammatic representations one step further, to serve as visual
expressions with which one communicates with, and develops programs for, the computer. In
other words, the aim is to develop human-computer diagrammatic communication in service of the
programming activity. Since this is a subarea that has been independently developing for more than
a decade, I will only provide pointers here. The interested reader may start with two recent
comprehensive surveys of the field (Marriott, Meyer & Wittenberg, 1997; Narayanan & Hübscher,
1997) and move on to a two volume tutorial (Glinert, 1990) for details. Two excellent sources of
current information are a series of annual conferences devoted to the topic (IEEE, serial) and a
journal (JVLC, serial).
Diagrammatic interaction and interfaces
Using diagrammatic representations to facilitate interaction and communication has been a
predominant theme in the research on human-computer interaction (HCI) and graphical user
interfaces (GUIs). The paradigm of representing states of a program or a computer (e.g., its file
structure) graphically using diagrammatic representations called icons and allowing the user to
communicate operations to the computer and the computer to communicate results of the operations
to the user by manipulating these icons originated with a program called Pygmalion (Smith, 1977).
The evolution of this idea into the notion of direct manipulation (Shneiderman, 1983) has
revolutionized the development of user-friendly interfaces.
Direct manipulation allows users to execute actions by directly interacting with visually displayed
objects instead of having to describe the action. For instance, marking a file for deletion in a
graphical user interface can often be done by dragging the file’s icon into an icon depicting a trash
can. This executes the command (move <file> trash-folder). With direct manipulation
users can “depict” the operations they desire; without it they would have to describe the command
in text to an interpreter that translated and executed the command. The cognitive benefits of direct
manipulation arise from the elimination of the need to move between two vastly different
representations: the natural diagrammatic representation (for example, the metaphorical desktop
employed these days by almost all PCs) which permits interactive gestures and the underlying
textual program inside the computer which specifies the objects and behaviors visible on the
screen.
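The trash-can example can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the Icon class, the GESTURE_TABLE mapping and the command_for_drag function are hypothetical names introduced here, not part of any actual interface toolkit. It shows how a depicted gesture (dragging one icon onto another) might be translated into the textual command that the user would otherwise have to describe.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Icon:
    """A diagrammatic stand-in for an object (hypothetical, for illustration)."""
    name: str   # label of the object the icon represents
    kind: str   # e.g. "file" or "trash"


# Hypothetical table: (kind of dragged icon, kind of drop target) -> command template.
GESTURE_TABLE: Dict[Tuple[str, str], str] = {
    ("file", "trash"): "(move {name} trash-folder)",
}


def command_for_drag(source: Icon, target: Icon) -> Optional[str]:
    """Translate a drag-and-drop gesture into the command it depicts.

    Returns None when the gesture has no meaning in the table.
    """
    template = GESTURE_TABLE.get((source.kind, target.kind))
    if template is None:
        return None
    return template.format(name=source.name)


report = Icon("report.txt", "file")
trash = Icon("Trash", "trash")
print(command_for_drag(report, trash))   # (move report.txt trash-folder)
print(command_for_drag(trash, report))   # None
```

In this sketch the user never sees the command string; the translation from depiction to description happens entirely inside the interface.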
Such interfaces are useful and usable because they have the WYSIWYG property. WYSIWYG is
an abbreviation of “what you see is what you get” - a common phrase used to describe many
graphical user interfaces. This phrase arose from the fact that many GUIs are designed to show
exactly what the user can expect from an action. Printing a document will result in a paper version
that looks exactly like how the document appeared on the screen of the word processor. To print a
file one might click on its icon - a diagrammatic representation - and drag it into a visible printer
icon on the computer screen - another diagrammatic representation. These kinds of graphical
interfaces exemplify the intuitive appeal diagrammatic representations hold: these explicitly depict
what is being represented. Unlike textual labels that bear no visual resemblance to the object being
described, icons look like the real thing and so icon manipulation is much more intuitive. In fact,
an intuitively compelling reason for the widely prevalent use of diagrams for representing and
conveying information in all areas is the WYSIWYG property. Shneiderman (1997), Mullet and
Sano (1995), and Hix and Hartson (1993) are a few of the many excellent books available on HCI
and GUI.
The increasing popularity of interactive graphical simulations that combine notions of diagrammatic
reasoning and direct manipulation is an emerging trend in HCI. These have become an important
tool in education (CACM, 1996). Many user-friendly tools for creating such simulations are
becoming available. Some examples are KidSim™ from Apple Computer (Smith, Cypher &
Spohrer, 1994), Star Logo from MIT Media Laboratory (Resnick, 1996), and Agentsheets from
the University of Colorado (Repenning & Sumner, 1995). Such simulations employ direct
manipulation techniques to diagrammatically represent (on the computer display) the model to be
simulated, and allow the user to directly interact with it and produce animations depicting
time-varying processes such as predator-prey relationships in an ecosystem. Since users can directly
manipulate objects in the simulation without having to access their programmatic representations,
the need to move between two vastly different representations - the natural dynamic visual
representations of processes being simulated and the underlying textual program of the
simulation - is eliminated.
Intelligent diagrammatic reasoning systems
Since this subarea is well covered by many publications (e.g., Glasgow, Narayanan &
Chandrasekaran, 1995; Kulpa, 1994; Narayanan, 1992), I will discuss only a small sample of
systems here to provide a flavor of the work.
To prove theorems in elementary Euclidean geometry, Gelernter’s geometry theorem proving
machine (Gelernter, 1963; Gelernter, Hansen & Loveland, 1963) used a backward reasoning
strategy of working from the goal to be proved toward the premises and axioms. During this
process the program used geometry diagrams in two ways. One was to prune the search space by
rejecting any subgoal that was not true in the diagram. The other was to shorten inference by
assuming facts that were obviously true in the diagram. Thus, the geometry theorem proving
machine used diagrams as a resource for constraining symbolic reasoning. Another way in which
diagrams influence symbolic reasoning was discovered by Koedinger and Anderson (1990). They
found that human experts are able to recognize patterns, which they call diagram configurations, in
the diagrams of geometry proof problems, and these patterns cued relevant problem solving
knowledge from memory and thereby reduced search. The geometry theorem proving machine was
one of the first computer programs that used diagrams intelligently to aid problem solving. Not
surprisingly, geometry remains the most popular domain of application for intelligent diagrammatic
reasoning systems (Lindsay, in press; McDougal & Hammond, 1993; Kim, 1989).
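The first of the two uses of the diagram described above, rejecting subgoals that are false in the diagram, can be sketched as follows. This is a minimal Python illustration and not a reconstruction of Gelernter's program: the rule base, the premises and the set of facts that hold in the diagram are all hypothetical, and goals are treated as opaque strings. The example assumes two segments AC and BD that bisect each other at E.

```python
from typing import Dict, FrozenSet, List, Optional, Set

Rules = Dict[str, List[List[str]]]   # goal -> alternative lists of subgoals


def prove(goal: str, rules: Rules, premises: Set[str],
          diagram: Optional[Set[str]], expanded: List[str],
          trail: FrozenSet[str] = frozenset()) -> bool:
    """Backward-chain from the goal toward the premises.

    When a diagram is supplied, a goal that does not hold in it is
    rejected before it is expanded. Every goal that is actually
    examined is recorded in `expanded`.
    """
    if diagram is not None and goal not in diagram:
        return False                      # pruned: false in the diagram
    expanded.append(goal)
    if goal in premises:
        return True
    if goal in trail:                     # guard against circular reasoning
        return False
    for subgoals in rules.get(goal, []):
        if all(prove(s, rules, premises, diagram, expanded, trail | {goal})
               for s in subgoals):
            return True
    return False


# Hypothetical rule base; the first alternative for "AB = CD" is a dead end.
CONGRUENT = "triangle ABE congruent to triangle CDE"
RULES: Rules = {
    "AB = CD": [["AB = XY", "XY = CD"], [CONGRUENT]],
    "AB = XY": [["AB = PQ", "PQ = XY"]],
    CONGRUENT: [["AE = CE", "angle AEB = angle CED", "BE = DE"]],
}
PREMISES = {"AE = CE", "BE = DE", "angle AEB = angle CED"}
# Facts that happen to be true in the particular diagram drawn for the problem.
DIAGRAM = PREMISES | {"AB = CD", CONGRUENT}

blind: List[str] = []
guided: List[str] = []
print(prove("AB = CD", RULES, PREMISES, None, blind))      # True
print(prove("AB = CD", RULES, PREMISES, DIAGRAM, guided))  # True
print(len(blind), len(guided))                              # 7 5
```

Both searches succeed, but the diagram-guided one never expands the dead-end subgoals, because they are false in the diagram. In a realistic rule base the saving would depend on how many false subgoals the rules generate.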
Another popular domain has been qualitative spatial reasoning (Forbus, 1995). The earliest system
in this domain was WHISPER (Funt, 1980), which could solve problems of motion and stability
of variously shaped blocks in a two-dimensional world. This system is interesting in that, unlike
most systems that directly process externally provided diagrammatic representations, it contains an
artificial polar retina which can scan the input diagram, focus its attention on various parts of the
diagram, read off information, and engage in visualization by manipulating the diagram. Though
the system’s domain was somewhat simplistic, it served to explicate the potential power of
diagrammatic reasoning. Narayanan and Chandrasekaran (1991) describe a system called DR that
takes a different approach to the same problem, one more congruent with the literature on mental
imagery: it activates a network of diagrammatic schemas called visual cases in order to find and
apply matching cases. REDRAW (Tessler, Iwasaki & Law, 1995) is a system that takes as input
diagrams that civil engineers typically draw to depict frame structures under load. The system uses
the diagram to extract constraints that are then applied to symbolic knowledge to derive how the
frame will deform. This knowledge is applied to the diagram to make suitable modifications. Yip
(1991) reports on a program called KAM for the qualitative analysis of nonlinear systems, which
incorporates in its repertoire a capability to reason about spatial and geometric aspects of the phase
space diagrams of such systems.
BEATRIX (Novak & Bulko, 1993) is the only computer program reported in the literature that can
process multimodal input, that is, diagram and text together. It accepts physics problems stated in the form
of a diagram accompanied by a textual explanation, as is typically found in physics textbooks. It
then co-parses the text and the diagram, resolving coreferences along the way, and constructs a
unified internal representation of the problem in a form suitable for another computer program that
then solves it.
One of the most interesting recent diagrammatic systems is the Electronic Cocktail Napkin (Do,
1995; Gross, 1995; Gross, 1996), intended as “electronic paper” for the architect or designer. Its
input devices are a digitizing tablet and a cordless pen. It can recognize geometric elements of
simple sketches users draw on the tablet. It can also be trained to recognize idiosyncratic personal
symbols. The system can perform visual database searches, recognize and automatically maintain
spatial relations between elements in a sketch (thereby relieving the designer of some tedium), and
link designer’s sketches to a simulation environment.
Several systems that employ techniques from computer graphics and computer vision as well as
artificial intelligence to “understand” technical drawings in the engineering domain have been
reported (Dori, 1992; Joseph & Pridmore, 1992; Vaxiviere & Tombre, 1992).
Anderson and McCartney (1995; 1997) address diagrammatic reasoning from a novel perspective -
that of reasoning with and learning from multiple diagrams. Using a simple representation for
diagrams (a two-dimensional array of integers representing gray scale values of pixels) and
compositions of simple diagrammatic operators, they are able to develop systems that operate in a
variety of domains such as game playing, music notation, and weather prediction.
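To give a feel for this style of representation, the Python sketch below treats a diagram as a two-dimensional array of integer gray-scale values and builds two pixelwise operators that can be composed over a collection of diagrams. The operator names (overlay, common, accumulate) and the choice of max and min as the pixel functions are assumptions made for illustration; they are not taken from Anderson and McCartney's definitions.

```python
from functools import reduce
from typing import Callable, List

Diagram = List[List[int]]   # gray-scale values; 0 is taken to mean "white"


def combine(op: Callable[[int, int], int], a: Diagram, b: Diagram) -> Diagram:
    """Apply a binary pixel operator to two diagrams of equal size."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("diagrams must have the same dimensions")
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def overlay(a: Diagram, b: Diagram) -> Diagram:
    """Keep the darker pixel of the two (assumed here to be the larger value)."""
    return combine(max, a, b)


def common(a: Diagram, b: Diagram) -> Diagram:
    """Keep only what is marked in both diagrams."""
    return combine(min, a, b)


def accumulate(op: Callable[[Diagram, Diagram], Diagram],
               diagrams: List[Diagram]) -> Diagram:
    """Fold a diagram operator over a non-empty sequence of diagrams."""
    return reduce(op, diagrams)


d1 = [[0, 1], [2, 0]]
d2 = [[1, 0], [2, 0]]
d3 = [[0, 0], [2, 3]]
print(accumulate(overlay, [d1, d2, d3]))   # [[1, 1], [2, 3]]
print(accumulate(common, [d1, d2, d3]))    # [[0, 0], [2, 0]]
```

Folding an operator over several diagrams is one simple way to extract what a set of diagrams has in common or to superimpose them, which is the kind of multi-diagram processing the paragraph above refers to.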
5 Conclusion
In this section I highlight open issues for future research revealed by the research taxonomy
presented earlier, discuss special characteristics of diagrammatic representations that enable these to
support and enhance the cognitive abilities of people, and list a number of information resources on
the topic for the motivated reader to follow.
Though the taxonomy proposed in this chapter is a preliminary one containing only two levels of
categories, it already reveals potential avenues for future research. While quite a bit of work on
characterizing the syntax and semantics of static diagrams has been done, the corresponding
questions for dynamic diagrams remain open and are growing in importance, given the current explosion
of interest in multimedia and animation. Only very few, from among the variety of diagrammatic
representational systems in formal and informal use in various disciplines, have been analyzed in
depth. It is clear that this is a large uncharted territory. Another area that deserves significant future
effort is analyzing the formal properties of diagrammatic interfaces and human-computer interaction
through them, with a view to providing a strong theoretical foundation for the design and
evaluation of such interfaces. Again, there is very little current work on this issue.
In terms of psychological research, diagrammatic comprehension and reasoning have been studied
so far only for a few specific representational systems such as schematic cross sectionals and
graphs. As in the case of formal analyses of such representational systems, there is a gap in our
knowledge of cognitive processes involved in human diagrammatic communication using the
variety of diagrammatic representational systems of different disciplines. A combination of formal
and empirical investigations is required to answer questions such as whether there are general
cognitive and computational processes involved in diagrammatic communication applicable across
multiple representational systems. Another fascinating open topic is the role of diagram generation
and manipulation in creative thinking and problem solving.
Intuitively it appears that diagrams ought to aid learning. Indeed, it is hard to find textbooks in any
discipline that are devoid of diagrams. On the other hand, as indicated in an earlier section, many
recent studies paint, at best, a somewhat mixed picture. Are our intuitions about the cognitive
benefits of static and dynamic diagrams wrong? Or is it that there is more to be learned about how
to design good diagrams and animations to help novice students? A lot more research is needed
before these questions can be definitively answered. Some help in this might come from studies by
historians of science on how scientists have in the past used diagrams during the course of their
investigations. While there are plenty of anecdotes about how diagrams and imagery might have
played significant roles in critical insights of these people (e.g., a dream about a snake eating its
own tail prompting Kekule’s discovery of the structure of the benzene ring), extensive studies are
confined to only a few scientists. This is another open territory.
As far as computational aspects of diagrammatic communication are concerned, one problem is the
lack of communication and cross fertilization among the areas of computer graphics, visual
programming languages and human-computer interaction, which have progressed relatively
independently of one another. Also, considering that more than thirty years have passed since
Gelernter’s pioneering work was published, there is still relatively little research on intelligent
diagrammatic reasoning and communication systems.
The following characteristics of diagrammatic representations appear to be primarily responsible
for why these afford effective comprehension and reasoning:
• explicit representation of information via visually perceivable aspects,
• spatially localized organization of related information,
• visual cueing of relevant prior knowledge,
• facilitation of mental animation, and
• reduction of complexity through constraining and guiding reasoning.
The explicitness of diagrammatic representations, especially when the spatio-visual relations used
to encode information are visually analogous to the information being represented, facilitates
comprehension. The availability of information to be “read out” from diagrams and other kinds of
pictures is probably the reason behind the saying “a picture is worth ten thousand words”. This
point is also illustrated by the mental exercise of trying to translate a complex diagram, such as an
artist’s sketch of a natural scene, into a set of sentences. It becomes immediately clear that
information loss will occur in this depiction-to-description translation. And once information has
been discarded or lost in translation, the reverse translation cannot be done uniquely.
Diagrammatic representations help reduce search for information because they permit related
information to be spatially organized in proximity. Spatio-visual properties such as color or texture
not only convey information but also draw the reasoner’s attention to relevant objects and
properties. Larkin and Simon (1987) provide an analysis of how spatial adjacency and
connectedness can be used to reduce the complexity of reasoning in solving physics problems
when the problem descriptions are accompanied by diagrams. Scientific visualizations are a good
example of how spatial organization together with spatio-visual properties such as color and
density can be effectively used to convey considerable information in a concise manner.
Besides encoding information, elements of a diagram act as cues that help retrieve relevant prior
knowledge from memory. Thus, diagrammatic representations function not only as effective
representations of information, but also as effective probes into memory that aid the reasoner in
retrieving and applying relevant prior knowledge to the task at hand. This effect has been
experimentally observed in both expert geometry problem solving and naive mechanical problem
solving.
When diagrams analogously represent entities of the world and their properties, one is able to
transform the diagram mentally (mental animation) or externally (computer animation or sketching)
in order to reason about dynamic processes in the world that change the represented entities and
properties. This enables one to make inferences about the evolution of the represented world based
on evolution of the diagram under constraints mirroring those that apply to the processes in the
world. This has been illustrated by research on diagrammatic reasoning about geometry and about
spatial behaviors of mechanical devices.
A series of experiments on how people reason about mechanical devices from cross-sectional
schematic diagrams have shown that diagrams guide the reasoning process along the lines of causal
propagation in the operation of the device. People use an incremental reasoning strategy of
predicting behaviors of individual components and propagating these to other components by
exploiting spatial cues of adjacency and connectedness explicit in the diagram during mental
animation. This strategy works most of the time because diagrams organize components spatially
with component depictions reflecting their spatial organization; thus reasoning for inferring events
in the operation of the device is constrained by the structure of the diagram. More generally, when
the diagram of a problem organizes its elements in a way that corresponds to the internal structure
of the problem, one can follow this structure to find a path to the solution. Even when given a
descriptive representation of certain kinds of problems, it has been observed that people tend to
mentally image diagrammatic representations in order to solve them (Huttenlocher, 1968). Thus for
diagrammatic representations, the explicit availability of information, spatially localized
organization of information, visual cueing of relevant prior knowledge, and support for
constrained mental animation together serve to reduce the complexity of searching (in memory or in
the diagram) for relevant information, and the complexity of reasoning.
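The incremental strategy described above, predicting one component's behavior and propagating it to adjacent components, can be sketched computationally. The Python fragment below is an illustrative assumption rather than a model from the experiments cited: it takes a hypothetical train of meshed gears as the device, represents the adjacency visible in the diagram as a graph, and propagates rotation direction outward from the driven gear.

```python
from collections import deque
from typing import Dict, List


def propagate_rotation(meshes: Dict[str, List[str]], driver: str,
                       direction: str = "cw") -> Dict[str, str]:
    """Infer each gear's rotation by following adjacency from the driver.

    `meshes` lists, for every gear, the gears it touches in the diagram.
    Two meshed gears always turn in opposite directions.
    """
    flip = {"cw": "ccw", "ccw": "cw"}
    state = {driver: direction}
    queue = deque([driver])
    while queue:
        gear = queue.popleft()
        for neighbor in meshes.get(gear, []):
            if neighbor not in state:          # reason about each gear once
                state[neighbor] = flip[state[gear]]
                queue.append(neighbor)
    return state


# Hypothetical device: gear A drives B, which drives C.
train = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(propagate_rotation(train, "A"))   # {'A': 'cw', 'B': 'ccw', 'C': 'cw'}
```

The search never has to consider pairs of components that are not adjacent in the diagram, which is the sense in which the diagram's structure constrains the reasoning.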
To see this for yourself, consider Figure 11 below. It shows two angles x and y formed by two
parallel lines a and b and an intersecting line c. Given this description we know that the two angles
are equal. Suppose you are now asked to answer yes or no to the following questions:
If line c is moved up some distance so that it still intersects lines a and b, will x and y be equal?
If line c is rotated about an end point so that it still intersects lines a and b, will x and y be equal?
We can answer these quickly because of our ability to mentally simulate transformations of
diagrams. Starting with an informationally equivalent sentential description of the situation in
Figure 11 the same inferences can be made, but not as easily and directly.
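[Figure 11: two parallel lines a and b crossed by a line c, with the angles x and y marked at the
two intersections.]
Figure 11. A Mental Animation Problem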
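For comparison with the mental simulation, the same two questions can be checked numerically. The Python sketch below rests on assumptions introduced only for this illustration: line a is placed at y = 0 and line b at y = 1, line c is given by a point it passes through and its inclination, and x and y are taken to be corresponding angles measured against the common direction of a and b.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def angle_deg(d1: Point, d2: Point) -> float:
    """Angle in degrees between two direction vectors."""
    dot = d1[0] * d2[0] + d1[1] * d2[1]
    cos = dot / (math.hypot(*d1) * math.hypot(*d2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def crossing(point: Point, theta_deg: float, k: float) -> Point:
    """Point where line c (through `point`, inclined at theta_deg) meets y = k."""
    t = math.radians(theta_deg)
    if abs(math.sin(t)) < 1e-12:
        raise ValueError("c is parallel to a and b and meets neither")
    s = (k - point[1]) / math.sin(t)
    return (point[0] + s * math.cos(t), k)


def angles_x_y(point: Point, theta_deg: float) -> Tuple[float, float]:
    """Angles that c makes with a (y = 0) and with b (y = 1)."""
    p = crossing(point, theta_deg, 0.0)      # where c meets a
    q = crossing(point, theta_deg, 1.0)      # where c meets b
    along_c = (q[0] - p[0], q[1] - p[1])
    x = angle_deg(along_c, (1.0, 0.0))       # measured at p, against a
    y = angle_deg(along_c, (1.0, 0.0))       # measured at q; b has a's direction
    return x, y


print(angles_x_y((0.0, 0.0), 60.0))    # original position of c
print(angles_x_y((0.0, 0.5), 60.0))    # c moved up: both angles unchanged
print(angles_x_y((0.0, 0.0), 110.0))   # c rotated: both angles change together
```

The calculation makes explicit why the answer to both questions is yes: moving c changes where it crosses a and b but not its direction, and rotating c changes its direction relative to both parallel lines by the same amount.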
Diagrams facilitate situated reasoning because these make serendipitous inferences, inferences by
recognition, prediction by mental visualization, and cueing prior knowledge possible. As a first
step toward developing a general cognitive theory of effective diagrammatic representations, Cheng
(1996c) has proposed twelve ways in which diagrams can support human problem solving:
showing spatial structure, capturing physical relations, showing physical assembly, delineating
elements, displaying values, depicting states, depicting state spaces, encoding temporal aspects,
abstracting process information, capturing laws, doing computations, and sequencing
computations. All these nice properties, of course, do not imply that diagrams are the best
representation for all tasks. Diagrams can mislead as well as lead. Investigating the significance of
all these aspects in various contexts of diagrammatic communication - in different disciplines and
using various kinds of diagrammatic representational systems - and developing guidelines for the
design of “good” diagrams that exploit these properties are excellent avenues for future research.
It must however be noted that the human visual system is quite sophisticated and attuned to the
perceptual modality. For such information processors diagrammatic representations can be more
efficient than sentential ones, but the opposite situation holds for processors that are attuned to
propositional representations, such as computers. Thus, informationally equivalent representations
can have different computational complexity depending on the operations performed on them and
the nature of the underlying information processing architecture that performs the operations
(Larkin & Simon, 1987). This probably explains why it has proved much harder to develop
intelligent diagrammatic systems than intelligent symbolic systems.
Additional resources. Though interest in diagrams goes back a long way, the relatively recent
convergence of the disciplines of psychology, philosophy, linguistics and artificial intelligence
under the umbrella of cognitive science, and society’s increasing reliance on multimedia
information, have provided the necessary impetus for a resurgence of interest in the topic. There
are several resources that the serious reader can now follow to learn more about recent research.
Several books and monographs have been published (Allwein & Barwise, 1996; Glasgow,
Narayanan & Chandrasekaran, 1995; Hammer, 1995; Shin, 1995). There is a world wide web site
devoted to the topic (http://uhavax.hartford.edu/Diagrams) from which one can also access an
electronic discussion list. One journal special issue (Narayanan, 1993) has been published, and
five workshops exclusively devoted to the topic have been held during the last five years;
proceedings of two are available in print (Damski & Narayanan, 1996; Narayanan, 1992) and that
of another is accessible through the web (Blackwell, 1997). Besides, a number of conferences
(e.g., Annual Conferences of the Cognitive Science Society, International Joint Conferences on
Artificial Intelligence, AAAI National Conferences on Artificial Intelligence, IEEE Annual
Symposia on Visual Languages) contain papers and tracks on diagrammatic communication.
Acknowledgments. I would like to thank Boicho Kokinov for his invitation to present a tutorial
course on diagrammatic reasoning at the Third International Summer School in Cognitive Science,
and the faculty and students of the Cognitive Science Department, New Bulgarian University for
organizing and running the summer school. Financial support for attending the summer school was
provided by the Open Society Foundation, Sofia; thanks go to Maria Popova. The preparation of
this chapter was supported in part by grants from the Office of Naval Research (contract N0001496-11187) and the National Science Foundation (contract CDA-9616513).
REFERENCES
Akin, O. & Lin, C. (1995). Design protocol data and novel design decisions. Design Studies,
16(2), pp. 211-236.
Allwein, G. & Barwise, J. (Eds.) (1996). Logical Reasoning with Diagrams, New York: Oxford
University Press.
Anderson, M. & McCartney, R. (1995). Inter-diagrammatic reasoning. Proc. 14th International
Joint Conference on Artificial Intelligence (IJCAI-95), Mountain View, CA: Morgan
Kaufmann.
Anderson, M. & McCartney, R. (1997). Learning from diagrams. Machine Graphics and Vision,
in press.
Arnheim, R. (1969). Visual Thinking, Berkeley, CA: University of California Press.
Barwise, J. & Etchemendy, J. (1994). Hyperproof, Cambridge, England: Cambridge University
Press.
Bertin, J. (1981). Graphics and Graphic Information Processing, English translation by W. J.
Berg and P. Scott, Berlin: Walter de Gruyter.
Bertin, J. (1983). Semiology of Graphics: Diagrams, Networks, Maps, English translation by W.
J. Berg, Madison, WI: University of Wisconsin Press.
Biederman, I. (1987). Recognition-by-components. A theory of human image understanding.
Psychological Review, 94, pp. 115-147.
Blackwell, A. F. (1997). Proceedings of the Thinking with Diagrams Workshop, Portsmouth,
England, URL http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Block, N. (1981). Imagery, Cambridge, MA: MIT Press.
Bottoni, P., Costabile, M. F., Levialdi, S. & Mussio, P. (1997). Specification of visual languages
as means for interaction. In K. Marriott and B. Meyer (Eds.), Visual Language Theory,
Berlin: Springer-Verlag.
Brna, P., Cox, R. & Good, J. (1997). Learning to think and communicate with diagrams.
Discussion paper prepared for the Thinking with Diagrams Workshop, Portsmouth, England,
available at http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Brown, M. H. & Sedgewick, R. (1985). Techniques for algorithm animation. IEEE Software,
2(1), pp. 28-38.
Byrne, M. D., Catrambone, R. & Stasko, J. T. (1996). Do algorithm animations aid learning?
Technical Report GIT-GVU-96-18, GVU Center, Georgia Institute of Technology, Atlanta,
GA.
CACM, (1996). Special section on educational technology, Communications of the ACM, 39(4).
Cheng, P. C.-H. (1996a). Scientific discovery with law encoding diagrams. Creativity Research
Journal, 9(2/3), pp. 145-162.
Cheng, P. C.-H. (1996b). Law encoding diagrams for instructional systems. Journal of Artificial
Intelligence in Education, 7(1), pp. 33-74.
Cheng, P. C.-H. (1996c). Functional roles for the cognitive analysis of diagrams in problem
solving. Proc. 18th Annual Conference of the Cognitive Science Society, Hillsdale, NJ:
Lawrence Erlbaum, pp. 207-212.
Cornoldi, C. & McDaniel, M. A. (Eds.) (1991). Imagery and Cognition, New York:
Springer-Verlag.
Cox, R., Stenning, K. & Oberlander, J. (1995). The effect of graphical and sentential logic
teaching on spontaneous external representation. Cognitive Studies: Bulletin of the Japanese
Cognitive Science Society, 2(4), pp. 56-75.
Damski, J. & Narayanan, N. H. (Eds.) (1996). Proceedings of the AID’96 Workshop on Visual
Representation, Reasoning and Interaction in Design, Key Center for Design Computing,
University of Sydney, Sydney, Australia.
Do, E. Y-L. (1995). What is in a diagram that a computer should understand. The Global Design
Studio: Proc. 6th International Conference on Computer Aided Architectural Design Futures,
Singapore: National University of Singapore, pp. 469-482.
Do, E. Y-L. & Gross, M. D. (1996). Drawing as a means to design reasoning. In N. H.
Narayanan & J. Damski, (Eds.), Proc. AID’96 Workshop on Visual Representation,
Reasoning and Interaction in Design, Key Center for Design Computing, University of
Sydney.
Do, E. Y-L. & Gross, M. D. (1997). Thinking with diagrams in architectural design. Discussion
paper prepared for the Thinking with Diagrams Workshop, Portsmouth, England, available at
http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Dori, D. (1992). Dimensioning analysis: Toward automatic understanding of engineering
drawings. Communications of the ACM, 35(10), pp. 92-103.
Engelhardt, Y., Bruin, J., Janssen, T. & Scha, R. (1996). The visual grammar of information
graphics. In N. H. Narayanan & J. Damski, (Eds.), Proc. AID’96 Workshop on Visual
Representation, Reasoning and Interaction in Design, Key Center for Design Computing,
University of Sydney.
Finke, R. A. (1990). Creative Imagery: Discoveries and Inventions in Visualization, Cambridge,
MA: MIT Press.
Forbus, K. (1995). Qualitative spatial reasoning: Framework and frontiers. In J. Glasgow, N. H.
Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and
Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press,
pp. 183-204.
Funt, B. V. (1980). Problem solving with diagrammatic representations. Artificial Intelligence, 13,
pp. 201-230.
Gelernter, H. (1963). Realization of a geometry theorem proving machine. In E. A. Feigenbaum &
J. Feldman, (Eds.), Computers and Thought, New York: McGraw Hill, pp. 134-152.
Gelernter, H., Hansen, J. R. & Loveland, D. W. (1963). Empirical explorations of the geometry
theorem proving machine. In E. A. Feigenbaum & J. Feldman, (Eds.), Computers and
Thought, New York: McGraw Hill, pp. 153-163.
Glasgow, J., Narayanan, N. H. & Chandrasekaran, B. (Eds.) (1995). Diagrammatic Reasoning:
Cognitive and Computational Perspectives. Menlo Park, CA: AAAI Press and Cambridge,
MA: MIT Press.
Glinert, E. P. (1990). Visual Programming Environments, Vol. I: Paradigms and Systems, Vol.
II: Applications and Issues, Los Alamitos, CA: IEEE Computer Society Press.
Goel, V. (1995). Sketches of Thought, Cambridge, MA: MIT Press.
Goldschmidt, G. (1991). The dialectics of sketching. Creativity Research Journal, 4(2), pp. 123-143.
Gombrich, E. H. (1968). Art and Illusion: A Study in the Psychology of Pictorial Representations,
London: Phaidon.
Gooding, D. (1996). Scientific discovery as creative exploration: Faraday’s experiments.
Creativity Research Journal, 9(2/3), pp. 189-205.
Gooding, D. & James, F. J. L. (Eds.) (1985). Faraday Rediscovered: Essays on the Life and
Work of Michael Faraday, 1791-1867, London: Macmillan.
Goodman, N. (1969). Languages of Art: An Approach to a Theory of Symbols, London: Oxford
University Press.
Gross, M. D. (1995). Indexing visual databases of designs with diagrams. In A. Koutamanis, H.
Timmermans, & I. Vermeulen (Eds.), Visual Databases in Architecture, Aldershot, UK:
Avebury, pp. 1-14.
Gross, M. (1996). The electronic cocktail napkin: Computer support for working with diagrams.
Design Studies, 17(1), pp. 53-69.
Gurr, C. A. (1997). On the isomorphism (or otherwise) of representations. In K. Marriott and B.
Meyer (Eds.), Visual Language Theory, Berlin: Springer-Verlag.
Hammer, E. (1995). Logic and Visual Information. Studies in Logic, Language & Computation,
Palo Alto, CA: CSLI Publications, Stanford University.
Hegarty, M. (1992). Mental animation: Inferring motion from static displays of mechanical
systems. Journal of Experimental Psychology: Learning, Memory & Cognition, 18(5), pp.
1084-1102.
Hix, D. & Hartson, H. R. (1993). Developing User Interfaces: Ensuring Usability Through
Product & Process, New York: John Wiley & Sons, Inc.
Hübscher, R. (1997). Visual constraint rules. Journal of Visual Languages and Computing, in
press.
Huttenlocher, J. (1968). Constructing spatial images: A strategy in reasoning. Psychological
Review, 75(6), pp. 550-560.
IEEE. (serial). Proceedings of the IEEE Annual Symposium on Visual Languages, Los Alamitos,
CA: IEEE Computer Society Press.
Joseph, S. H. & Pridmore, T. P. (1992). Knowledge-directed interpretation of mechanical
engineering drawings. IEEE Trans. on Pattern Analysis and Machine Intelligence, 14(9), pp.
928-940.
JVLC. (serial). Journal of Visual Languages and Computing, London: Academic Press.
Kim, M. Y. (1989). Visual reasoning in geometry theorem proving. Proc. 11th International Joint
Conference on Artificial Intelligence, Mountain View, CA: Morgan Kaufmann, pp. 1617-1622.
Koedinger, K. R. & Anderson, J. R. (1990). Abstract planning and perceptual chunks: Elements
of expertise in geometry. Cognitive Science, 14, pp. 511-550.
Kosslyn, S. M. (1980). Image and Mind, Cambridge, MA: Harvard University Press.
Kosslyn, S. M. (1981). The medium and the message in mental imagery: A theory. Psychological
Review, 88(1), pp. 46-66.
Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate, Cambridge, MA:
MIT Press.
Kulpa, Z. (1994). Diagrammatic representation and reasoning. Machine Graphics and Vision,
3(1/2), pp. 77-103.
Larkin, J. H. & Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words.
Cognitive Science, 11, pp. 65-99.
Lawrence, A. W., Badre, A. M. & Stasko, J. T. (1994). Empirically evaluating the use of
animations to teach algorithms. Proc. IEEE Symposium on Visual Languages, Los Alamitos,
CA: IEEE Computer Society Press, pp. 48-54.
Lindsay, R. K. (in press). Using diagrams to understand geometry. Computational Intelligence.
Lohse, G. L., Biolsi, K., Walker, N. & Rueter, H. H. (1994). A classification of visual
representations. Communications of the ACM, 37(12), pp. 36-49.
Lowe, R. K. (1989). Search strategies and inference in the exploration of scientific diagrams.
Educational Psychology, 9, pp. 27-44.
Lowe, R. K. (1993). Constructing a mental representation from an abstract technical diagram.
Learning and Instruction, 3, pp. 157-179.
Lowe, R. K. (1994a). Selectivity in diagrams: Reading beyond the lines. Educational Psychology,
14, pp. 467-491.
Lowe, R. K. (1994b). Diagram prediction and higher order structures in mental representation.
Research in Science Education, 24, pp. 208-216.
Lowe, R. K. (1996a). Background knowledge and the construction of a situational representation
from a diagram. European Journal of Psychology of Education, 11, pp. 377-397.
Lowe, R. K. (1996b). Interactive animated diagrams: What information is extracted? Proc. Using
Complex Information Systems Symposium, University of Poitiers, Poitiers, France, pp. 40-45.
Marr, D. & Nishihara, H. K. (1978). Representation and recognition of the spatial organization of
three-dimensional shapes. Proceedings of the Royal Society, Vol. B 200, pp. 269-294.
Marriott, K. & Meyer, B. (Eds.) (1997). Visual Language Theory, Berlin: Springer-Verlag.
Marriott, K., Meyer, B. & Wittenberg, K. (1997). A survey of visual language specification and
recognition. In K. Marriott and B. Meyer (Eds.), Visual Language Theory, Berlin: Springer-Verlag.
Mayer, R. E. & Sims, V. K. (1994). For whom is a picture worth a thousand words? Extensions
of a dual-coding theory of multimedia learning. Journal of Educational Psychology, 86, pp.
389-401.
McDougal, T. F. & Hammond, K. J. (1993). Representing and using procedural knowledge to
build geometry proofs. Proc. 11th National Conference on Artificial Intelligence, AAAI’93,
Palo Alto, CA: AAAI Press.
Mullet, K. & Sano, D. (1995). Designing Visual Interfaces. SunSoft Press, Englewood Cliffs, NJ:
Prentice Hall PTR.
Narayanan, N. H. (Ed.) (1992). Proc. AAAI Spring Symposium on Reasoning with
Diagrammatic Representations, AAAI Technical Report SS-92-02, Menlo Park, CA: AAAI
Press.
Narayanan, N. H. (Ed.) (1993). Special issue on computational imagery. Computational
Intelligence, 9(4).
Narayanan, N. H. & Chandrasekaran, B. (1991). Reasoning visually about spatial interactions.
Proc. 12th International Joint Conference on Artificial Intelligence, Mountain View, CA:
Morgan Kaufmann, pp. 360-365.
Narayanan, N. H. & Hübscher, R. (1997). Visual language theory: Toward a human-computer
interaction perspective. In K. Marriott and B. Meyer (Eds.), Visual Language Theory, Berlin:
Springer-Verlag.
Narayanan, N. H. & Hegarty, M. (1997). On designing comprehensible interactive hypermedia
manuals. Under review.
Narayanan, N. H., Suwa, M. & Motoda, H. (1994a). How things appear to work: Predicting
behaviors from device diagrams. Proc. 12th National Conference on Artificial Intelligence,
Menlo Park, CA: AAAI Press, pp. 1161-1167.
Narayanan, N. H., Suwa, M. & Motoda, H. (1994b). A study of diagrammatic reasoning from
verbal and gestural protocols. Proc. 16th Annual Conference of the Cognitive Science
Society, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 652-657.
Narayanan, N. H., Suwa, M. & Motoda, H. (1995a). Diagram-based problem solving: The case
of an impossible problem. Proc. 17th Annual Conference of the Cognitive Science Society,
Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 206-211.
Narayanan, N. H., Suwa, M. & Motoda, H. (1995b). Behavior hypothesis from schematic
diagrams. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic
Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and
Cambridge, MA: MIT Press, pp. 501-534.
Nersessian, N. (1995). How do scientists think? Capturing the dynamics of conceptual change in
science. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic
Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and
Cambridge, MA: MIT Press, pp. 137-182.
Nersessian, N. (1997). Abstraction via generic modeling in concept formation in science. In N.
Cartwright and M. R. Jones (Eds.), Correcting the Model: Abstraction and Idealization in
Science, Amsterdam: Editions Rodopi.
Novak, G. S. & Bulko, W. C. (1993). Diagrams and text as computer input. Journal of Visual
Languages and Computing, 4, pp. 161-175.
Paivio, A. (1971). Imagery and Verbal Processes, New York: Holt, Rinehart and Winston.
Petre, M., Blackwell, A. F. & Green, T. R. G. (1997). Cognitive questions in software
visualization. In J. Stasko, J. Domingue, B. Price & M. Brown, (Eds.), Software
Visualization: Programming as a Multi-Media Experience, Cambridge, MA: MIT Press, in press.
Price, B. A., Baecker, R. M. & Small, I. S. (1993). A principled taxonomy of software
visualization. Journal of Visual Languages and Computing, 4(3), pp. 211-266.
Pylyshyn, Z. (1981). The imagery debate: analog media versus tacit knowledge. Psychological
Review, 88(1), pp. 16-45.
Qin, Y. & Simon, H. A. (1995). Imagery and mental models in problem solving. In J. Glasgow,
N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning: Cognitive and
Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press,
pp. 403-434.
Repenning, A. (1995). Bending the rules: Steps toward semantically enriched graphical rewrite
rules. Proc. IEEE Symposium on Visual Languages, Los Alamitos, CA: IEEE Computer
Society Press, pp. 226-233.
Repenning, A. & Sumner, T. (1995). Agentsheets: A medium for creating domain-oriented visual
languages. IEEE Computer, 28, pp. 17-25.
Resnick, M. (1996). Beyond the centralized mindset. Journal of the Learning Sciences, 5(1), pp.
1-22.
Roskos-Ewoldsen, B., Intons-Peterson, M. & Anderson, R. E. (Eds.) (1993). Imagery,
Creativity and Discovery: A Cognitive Perspective, Amsterdam: North Holland.
Russell, B. (1923). Vagueness. In J. Slater (Ed.), Essays on Language, Mind, and Matter
1919-1926, The Collected Papers of Bertrand Russell, London: Unwin Hyman, pp. 145-154.
Schwartz, D. L. & Black, J. B. (1996a). Analog imagery in mental model reasoning: Depictive
models. Cognitive Psychology, 30, pp. 154-219.
Schwartz, D. L. & Black, J. B. (1996b). Shuttling between depictive models and rules: Induction
and fallback. Cognitive Science, 20(4), pp. 457-498.
Schwartz, D. L. & Hegarty, M. (1996). Coordinating multiple representations for reasoning about
mechanical devices. In P. Olivier (Ed.), Cognitive and Computational Models of Spatial
Representation, AAAI Spring Symposia Technical Report SS-96-03, Menlo Park, CA: AAAI
Press.
Shah, P. (1995). Cognitive Processes in Graph Comprehension. Unpublished Doctoral
Dissertation, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA.
Shin, S.-J. (1995). The Logical Status of Diagrams, Cambridge, England: Cambridge University
Press.
Shneiderman, B. (1983). Direct manipulation: A step beyond programming languages. IEEE
Computer, 16(8), pp. 57-69.
Shneiderman, B. (1997). Designing the User Interface: Strategies for Effective Human-Computer
Interaction, Second Edition, Reading, MA: Addison-Wesley.
Sloman, A. (1975). Afterthoughts on analogical representations. Reprinted in R. J. Brachman and
H. J. Levesque (Eds.), Readings in Knowledge Representation, San Mateo, CA: Morgan
Kaufmann, 1985, pp. 432-439.
Smith, D. C. (1977). Pygmalion: A Computer Program to Model and Simulate Creative Thought,
Boston, MA: Birkhauser.
Smith, D. C., Cypher, A. & Spohrer, J. (1994). Kidsim: Programming agents without a
programming language. Communications of the ACM, 37, pp. 54-68.
Stasko, J. T., Badre, A. M. & Lewis, C. (1993). Do algorithm animations assist learning? An
empirical study and analysis. Proc. INTERCHI’93 Conference on Human Factors in
Computing Systems, New York: ACM Press, pp. 61-66.
Stenning, K., Cox, R. & Oberlander, J. (1995). Contrasting the cognitive effects of graphical and
sentential logic teaching: Reasoning, representation and individual differences. Language and
Cognitive Processes, 10(3/4), pp. 333-354.
Stenning, K. & Lemon, O. (1997). Diagrams and human reasoning: aligning logical and
psychological perspectives. Discussion paper prepared for the Thinking with Diagrams
Workshop, Portsmouth, England, available at
http://www.mrc-apu.cam.ac.uk/personal/alan.blackwell/Workshop.html.
Stenning, K. & Oberlander, J. (1995). A cognitive theory of graphical and linguistic reasoning:
Logic and implementation. Cognitive Science, 19, pp. 97-140.
Suwa, M. & Tversky, B. (1997). What do architects and students perceive in their design
sketches? A protocol analysis. Design Studies, 18(3), in press.
Tessler, S., Iwasaki, Y. & Law, K. (1995). Qualitative structural analysis using diagrammatic
reasoning. In J. Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic
Reasoning: Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and
Cambridge, MA: MIT Press, pp. 711-730.
Tufte, E. R. (1983). The Visual Display of Quantitative Information, Cheshire, CT: Graphics
Press.
Tufte, E. R. (1990). Envisioning Information, Cheshire, CT: Graphics Press.
Tufte, E. R. (1997). Visual Explanations, Cheshire, CT: Graphics Press.
Tversky, B. (1995). Cognitive origins of graphic productions. In F. T. Marchese (Ed.),
Understanding Images: Finding Meaning in Digital Imagery, New York: Springer-Verlag, pp.
29-53.
Tye, M. (1991). The Imagery Debate, Cambridge, MA: MIT Press.
van Dijk, T. A. & Kintsch, W. (1983). Strategies of Discourse Comprehension, New York:
Academic Press.
Van Sommers, P. (1984). Drawing and Cognition, Cambridge, England: Cambridge University
Press.
Vaxiviere, P. & Tombre, K. (1992). Celesstin: CAD conversion of mechanical drawings. IEEE
Computer, 25(7), pp. 46-54.
Wang, D. (1995). Studies on the Formal Semantics of Pictures. Doctoral Dissertation, Institute for
Logic, Language and Computation, University of Amsterdam.
Wang, D., Lee, J. R. & Zeevat, H. (1995). Reasoning with diagrammatic representations. In J.
Glasgow, N. H. Narayanan, & B. Chandrasekaran, (Eds.), Diagrammatic Reasoning:
Cognitive and Computational Perspectives, Menlo Park, CA: AAAI Press and Cambridge,
MA: MIT Press, pp. 339-396.
West, T. G. (1991). In the Mind’s Eye, Buffalo, NY: Prometheus Books.
Yip, K. M. (1991). Understanding complex dynamics by visual and symbolic reasoning. Artificial
Intelligence, 51, pp. 179-221.
Yuille, J. C. (Ed.) (1983). Imagery, Memory and Cognition: Essays in Honor of Allan Paivio,
Hillsdale, NJ: Lawrence Erlbaum.