Representation and Training of Vector Graphics with RAAM Networks

Mark Schaefer and Werner Dilger
Chemnitz University of Technology
Chemnitz, Germany
Email: [email protected], [email protected]
Abstract— Recursive auto-associative memory (RAAM) networks are neural networks that can be trained to represent
structured information. After training, this information can be
retrieved following its inner structure. So far, RAAM networks
have been applied only to syntactical expressions such as parse trees of
natural language sentences or logical terms. In this paper we
show how they can be used for representing vector graphics that
are given in a tree-like notation. For this purpose we developed
Named RAAM networks, which are more suitable for the training
of complex information than normal RAAMs.
I. INTRODUCTION
Recursive auto-associative memory (RAAM) networks were
introduced by J. B. Pollack (1988) for the purpose of learning
representations of structured objects whose size does not increase with
the size of the structures represented. RAAM networks have
typically been applied to syntactical structures such as parse trees of
natural language sentences or logical expressions (cf. Neumann 2001). Since the depth of the represented structures
can vary, it is necessary to distinguish between a composed
element and a single element. Different methods have been suggested for this, e.g. distinguishing between binary vectors
for elements and real-valued vectors for composed objects
(Plate 1997), introducing an extra bit to separate the two
(Niklasson 1999), or, as a more general approach, extending
the RAAM architecture with a label (Sperduti et al. 1995).
In this paper we show how an extension of RAAM
networks, the so-called Named RAAM (NRAAM) networks, can be used
to represent and learn graphical structures, in particular vector
graphics. These structures can be viewed as hierarchically
constructed, starting with the coordinates of points as basic
elements and proceeding with lines and circles up to complex
graphics composed of such elements. For this reason they can
be treated like parse trees or logical expressions. The name
tag is required to represent the different types of elements
that occur in vector graphics. Section 2 of this paper
introduces RAAM networks as defined by Pollack.
Section 3 describes the extension to NRAAM networks that
are used for the representation of vector graphics. Section
4 describes how this representation is done and how the
NRAAMs learn the graphical structures, and presents the results of
some training examples.
II. RAAM NETWORKS
A. Structure
Recursive auto-associative memory networks (RAAM) were
developed in 1988 by Pollack [1]. Figure 1 shows their basic
structure.
Fig. 1. RAAM basic structure
A RAAM is a three-layer feed-forward network. The input
and output layers consist of k slots each; the middle layer consists of only
one. Each slot has n positions (or bits) and is independent
of the other slots in the same layer. For short, this is denoted
as an n-k-RAAM network or just n-k-RAAM.
The input neurons together with the hidden neurons are
called the encoder; the hidden neurons together with the output neurons
are called the decoder. A RAAM is trained as an auto-associator,
i.e. the network should reproduce each input vector at the
output layer. When a training example has been learned
successfully, the activation of the hidden neurons can be used
as a representation of the input/output vector. The input vector
is encoded by the first layer of weights and the representation
is decoded by the second layer.
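The mapping from k input slots to one representation and back can be written down in a few lines. The following is a minimal sketch of an n-k-RAAM with tanh units and a plain auto-associative backpropagation step; it assumes numpy, and the class and variable names are ours, not the authors' code:

```python
import numpy as np

class RAAM:
    """Minimal n-k-RAAM sketch: k slots of n units each are compressed
    into one n-unit representation (encoder) and expanded back (decoder)."""

    def __init__(self, n, k, seed=0):
        rng = np.random.default_rng(seed)
        self.n, self.k = n, k
        self.W_enc = rng.normal(0, 0.1, (n, k * n))   # input slots -> hidden
        self.W_dec = rng.normal(0, 0.1, (k * n, n))   # hidden -> output slots

    def encode(self, slots):
        # slots: list of k vectors of length n -> one length-n representation
        x = np.concatenate(slots)
        return np.tanh(self.W_enc @ x)

    def decode(self, rep):
        # representation -> k reconstructed slots
        y = np.tanh(self.W_dec @ rep)
        return np.split(y, self.k)

    def train_step(self, slots, lr=0.1):
        # one auto-associative backpropagation step (no biases, for brevity)
        x = np.concatenate(slots)
        h = np.tanh(self.W_enc @ x)
        y = np.tanh(self.W_dec @ h)
        err = y - x                                    # reproduce the input at the output
        d_out = err * (1 - y ** 2)                     # tanh derivative
        d_hid = (self.W_dec.T @ d_out) * (1 - h ** 2)
        self.W_dec -= lr * np.outer(d_out, h)
        self.W_enc -= lr * np.outer(d_hid, x)
        return float((err ** 2).mean())
```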
B. Structured Representation
Once the representation of an input vector is known, it can
be used as part of another input vector, which is encoded into
its own representation, and so on. This is the essential feature
of RAAMs. It can be used for the subsymbolic representation of
trees and their holistic modification, i.e. the modification of the
whole RAAM network at once in order to obtain another RAAM
network which represents different information. It is important
to bear in mind that the representation of every node changes
during each training cycle, which affects dependent nodes.
Figure 2 shows an example of a tree-like syntactical
structure that can be represented by an n-2-RAAM network.
The tree is transformed into the training examples given in
table I (a name in parentheses denotes the vector representation
of that node).
Fig. 2. Example of a RAAM tree
TABLE I
TRAINING EXAMPLES

input vector    representation    output vector
(a b)           (A)               (a b)
(A c)           (B)               (A c)
(d e)           (C)               (d e)
(B C)           (D)               (B C)
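To make the recursive construction of table I concrete, the tree of fig. 2 can be encoded bottom-up, re-deriving the training pairs in every cycle because the representations of the inner nodes keep changing during training. This is a sketch building on the RAAM class shown earlier; the random leaf codes and all names are ours:

```python
import numpy as np

def training_examples(raam, leaves):
    # leaves: randomly chosen n-wide codes for a, b, c, d, e
    A = raam.encode([leaves["a"], leaves["b"]])
    B = raam.encode([A, leaves["c"]])
    C = raam.encode([leaves["d"], leaves["e"]])
    D = raam.encode([B, C])
    pairs = [[leaves["a"], leaves["b"]], [A, leaves["c"]],
             [leaves["d"], leaves["e"]], [B, C]]       # input == desired output
    return pairs, {"A": A, "B": B, "C": C, "D": D}

rng = np.random.default_rng(1)
raam = RAAM(n=8, k=2)                                  # class from the sketch above
leaves = {s: np.sign(rng.standard_normal(8)) for s in "abcde"}

for cycle in range(100):
    pairs, reps = training_examples(raam, leaves)      # targets move as training proceeds
    for slots in pairs:
        raam.train_step(slots)
```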
An 8-2-RAAM developed the representations of the
nodes shown in table II. The initial coding of the leaves was chosen
randomly. For better readability, vector representations are always
denoted as binary vectors, independent of the encoding used in
the implementation.
Once a RAAM has learned a tree, only the decoder part is
needed to unfold it completely. Starting with the representation
of the topmost node (D in the given example), which is
now used as input to the decoder network, all nodes are
reconstructed.
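A corresponding decoding sketch (again with hypothetical names, reusing the class above) makes the unfolding explicit. Note that nothing in a pure RAAM tells the procedure when to stop, which is exactly the problem discussed in the next subsection; here a fixed depth is used instead:

```python
def unfold(raam, rep, depth):
    # decode-only reconstruction of a learned tree, starting at the top node
    if depth == 0:
        return rep                     # treat as a leaf and stop
    left, right = raam.decode(rep)     # k = 2 slots per node in this example
    return (unfold(raam, left, depth - 1), unfold(raam, right, depth - 1))

# tree = unfold(raam, reps["D"], depth=3)   # D is the topmost node of the example
```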
C. The Reconstruction Problem

Unfortunately, the following problem occurs. Assume a trained decoder network is given, along with the representation
of the topmost node. Decoding this node yields the
corresponding child nodes, which may be decoded in turn, and
so on; however, it is not known which nodes are leaves and must not
be decoded further. Moreover, the type of the decoded data is also
unknown.

For this reason, pure RAAM networks are insufficient for
representing large structures containing different types of information, and an improved RAAM network is needed.

III. NRAAM NETWORKS

For the purpose of representing structured information, it is
necessary to distinguish between the different types of nodes
in the tree to be learned. Vector graphics in particular require many
element types, such as points, lines and circles.
Therefore a more powerful scheme is needed, one that can
deal with supplementary information in each node. Named
RAAM networks (NRAAM) have this property.

A. Structure and Training

The concept of NRAAM networks is an extension
of standard RAAM networks. As can be seen in fig. 3, each
slot is extended by a name, a vector that
characterizes the content of the accompanying information in
the slot.

Fig. 3. NRAAM basic structure

The difference to RAAM networks is the asymmetric connection of the representation layer to the other layers: the
representation of a name is set as part of the input vector of
a training example and can therefore be defined explicitly
for each node of a tree. The decoder part is fully connected, as
in RAAM networks. This architecture is denoted as m-n-k-NRAAM,
where m stands for the width of the name representation.

Although the training of NRAAM networks is somewhat more
involved due to the extended structure of the slots, they
are well suited for the representation of tree-like structures.
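The description above leaves some freedom in how exactly the name part is wired to the representation layer. The following sketch shows one plausible reading, which is our own assumption and not the authors' implementation: each slot carries an m-wide name in front of its n-wide content, the encoder computes only the content part of the representation, the name of the encoded node is clamped next to it, and the fully connected decoder restores all k slots. With this reading, a node's full representation (name plus content, width m+n) fits exactly into one slot of its parent:

```python
import numpy as np

class NRAAM:
    """One possible m-n-k-NRAAM layout: slot = name (m units) + content (n units)."""

    def __init__(self, m, n, k, seed=0):
        rng = np.random.default_rng(seed)
        self.m, self.n, self.k = m, n, k
        self.W_enc = rng.normal(0, 0.1, (n, k * (m + n)))       # k slots -> content part
        self.W_dec = rng.normal(0, 0.1, (k * (m + n), m + n))   # full representation -> k slots

    def encode(self, node_name, slots):
        # slots: k vectors of width m+n (each a name followed by a content vector)
        x = np.concatenate(slots)
        content = np.tanh(self.W_enc @ x)
        # the name of the encoded node is set explicitly, not computed by the encoder
        return np.concatenate([node_name, content])

    def decode(self, rep):
        # the decoder is fully connected, as in a plain RAAM
        y = np.tanh(self.W_dec @ rep)
        return np.split(y, self.k)
```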
IV. TRAINING OF VECTOR GRAPHICS
TABLE II
LEARNED REPRESENTATIONS

node name    representation
a            10011100
b            11000101
c            01011101
d            11110110
e            10101110
A            00101101
B            10101111
C            01011110
D            00001110
In this section the suitability of NRAAMs for representing
complex structured information is demonstrated. The notation
of a vector graphic as it is used here is introduced, followed by
an example. The training process is described and the results
for some example graphics are presented.
A. Notation
For the notation of vector graphic trees, the structures and
corresponding symbols of figure 4 are used.

Fig. 4. Notation of vector graphics

• structure element: A general node of an NRAAM tree which may be used to combine any kind of subtrees.
• start element: The topmost element of an NRAAM tree.
• stop element: An element which must not be unfolded like other leaf elements; it is used to fill empty positions.
• coordinate: A single coordinate of a point.
• positive number: Although coordinates are assumed to be positive numbers as well, the two are distinguished because they fill different roles.
• point by coordinates: A point combined from its coordinates.
• point by line and ratio: A point given by a line and the ratio between its distance from one end point of the line and the length of the line. The ratio is given in units of 2^-n, so its accuracy depends on the width n of the representation vector used. This structure is useful to describe points which should be attached to a line (a small worked sketch follows this list).
• point by two lines: A point defined by the intersection of two lines.
• line by points: A line defined by its start and end point.
• circle by center and radius: The left child is the center and the right one the radius of the circle.
• circle by two points: The left child is the center and the right one a point on the circle.
• circle by three points: A circle defined by three points which must not lie on a line. The structure element is used to group the three elements.
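As announced in the list above, here is a small worked sketch of how a point by line and ratio could be resolved into coordinates under the 2^-n convention; the function name, the choice to measure from the line's start point, and the example numbers are ours:

```python
def resolve_point_by_line_and_ratio(start, end, ratio_int, n_bits):
    # the stored integer ratio is interpreted in units of 2**-n_bits,
    # measured here from the line's start point towards its end point
    t = ratio_int / 2 ** n_bits
    return (start[0] + t * (end[0] - start[0]),
            start[1] + t * (end[1] - start[1]))

# ratio 8 with 4-bit slots is 8/16 = 0.5, i.e. the midpoint of the line
print(resolve_point_by_line_and_ratio((0.0, 0.0), (10.0, 6.0), 8, 4))   # (5.0, 3.0)
```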
B. The Vector Graphic Tree
The vector graphic tree is built up in the following way:
each element that should appear in the graphic is linked into
a linear list and split, from the top down, into its components. To
simplify the notation, the structure of each element is defined
only once; wherever the element is needed again, it is noted without its inner
structure. Bear in mind that, from a logical point of view, an
element appears completely in the NRAAM tree every time it is referenced,
and that an element may be used for training several times
during each training cycle, once for every appearance.

Fig. 5. Example graphic

Fig. 6. Example graphic tree
This is illustrated in figures 5 and 6. The three points A,
B and C can be noted in only one way. For D there are two
possibilities: one can define it by its coordinates or by the
line a and the ratio 8 (which, for 4-bit wide slots, means 8/16 of the length of a). The lines a, b and c are
defined by their start and end points. The circle K can be defined
in three ways, with D always being the center: B or C can serve as a
point on the circle, or the radius can be given directly. Notice
that each point is used twice or even three times, but
defined only once. The NRAAM tree itself is noted as a linear list
which contains the visible elements a, b, c and K.
The resulting tree can be used directly as a set of training
examples as it was shown in tab. I.
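To make the construction tangible, the following sketch writes the example graphic down as a nested tree and counts one training example per appearance of each inner node. The notation and the coordinates are our own illustration, not the authors' data format:

```python
# nodes are tuples (element name, children...); numbers are leaf values
A = ("point_by_coordinates", ("coordinate", 2), ("coordinate", 2))
B = ("point_by_coordinates", ("coordinate", 12), ("coordinate", 2))
C = ("point_by_coordinates", ("coordinate", 7), ("coordinate", 10))
a = ("line_by_points", A, B)
b = ("line_by_points", B, C)
c = ("line_by_points", C, A)
D = ("point_by_line_and_ratio", a, ("positive_number", 8))   # ratio 8/16: midpoint of a
K = ("circle_by_two_points", D, B)                           # center D, B on the circle

# visible elements chained in a linear list, terminated by a stop element
graphic = ("start", a, ("structure", b, ("structure", c, ("structure", K, ("stop",)))))

def training_nodes(node, out=None):
    """Collect every inner node once per appearance: one training example each."""
    out = [] if out is None else out
    children = [child for child in node[1:] if isinstance(child, tuple)]
    if children:                      # leaves (coordinates, numbers, stop) are not decoded
        out.append(node)
    for child in children:
        training_nodes(child, out)
    return out

print(len(training_nodes(graphic)))   # reused points are counted once per appearance
```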
C. Training Specifications
For the representation of the vector graphics, a bipolar encoding is used. The networks were trained with backpropagation,
using tanh as the activation function. When decoding a
trained tree, only the sign of each activation in the output vector
is used to recover the learned information; because of this, only a
few training cycles are needed for encoding.
In the following examples, visible elements are drawn in
black and invisible ones in grey.
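A minimal sketch of these conventions, reusing the RAAM class sketched in section II (function names, and the way weight decay is applied, are our assumptions), could look as follows:

```python
import numpy as np

def train(raam, examples, cycles=20, lr=0.1, weight_decay=0.1):
    # backpropagation over all examples per cycle, with a simple weight-decay shrink
    for _ in range(cycles):
        for slots in examples:
            raam.train_step(slots, lr=lr)
            raam.W_enc *= (1.0 - lr * weight_decay)
            raam.W_dec *= (1.0 - lr * weight_decay)

def read_out(raam, rep):
    # only the sign of each output activation is used after training (bipolar codes)
    return [np.sign(slot) for slot in raam.decode(rep)]
```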
[Result plots omitted: six decoded example graphics drawn on a 32 by 32 coordinate grid (axes x1 and x2). The annotated network structures and training parameters are: structure 5-5-2, weight decay 0.1, training rate 0.1, 10 cycles; structure 2-5-2, weight decay 0.1, training rate 0.1, 20 cycles; structure 7-24-2, weight decay 0.0001, training rate 0.0001, 2000 cycles; structure 5-6-2, weight decay 0.1, training rate 0.1, 20 cycles; and two graphics with structure 5-5-2, weight decay 0.1, training rate 0.1, 20 cycles.]
V. CONCLUSION
We have shown in this paper how graphical structures can
be represented by an augmented form of RAAM networks and
how the network can be trained. The extension by a special
name tag is well suited for this purpose, as the experiments
show. In future work we will try to learn transformations
of such structures into equivalent ones obtained by shifting or rotation,
as was done by Pollack and others for textual structures.
In this way we hope that it will be possible to detect similar
structures in pictures and to classify them.
REFERENCES
[1] Pollack, J. B. (1988): Recursive auto-associative memory: Devising compositional distributed representations. In Proceedings of the 10th Annual Conference of the Cognitive Science Society, pages 33-39, Hillsdale, NJ: Erlbaum.
[2] Plate, T. (1997): A common framework for distributed representation schemes for compositional structure. In Maire, F., Hayward, R. and Diederich, J., editors, Connectionist Systems for Knowledge Representation and Deduction, pages 15-34.
[3] Niklasson, L. (1999): Extended encoding/decoding of embedded structures using connectionist networks. In Proceedings of the 9th International Conference on Artificial Neural Networks (ICANN99), Edinburgh, Scotland.
[4] Neumann, J. (2001): Holistic Processing of Hierarchical Structures in Connectionist Networks. PhD thesis, University of Edinburgh.
[5] Rumelhart, D. E. and McClelland, J. L. (1986): On learning the past tenses of English verbs. In Rumelhart, D. E. and McClelland, J. L., editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 2: Psychological and Biological Models, pages 216-271. MIT Press, Cambridge, MA.
[6] Sperduti, A. (1994): Labeling RAAM. Connection Science, 6(4):429-459.