Representation and Training of Vector Graphics with RAAM Networks

Mark Schaefer and Werner Dilger
Chemnitz University of Technology
Chemnitz, Germany
Email: [email protected], [email protected]

Abstract— Recursive auto-associative memory (RAAM) networks are neural networks that can be trained to represent structured information. After training, this information can be retrieved following its inner structure. So far, RAAM networks have been applied only to syntactic expressions such as parse trees of natural language sentences or logical terms. In this paper it is shown how they can be used for representing vector graphics given in a tree-like notation. For this purpose we developed Named RAAM networks, which are better suited to the training of complex information than standard RAAMs.

I. INTRODUCTION

Recursive auto-associative memory (RAAM) networks were introduced by J. B. Pollack (1988) for the purpose of learning representations of structured objects that do not grow with the size of the structures represented. RAAM networks have typically been applied to syntactic structures such as parse trees of natural language sentences or logical expressions (cf. Neumann 2001). Since the depth of the represented structures can vary, it is necessary to distinguish between a composed element and a single element. Different methods have been suggested for this, e.g. distinguishing between binary vectors for elements and real-valued vectors for composed objects (Plate 1997), introducing an extra bit to separate the two (Niklasson 1999), or, as a more general approach, extending the RAAM architecture with a label (Sperduti 1994).

In this paper we show how an extension of RAAM networks, the so-called Named RAAM networks, can be used to represent and learn graphical structures, in particular vector graphics. These structures can be viewed as hierarchically constructed, starting with the coordinates of points as basic elements and proceeding with lines and circles up to complex graphics composed of such elements. For this reason they can be treated like parse trees or logical expressions. The name tag is required to represent the different types of elements that occur in vector graphics.

In section 2 of this paper the RAAM networks as defined by Pollack are introduced. Section 3 describes the extension to NRAAM networks that are used for the representation of vector graphics. Section 4 describes how this representation is done and how the NRAAMs learn the graphical structures, and presents the results of some training examples.

II. RAAM NETWORKS

A. Structure

Recursive auto-associative memory networks (RAAM) were developed in 1988 by Pollack [1]. Figure 1 shows their basic structure.

[Fig. 1. RAAM basic structure: k input slots of n positions each feed through the encoder into the representation layer, which the decoder expands back into k output slots]

A RAAM is a three-layer feed-forward network. The input and output layers each consist of k slots, the middle layer of only one. Each slot has n positions (or bits) and is independent of the other slots in the same layer. For short, this is denoted as an n-k-RAAM network or just n-k-RAAM. The input neurons together with the hidden neurons are called the encoder, and the hidden neurons together with the output neurons are called the decoder. A RAAM is trained as an auto-associator, i.e. the network should reproduce each input vector at the output layer.
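The following minimal sketch illustrates this training scheme. It is not the authors' implementation: the numpy realization and all names are our assumptions, and the tanh activation anticipates the activation function that section IV-C reports for the experiments.

```python
import numpy as np

# Minimal sketch of an n-k-RAAM trained as an auto-associator.
class RAAM:
    def __init__(self, n, k, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.n, self.k, self.lr = n, k, lr
        self.W_enc = rng.normal(0, 0.1, (n, k * n))      # encoder weights
        self.W_dec = rng.normal(0, 0.1, (k * n, n))      # decoder weights

    def encode(self, slots):
        # k slot vectors of width n -> one representation of width n
        return np.tanh(self.W_enc @ np.concatenate(slots))

    def decode(self, rep):
        # representation of width n -> k reconstructed slot vectors
        return np.split(np.tanh(self.W_dec @ rep), self.k)

    def train_step(self, slots):
        # one backpropagation step on the auto-association error y - x
        x = np.concatenate(slots)
        h = np.tanh(self.W_enc @ x)
        y = np.tanh(self.W_dec @ h)
        err = y - x                                      # reproduce the input
        d_y = err * (1 - y ** 2)                         # tanh derivative
        d_h = (self.W_dec.T @ d_y) * (1 - h ** 2)
        self.W_dec -= self.lr * np.outer(d_y, h)
        self.W_enc -= self.lr * np.outer(d_h, x)
        return float((err ** 2).mean())

# usage: train on one example until it is reproduced, then keep the
# hidden activation as its representation
net = RAAM(n=8, k=2)
a, b = np.sign(np.random.default_rng(1).normal(size=(2, 8)))
for _ in range(500):
    net.train_step([a, b])
rep_A = net.encode([a, b])       # the representation (A) of the pair (a b)
```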
When a training example has been learned successfully, one can use the activation of the hidden neurons as a representation of the input/output vector. The input vector is encoded by the first layer of weights and the representation is decoded by the second layer.

B. Structured Representation

Once the representation of an input vector is known, it can be used as part of another input vector, which is encoded into a representation of its own, and so on. This is the essential feature of RAAMs. It can be used for the subsymbolic representation of trees and for their holistic modification, i.e. the modification of the whole RAAM network at once in order to obtain another RAAM network that represents different information. It is important to bear in mind that the representation of every node changes during each training cycle, which affects dependent nodes.

Figure 2 shows an example of a tree-like syntactic structure that can be represented by an n-2-RAAM network. The tree is transformed into the training examples given in table I (a name in parentheses denotes the vector representation of that node).

[Fig. 2. Example of a RAAM tree: the root D has the children B and C; B has the children A and c, A the leaves a and b, and C the leaves d and e]

TABLE I
TRAINING EXAMPLES

  input vector   representation   output vector
  (a b)          (A)              (a b)
  (A c)          (B)              (A c)
  (d e)          (C)              (d e)
  (B C)          (D)              (B C)

An 8-2-RAAM has developed the representations of the nodes shown in table II. The initial coding of the leaves was chosen randomly. For better readability, vector representations are always denoted as binary vectors, independent of the encoding used in the implementation.

TABLE II
LEARNED REPRESENTATIONS

  node name   representation
  a           10011100
  b           11000101
  c           01011101
  d           11110110
  e           10101110
  A           00101101
  B           10101111
  C           01011110
  D           00001110

Once a RAAM has learned a tree, only the decoder part is needed to unfold it completely. Starting with the representation of the topmost node (D in the given example), which is used as input to the decoder network, all nodes are reconstructed.

C. The Reconstruction Problem

Unfortunately the following problem occurs. Assume a concrete decoder network is given, along with the representation of the topmost node. After decoding this node one obtains the corresponding child nodes, which may be decoded as well, and so on; but it is unknown which node is a leaf and must not be decoded. Furthermore, the type of the decoded data is also unknown. For this reason, pure RAAM networks are insufficient for representing large structures with different types of information, and an improved RAAM network is needed.

III. NRAAM NETWORKS

For the purpose of representing structured information it is necessary to distinguish between the different types of nodes in the tree to be learned. Vector graphics in particular need many kinds of elements, such as points, lines, circles, etc. Therefore a more powerful scheme is needed, one that can deal with supplementary information in each node. Named RAAM networks (NRAAM) have this property.

A. Structure and Training

The concept of Named RAAM networks is an extension of standard RAAM networks. As one can see from fig. 3, each slot has been extended by a name, which is a vector that characterizes the content of the accompanying information in the slot.

[Fig. 3. NRAAM basic structure: each input and output slot carries a name vector in addition to its n content positions]

The difference to RAAM networks is the asymmetric connection of the representation layer to the other layers. The representation of a name is set as part of the input vector of the training example; therefore it can be defined explicitly for each node of a tree. The decoder part is fully connected, as it is in RAAM networks.
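A possible realization of this slot layout, sketched on top of the RAAM class above, is given below. It is an assumption of ours, not the paper's code: each of the k slots carries an m-wide name vector next to its n-wide content, so input and output are k*(m+n) wide while the representation layer stays n wide, which reflects the asymmetry just described.

```python
import numpy as np

# Sketch of an NRAAM on top of the RAAM class above (illustrative only).
class NRAAM(RAAM):
    def __init__(self, m, n, k, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.m, self.n, self.k, self.lr = m, n, k, lr
        self.W_enc = rng.normal(0, 0.1, (n, k * (m + n)))
        self.W_dec = rng.normal(0, 0.1, (k * (m + n), n))

    def _flatten(self, named_slots):
        # named_slots: k pairs (name vector, content vector); the name part
        # is set explicitly for each node of the training example
        return [np.concatenate(pair) for pair in named_slots]

    def encode(self, named_slots):
        return super().encode(self._flatten(named_slots))

    def decode(self, rep):
        # reconstruct k (name, content) pairs from a representation
        return [(s[:self.m], s[self.m:]) for s in super().decode(rep)]

    def train_step(self, named_slots):
        return super().train_step(self._flatten(named_slots))
```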
Such a network is denoted as an m-n-k-NRAAM, where m stands for the width of the representation of a name. Although the training of NRAAM networks is a bit more sophisticated due to the extended structure of the slots, they are well suited for the representation of tree-like structures.

IV. TRAINING OF VECTOR GRAPHICS

In this section the suitability of NRAAMs for representing complex structured information is demonstrated. First the notation of vector graphics as it is used here is introduced, followed by an example. Then the training process is described and the results for some example graphics are presented.

A. Notation

For the notation of vector graphic trees the structures and corresponding symbols of figure 4 are used; figure 5 shows the example graphic discussed below.

[Fig. 4. Notation of vector graphics: symbols for the start element, structure element, stop element, coordinate, positive number, point by coordinates, point by line and ratio, point by two lines, line by points, circle by center and radius, circle by two points, and circle by three points]

[Fig. 5. Example graphic: the points A, B, C and D, the lines a, b and c, and the circle K, drawn in the (x1, x2) plane]

• structure element: A general node of an NRAAM tree which may be used to combine any kind of subtrees.
• start element: The topmost element of an NRAAM tree.
• stop element: An element which, like other leaf elements, must not be unfolded; it is used to fill empty positions.
• coordinate: A single coordinate of a point.
• positive number: Although coordinates are assumed to be positive numbers as well, the two are distinguished because they fill different roles.
• point by coordinates: A point combined from its coordinates.
• point by line and ratio: A point given by a line and the ratio between its distance from an end point of the line and the length of the line. The ratio is given in units of 2^-n, so its accuracy depends on the width n of the representation vector used. This structure is useful for describing points which should be attached to a line.
• point by two lines: A point defined by the intersection of two lines.
• line by points: A line defined by its start and end points.
• circle by center and radius: The left child is the center and the right one the radius of the circle.
• circle by two points: A circle given by its center and a point on the circle.
• circle by three points: A circle defined by three points which must not lie on a line; a structure element is used to group the three points.

B. The Vector Graphic Tree

The vector graphic tree is built up in the following way: each element which should appear in the graphic is linked into a linear list and split, from the top down, into its components. In order to simplify the notation, the structure of each element is defined once; wherever it is needed again, only the element itself, without its inner structure, is noted. Bear in mind that from a logical point of view an element always appears completely in the NRAAM tree, and that an element may be used for training several times during each training cycle, once for every appearance. This is illustrated in figures 5 and 6, and a small sketch of the construction is given after figure 6. The three points A, B and C can be noted in only one way. For D there are two possibilities: one can define it by its coordinates or by the line a and the ratio 8 (in units of 2^-4 for 4-bit-wide slots). The lines a, b and c are defined by their start and end points. The circle K can be defined in three ways, always with D as the center: B or C can be given as a point on the circle, or the radius can be given directly.

[Fig. 6. Example graphic tree: the linear list of the visible elements a, b, c and K, decomposed into points, coordinates and numbers]
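The sketch below makes the construction concrete as a nested, name-tagged tree. The element names, the tuple encoding and the concrete coordinate values are our illustrative assumptions, not the paper's data; what it demonstrates is that shared elements are defined once as values but contribute one training example per appearance.

```python
# The example graphic as a nested, name-tagged tree (illustrative values).
A = ("point_by_coordinates", ("coordinate", 4), ("coordinate", 10))
B = ("point_by_coordinates", ("coordinate", 14), ("coordinate", 4))
C = ("point_by_coordinates", ("coordinate", 10), ("coordinate", 15))
a = ("line_by_points", A, B)
b = ("line_by_points", B, C)
c = ("line_by_points", C, A)
D = ("point_by_line_and_ratio", a, ("positive_number", 8))  # 8 * 2^-4 = 1/2
K = ("circle_by_two_points", D, B)
# the linear list of visible elements, closed by a stop element
graphic = ("start", a, ("struct", b, ("struct", c, ("struct", K, ("stop",)))))

def collect(node, out):
    """Emit one training example (name, child subtrees) per composed node;
    shared elements such as line a are emitted once per appearance."""
    name, *children = node
    subtrees = [ch for ch in children if isinstance(ch, tuple)]
    if subtrees:
        out.append((name, subtrees))
        for ch in subtrees:
            collect(ch, out)
    return out

examples = collect(graphic, [])
```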
Notice that each point is used two or even three times, but defined only once. The NRAAM tree itself is noted as a linear list which contains the visible elements a, b, c and K. The resulting tree can be used directly as a set of training examples, as shown in table I.

C. Training Specifications

For the representation of the vector graphics a bipolar encoding is used. Network training was performed by backpropagation with tanh as the activation function. During the decoding of a trained tree, only the sign of the activation of the output vector was used to retrieve the learned information. Because of this, only a few training cycles were needed for encoding. In the following examples visible elements are drawn in black and invisible ones in grey.

[Figure: decoded example graphics of six trained NRAAMs, each plotted over x1 and x2 in the range 0 to 32. Network parameters per panel: structure 5-5-2, weight decay 0.1, training rate 0.1, 10 cycles; structure 2-5-2, weight decay 0.1, training rate 0.1, 20 cycles; structure 7-24-2, weight decay 0.0001, training rate 0.0001, 2000 cycles; structure 5-6-2, weight decay 0.1, training rate 0.1, 20 cycles; and twice structure 5-5-2, weight decay 0.1, training rate 0.1, 20 cycles]
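The following lines sketch this coding and decoding convention; the helper names are ours, since the paper specifies only the scheme itself.

```python
import numpy as np

# Bipolar coding and sign-based decoding as described above (illustrative).
def to_bipolar(bits):
    # binary string -> bipolar vector in {-1, +1}
    return np.array([1.0 if ch == "1" else -1.0 for ch in bits])

def from_activation(act):
    # only the sign of each tanh output is used, so even imperfectly
    # saturated activations decode to the intended bit string
    return "".join("1" if v > 0.0 else "0" for v in act)

code_D = to_bipolar("00001110")      # representation of node D (table II)
act = np.tanh(1.5 * code_D)          # activations of a roughly trained net
print(from_activation(act))          # prints 00001110
```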
V. CONCLUSION

We have shown how graphical structures can be represented by an augmented form of RAAM networks and how the network can be trained. The extension by a special name tag is well suited for this purpose, as the experiments show. In future work we will try to learn transformations of such structures into equivalent ones by shifting or rotation, as was done by Pollack and others for textual structures. In this way we hope that it will become possible to detect similar structures in pictures and to classify them.

REFERENCES

[1] Pollack, J. B. (1988): Recursive auto-associative memory: Devising compositional distributed representations. In Proceedings of the 10th Annual Conference of the Cognitive Science Society, pages 33-39, Hillsdale, NJ. Erlbaum.
[2] Plate, T. (1997): A common framework for distributed representation schemes for compositional structure. In Maire, F., Hayward, R. and Diederich, J., editors, Connectionist Systems for Knowledge Representation and Deduction, pages 15-34.
[3] Niklasson, L. (1999): Extended encoding/decoding of embedded structures using connectionist networks. In Proceedings of the 9th International Conference on Artificial Neural Networks (ICANN99), Edinburgh, Scotland.
[4] Neumann, J. (2001): Holistic Processing of Hierarchical Structures in Connectionist Networks. PhD thesis, University of Edinburgh.
[5] Rumelhart, D. E. and McClelland, J. L. (1986): On learning the past tenses of English verbs. In Rumelhart, D. E. and McClelland, J. L., editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 2: Psychological and Biological Models, pages 216-271. MIT Press, Cambridge, MA.
[6] Sperduti, A. (1994): Labeling RAAM. Connection Science, 6(4):429-459.