Evolutionary Strategy for NN-controlled Variable Geometry Truss
using a “Follow the Leader” strategy.
By Ira Dunn
November 25, 2002
Computer Science 447
Advanced Topics in Artificial Intelligence
Keywords: variable geometry truss, evolutionary strategies, robot manipulator
Abstract
The Variable Geometry Truss, or VGT, has been used in many recent papers to
simulate life-like motion in robot appendages. A Neural Network approach to VGT
training and learning has been successfully implemented as well. It is possible to
combine both strategies to create a hybrid VGT with neural activity and evolvability.
This particular manipulator will implement a “Follow the Leader” strategy for path
planning and obstacle avoidance. This VGT configuration will be trainable on a given
workspace, executable in real time, and adaptable to changing environments via
evolution.
Serpentine Manipulator arm, courtesy of NASA:
http://ranier.hq.nasa.gov/Telerobotics_Page/Photos/SerpentineManipulator.jpg
Introduction
The field of robotics is an exploration of moving and non-moving mechanical parts. Like
the bones in the human arm, a robot’s arm must have a solid structural frame. This
framework is used in industry applications to support an end-effector, such as a welding
gun or a pick-and-place machine. The frame is moved via hydraulics or electromechanical servo motors, which are directly controlled by a central computer. The
computer tells the robot where to place the end-effector, from what angle it is to approach
its goal, and what action it is to take (weld or not-weld, etc).
A new strategy has been suggested [7], where the structure itself acts as its own
manipulator. This has been called a Variable Geometry Truss, or VGT. The VGT
contains both static structural members that have fixed lengths and dynamic members
with computer-variable lengths. The process of motion control is then the process of
selecting the lengths of every dynamic VGT member throughout the robot arm’s
trajectory. For the purposes of these applications, the length control is left to the local,
on-board control schema for the actuator.
The VGT has appeared in a number of forms recently [3] [5] [6] [1], though mostly the
structure has been implemented as a sort of robot manipulator arm. The structure is
typically a serial chain of smaller VGT units. Figure 1 below shows a binary
configuration, where the actuated linkages can be in one of two states (in this case, open
or closed). From [2], complex joints can be constructed which allow the manipulators to
perform similarly to a free-rotating ball-joint model. Other simplifications can be made in
the actual design, bringing the physical device closer to the simulated VGT.
Figure 1: From [3], a 3 Degree of Freedom binary actuated VGT
Problem
The central problem with the neural network (NN) based approach to robotics control is
that of the time- and processor-intensive training: i.e. the initialization costs are quite
high. Developing a complex architecture, followed by training and evaluating the
network has very high computing costs associated with it. Real-time application of the
neural network is typically fast enough once it has been adequately trained. For a VGT
with n units each with k actuators (that is, k degrees of freedom) per unit, the joint space
is the set of d variables (typically in Cartesian or polar coordinates) that describe the
positioning of the VGT assembly and is represented as the transformation from the R^k
actuator-space to the R^d vector-space for the entire n-unit configuration. A single
layer neural network can perform the forward calculations in n*k^2*d multiplies.
Evolutionary Strategies (ES) have been applied to the joint and actuator-spaces of VGT’s
[4], but they, too, incur high computational costs in generating tables of goal-actuator
relations [5], end-effector densities [3], or other inverse kinematic tables. End-effector
densities are the hyper-dimensional functions describing both the existence of solutions
(i.e. configurations that reach a goal point) for a given VGT system and from what
vectors the goal can be reached. The principal concern for the evolutionary VGT training
is that of simulating the system’s kinematics in either hardware or software. The design
must either be experimentally tested or kinematically solved by pre-implementation
processing. This is the same hurdle posed by neural network implementation.
An example of this is seen in Figure 2, taken from [5]. Since the search space is reduced
to a single plane, and then revolved about the center axis, the problem is being simplified
by exploitation of the tank symmetry. The figure shows how the evolved vectors were in
the x-z plane, and thus only required evolution in two coordinates. The major drawback
here is that errors are introduced that cannot be overcome via evolution, as seen in Figure
3. The control resolution of the arm rotation unit leaves a region between evolved
planes that cannot be reached.
Therefore, our goal is to develop a strategy that doesn’t “cheat” by exploiting symmetry.
Rather, it will learn small regions in space similarly to [5], but it will continue
evolving to gain the entire continuous space. This will be achieved by “aiming” the
unsuccessful VGT end-effector at the goal and having each successive VGT “Follow the
Leader.”
Figure 2: From [5], learned space for serpentine VGT manipulator arm, marked as red grid plane.
Figure 3: Example of error from simplification (inaccessible point at distance d).
Solution
The solutions to these problems lie in combining the flexibility of a single, small-scale
neural network and the expandability of evolutionary strategies. A neural network
architecture is developed for transforming the goal vector of the end-effector of the VGT
to the manipulator states. Next, the spline curve for the proposed path of the manipulator
is found, yielding the coordinates of the end effector as it moves tangentially along the
path. This path planning is typically done in preprocessing, but in our case, only a new
goal point is needed (not the entire path) since we are assuming an initial, extended
position. The manipulator arm VGT’s follow the time-varying coordinate set for each
actuator as it is propagated from the source.
The evolutionary motion is applied when the arm is extended and transverse motion
(non-“follow the leader”) must be made. For the purposes of this experiment, we assume
that the VGT manipulator is fully extended and that any obstacles have already been
avoided. Thus, we need only create a series of manipulations that move the arm from one
location to another. Our design is simplified by the following assumptions: the truss
performs no transverse motion, the manipulators are simplified to unit-length vectors tied
end-to-end, and the evolution is performed over few (three) variables. This leads to the
concept of the “Follow the Leader” VGT configuration.
Figure 6: Final VGT configuration after growth (re-extension)
Follow the Leader is demonstrated in Figures 4 through 6. In Figure 4, the VGT is
extended to a certain location in space. By withdrawing two of the “links” into the origin
as seen in Figure 5 and configuring the remaining VGT lengths to similar deflections as
the lower three had been, the manipulator appears to have been clipped at the end. Then,
by pushing new links into the VGT from the origin in Figure 6, the system appears to “grow”
towards the goal. This configuration needs only two paths to be clear of obstacles and
requires new VGT deflections to be calculated only for the end manipulator (since
“follower” units simply take on the existing coordinate values). The following
description illustrates how the program evolves the VGT.
Figure 4: Follow the Leader initial position and end-effector goal
Figure 5: Intermediate VGT position after deletion (retraction)
INITIALIZATION
An initial position is given to the VGT program via a parameter file. This initial position
is constant for the series of runs performed in this experiment.
DELETION
VGT deletes are performed by shortening the list of VGT configuration vectors. That is,
since the units take on the characteristic vectors of the lower units upon a delete command
(the lowest, or origin, unit being eliminated), the program memory vector is simply resized
to hold fewer components.
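The deletion step can be sketched as a list truncation. This is only an illustration under assumptions: the function name `delete_units` is hypothetical, units are stored as (x, y) tuples ordered from the origin outward, and dropping entries from the tail of the list is one reading of the description above (every unit inherits the vector of the unit below it, so the stored list loses its last entry).

```python
def delete_units(vectors, n_deletes):
    """Return a copy of the unit-vector list after retracting n_deletes units.

    Assumption: because each unit takes on the vector of the unit below it,
    a delete is modeled as dropping the last stored vector.
    """
    if n_deletes < 0:
        raise ValueError("n_deletes must be non-negative")
    keep = max(len(vectors) - n_deletes, 0)
    return list(vectors[:keep])


# The "wound up" square used later in the experiments.
square = [(1, 0), (0, 1), (-1, 0), (0, -1)]
retracted = delete_units(square, 2)
```

Returning a copy rather than resizing in place keeps the original configuration available for other population members.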
PATH PLANNING
Two types of path planning were implemented. The first was a deterministic method
based on the maximum VGT deflection angle φ_max. The angle θ between the end effector
vector and the error vector (Goal - VGT) was found from their dot product, and then passed
through a hyperbolic tangent transfer function to find the next VGT unit’s relative
deflection angle φ. The end result is that the next growth of the VGT “aims” towards the
goal.
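A minimal sketch of the deterministic aiming step follows. The exact transfer function is not given in the text, so the form phi = phi_max * tanh(factor * theta) is an assumption, as are the function name `aim_deflection` and the use of the 2-D cross product to decide whether to turn left or right (the dot product alone gives only the magnitude of the angle).

```python
import math


def aim_deflection(end_vec, tip, goal, phi_max, factor):
    """Relative deflection angle (radians) for the next unit.

    theta is the angle between the end-effector vector and the error
    vector (goal - tip), taken from their dot product. The tanh form and
    the sign convention are assumptions made for this sketch.
    """
    ex, ey = goal[0] - tip[0], goal[1] - tip[1]
    err_len = math.hypot(ex, ey)
    end_len = math.hypot(end_vec[0], end_vec[1])
    if err_len == 0.0 or end_len == 0.0:
        return 0.0
    cos_theta = (end_vec[0] * ex + end_vec[1] * ey) / (end_len * err_len)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    # Positive cross product: the goal lies counter-clockwise of the end vector.
    cross = end_vec[0] * ey - end_vec[1] * ex
    sign = 1.0 if cross >= 0.0 else -1.0
    return sign * phi_max * math.tanh(factor * theta)


# Already pointing straight at the goal: no deflection is needed.
straight = aim_deflection((0, 1), (1, 1), (1, 8), math.pi / 4, 1.0)
# Goal is 90 degrees to the left: deflection is positive but below phi_max.
left = aim_deflection((1, 0), (0, 0), (0, 5), math.pi / 4, 1.0)
```

Because tanh saturates at 1, the returned angle never exceeds phi_max in magnitude, which matches the role of the maximum deflection angle described above.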
The second path planning method was performed via a third evolutionary variable. The
change in error before and after a growth function was factored into a normally distributed
random variable for φ with mean of 0 and standard deviation of difference*sigma_factor,
where sigma_factor was the evolutionary variable. For a decrease in error, the vector was
pointed in the correct direction and thus no change was made. This method is much
more random and requires substantially more evolution.
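The stochastic variant might look like the sketch below. It is an illustration only: the function name `stochastic_deflection` is hypothetical, "difference" is taken to be the error after growth minus the error before, and the "no change" case is read as applying whenever the error decreased.

```python
import random


def stochastic_deflection(err_before, err_after, sigma_factor, rng=random):
    """Draw a deflection angle phi from N(0, difference * sigma_factor).

    Assumption: if the last growth reduced the error, the unit is already
    pointed the right way and the deflection is left unchanged (0.0).
    """
    difference = err_after - err_before
    if difference <= 0.0:
        return 0.0
    return rng.gauss(0.0, difference * sigma_factor)


rng = random.Random(0)  # seeded only so the sketch is repeatable
unchanged = stochastic_deflection(5.0, 4.0, 0.5, rng)
perturbed = stochastic_deflection(4.0, 5.0, 0.5, rng)
```

A larger sigma_factor widens the distribution, so a growth step that made things worse is followed by a more aggressive random correction.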
GROWTH
In order to keep the units relative in dimension, the new end effector vector was found
by multiplying the rotation matrix
[cos φ  -sin φ;
 sin φ   cos φ]
by the existing end vector. This matrix is only valid in two dimensions and would need
to be re-evaluated for higher dimensions. After finding the new vector, it is “pushed”
onto the VGT vector in memory, simulating the growth.
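The growth step can be sketched as a 2-D rotation of the current end vector followed by an append. The function name `grow` and the tuple representation are assumptions for illustration; the rotation itself is the standard counter-clockwise rotation by the deflection angle phi.

```python
import math


def grow(vectors, phi):
    """Return a new list with one unit appended, rotated by phi (radians)
    relative to the current end vector. Rotation preserves length, so a
    unit-length end vector yields a unit-length new vector."""
    x, y = vectors[-1]
    c, s = math.cos(phi), math.sin(phi)
    new_vec = (c * x - s * y, s * x + c * y)
    return list(vectors) + [new_vec]


arm = grow([(1.0, 0.0)], math.pi / 2)  # turn 90 degrees to the left
```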
At this point, it is good to point out that this analysis is designed assuming discrete
modeling in a rough sense. That is, no consideration is given to partial or intermediate
VGT unit steps. As the VGT is retracted or grown, partial elements are present and thus,
interpolation is needed. Assuming that linear interpolation is sufficient, we can continue
our analysis without further consideration at this time.
Experimental Methodology
The driver program for this system initializes the VGT from a parameter file. This file
supplies the following information: initial VGT size, maximum number of deletes,
maximum deflection angle, population size of trial solutions, goal coordinates (X, Y),
goal-error tolerance, evolutionary sigma factor, mutation rate, crossover rate, and initial
VGT configuration vector (see Figure 7). After initialization, the VGT population is
“grown” as in [1] and the results are tallied to find any solutions that may have met
the tolerance criteria. After this, the experiment enters a loop where the population is
evolved and re-grown until a preset number of solutions have been found.
Figure 7: Experimental memory model (blocks labeled Initializer, Population, Selection, Solutions, Mutation, Crossover)
Evolution is performed over three variables: number of deletes, theta-phi transfer
function factor, and the stochastic growth sigma factor.
The use of number of deletes is self-explanatory. The theta-phi factor determines the
slope of the theta-phi relationship (where theta is the angle between end vector and error
vector and phi is the VGT end unit’s final allowable angle). The stochastic growth sigma
determines the variance in mutated phi’s for the stochastic growth function. The
evolution is performed in the following manner. Solutions that fall within the given
tolerance are popped from the population vector into a separate solution vector. Since
this pool of good solutions is likely to have a choice selection of parameters, we
uniformly select one parent from these solutions and cross it with a uniformly selected
member of the remaining population of non-solutions. This keeps the good solutions
from being eliminated, but allows the poorer solutions to contribute genetic
information to future evolutions.
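The selection and crossover scheme described above could be sketched as follows. The genome layout (number of deletes, theta-phi factor, sigma factor) follows the three evolved variables, but the function names and the per-gene uniform crossover are assumptions; the text does not say how the two parents' values are combined.

```python
import random


def split_solutions(population, errors, tolerance):
    """Move genomes whose goal error is within tolerance into a solution pool."""
    solutions, remaining = [], []
    for genome, err in zip(population, errors):
        (solutions if err <= tolerance else remaining).append(genome)
    return solutions, remaining


def make_child(solutions, remaining, rng):
    """Cross one uniformly chosen solution with one uniformly chosen non-solution.

    Assumption: each gene is copied from either parent with equal probability.
    """
    good = rng.choice(solutions)
    other = rng.choice(remaining)
    return tuple(rng.choice(pair) for pair in zip(good, other))


rng = random.Random(0)
# Hypothetical genomes: (number of deletes, theta-phi factor, sigma factor)
population = [(2, 0.30, 0.1), (7, 0.97, 0.5), (4, 0.60, 0.2)]
errors = [0.10, 3.2, 1.5]
solutions, remaining = split_solutions(population, errors, tolerance=0.25)
child = make_child(solutions, remaining, rng)
```

Keeping the solution pool separate from the breeding population is what prevents a found solution from being overwritten by later generations.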
A set of obstacles was implemented at this point. Every time the VGT grows, the end
effector is checked for nearness to the obstacle points. If the effector is within 1 inch of
an obstacle, its cost goes up by 100 points, which is two orders of magnitude greater than
the cost of both deletes and growths. This becomes important during evolution because
the cost determines solution fitness. For the stochastic growth method, the growth
variance would be increased if an obstacle were being approached and decreased if an
obstacle were being missed. In these experiments, obstacle points were placed at [10, 0], [0,
5], and [-1, -4].
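A minimal sketch of the obstacle check follows, using the three obstacle points and the 1-inch, 100-point rule from the text. The function name `obstacle_penalty` is hypothetical, and applying the penalty once per growth step (rather than once per nearby obstacle) is an assumption; with these widely spaced points the two readings give the same result.

```python
import math

OBSTACLES = [(10.0, 0.0), (0.0, 5.0), (-1.0, -4.0)]


def obstacle_penalty(tip, obstacles=OBSTACLES, radius=1.0, penalty=100.0):
    """Return the cost added when the end effector is within radius of any obstacle."""
    near = any(math.dist(tip, obs) < radius for obs in obstacles)
    return penalty if near else 0.0


hit = obstacle_penalty((0.5, 5.0))   # half an inch from the obstacle at [0, 5]
miss = obstacle_penalty((3.0, 3.0))  # clear of all three obstacles
```

In a full cost function this value would be added to the per-delete and per-growth costs each time the arm grows.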
Experimental Results
After initial tests, we found the stochastic growth to be prohibitively slow, and thus it
was not used in these experiments. The real-time applicability of the ES with deterministic
growth can be seen from our experimental results. Some examples are presented below.
For each “run,” the VGT is initialized randomly, grown, and then evolved until 5 or more
solutions are collected. For all of these results, 75 “generations” or runs were
performed.
This VGT is given an initial position that is “wound up” in a square as [1, 0] [0, 1] [-1, 0]
[0, -1]. Solutions may “unwind” to any point on the square by deleting elements. A
somewhat trivial goal of (1, 8) is set, which is in line with one of the square’s edges.
As seen in Figure 8, from 4 original units forming a square, this VGT solution was grown
by curling around to reach the goal.
    vec_x      vec_y      Sum_x      Sum_y
        1          0          1          0
        0          1          1          1
       -1          0          0          1
        0         -1          0          0
 0.782724   -0.62237   0.782724   -0.62237
 0.992717   0.120471   1.775441    -0.5019
 0.727928   0.685653   2.503369   0.183755
 0.354149   0.935189   2.857518   1.118944
 0.066774   0.997768   2.924292   2.116712
 -0.11699   0.993133   2.807305   3.109845
  -0.2298   0.973238   2.577505   4.083083
 -0.30045   0.953798   2.277057   5.036881
 -0.34722   0.937783   1.929836   5.974664
 -0.38152    0.92436   1.548314   6.899024
 -0.41302   0.910721   1.135292   7.809745
Figure 8: Sample solution of VGT starting in a wrapped-up square configuration,
unwinding and regrowing to reach goal (1,8) within .25 inches.
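The Sum_x and Sum_y columns in these result tables are running sums of the unit vectors, so the last row is the end-effector position. A short sketch of that bookkeeping, with hypothetical function names and the 0.25-inch tolerance quoted in the Figure 8 caption, is:

```python
import math


def joint_positions(vectors):
    """Running sum of unit vectors: the position of the end of each unit."""
    x = y = 0.0
    positions = []
    for vx, vy in vectors:
        x += vx
        y += vy
        positions.append((x, y))
    return positions


def reaches_goal(vectors, goal, tolerance):
    """True if the end effector lies within tolerance of the goal."""
    return math.dist(joint_positions(vectors)[-1], goal) <= tolerance


# The direct solution for goal (1, 8): one unit along x, then eight along y.
direct = [(1, 0)] + [(0, 1)] * 8
tip = joint_positions(direct)[-1]
ok = reaches_goal(direct, (1, 8), 0.25)
```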
After 24 runs of the experiment, the best of each of the runs had a score of 34 exactly,
matching the more direct solution seen in Figure 9. The average run-time for one
evolutionary cycle (evolution performed until 5 solutions have been found) was .175
seconds, with the first 6 cycles being run in under .1 seconds. The average number of
deletes performed was 7, meaning that fully unwinding the VGT was typically too costly
and thus less than half of the possible deletes were executed.
    vec_x      vec_y      Sum_x      Sum_y
        1          0          1          0
        0          1          1          1
       -1          0          0          1
        0         -1          0          0
        1          0          1          0
        0          1          1          1
        0          1          1          2
        0          1          1          3
        0          1          1          4
        0          1          1          5
        0          1          1          6
        0          1          1          7
        0          1          1          8
Figure 9: Sample final solution of VGT starting in a wrapped-up square configuration.
It is good to note that this ES is comprised of a series of short-lived populations. Since
fitness plays a part only until the preset number of solutions has been found, the
exceptionally fit population members do not have much chance to push out less-fit
population members through evolutionary competition. Therefore, this design has not seen
many premature convergences or genetic uniformity when it finishes a run. However, the
program must progressively call the random() function, making this method more like a
random search.
A second goal of (12, 8) was implemented. Figures 10 and 11 below describe two
sample solutions. The difference in transfer factor can be seen in the radius in which the
VGT appears to bend. The maximum radius allowed was limited to π/4.
For this set of runs, the average solution time was .29 seconds, with the first 8 runs
being completed in less than .1 seconds. The minimum cost was 46.0713, achieved by a
solution with a transfer function factor of 0.984967.
Figure 10: Sample VGT solution for goal (12, 8) with transfer factor of 0.29937
Figure 11: Sample VGT solution for goal (12, 8) with transfer factor of 0.970233
Conclusion
These experiments show that the VGT can be successfully retracted and grown to move the
extended system from one configuration to another within a specified tolerance of a goal
end-point. The VGT evolved quickly enough to be implemented in real-time applications.
Many different solutions were found, and obstacles were avoided by selecting solutions
that missed them.
Areas of future research include the use of stochastic growth methods, stochastic and
deterministic obstacle avoidance growth, and development of interpolation methods.
Currently, there is a substantial body of experimental publications on the VGT, but
few on practical, physical construction and implementation. This is another area for
further research.
References
[1] Chirikjian, G.S.; Burdick, J.W. “A hyper-redundant manipulator.” IEEE Robotics &
Automation Magazine, Volume 1, Issue 4, Dec. 1994, pp. 22-29.
[2] Hamlin, G.J.; Sanderson, A.C. “A novel concentric multilink spherical joint with
parallel robotics applications.” Proceedings of the 1994 IEEE International Conference on
Robotics and Automation, 1994, vol. 2, pp. 1267-1272.
[3] Lichter, Matthew D.; Sujan, Vivek A.; Dubowsky, Steven. “Computational Issues in the
Planning and Kinematics of Binary Robots.”
[4] Locatelli, G.; Langer, H.; Müller, M.; Baier, H. “Simultaneous Optimization of
Actuator Placement and Structural Parameters by Mathematical and Genetic Optimization
Algorithms.” Institute of Lightweight Structures, Aerospace Department, Technische
Universität München, 85747 Garching, Germany.
[5] Sofge, Don; Chiang, Gerald. “Design, implementation, and cooperative coevolution of an
autonomous/teleoperated control system for a serpentine robotic manipulator.” GreyPilgrim
Inc., 687-J Lofstrand Lane, Rockville, MD 20850.
[6] Zanganeh, Kourosh E.; Hughes, Peter C. “Fast Estimation of the Kinematics of the
Parallel Modules of the Variable Geometry Truss Manipulators using Neural Networks.”
Institute for Aerospace Studies, University of Toronto.
[7] Jakobsen, O. G.; Larsen, S. A.; Sørensen, A. S.; Jacobsen, N. J. “Design of
Double-Octahedral VGT Manipulators.” VDI Berichte 1427, “Neue Maschinekonzepte Mit
Parallelen Strukturen Für Handhabung Und Produktion.”