Delft University of Technology
Faculty of Mechanical, Maritime and Materials Engineering

Master of Science Thesis

Finding Time-Optimal Trajectories for the Resonating Arm using the RRT* Algorithm

by Jan Willem Loeve
Supervisor: Dr. ir. Martijn Wisse
Delft, August 2012
Abstract
At the Delft Biorobotics Lab, research is done on smart robotic arms. The goal of this research is to design robotic arms that have the design and the skills to perform repetitive tasks in a natural dynamic manner, because this will lead to more lightweight, energy efficient and safe robotic arms. A first step within this research is the design of the Resonating Arm: an energy efficient two dof robotic arm. This robotic arm is designed for pick-and-place tasks and contains a parallel spring mechanism, which makes it more energy efficient than a comparable arm without such a mechanism.
At this moment the energy efficient movement patterns are known, but nothing is known about the time-optimal movement patterns, which are an important aspect of the characteristics of the Resonating Arm (RA). The goal of this thesis is therefore to find the time-optimal trajectories of the Resonating Arm. The research question that follows from this is:
What are the time-optimal trajectories of the Resonating Arm?
In order to answer this question, the RRT* algorithm is used to find these trajectories. The RRT* algorithm is a motion planning algorithm that is asymptotically optimal and has been used, among others, for manipulation planning of two six dof robotic arms. The algorithm grows a tree structure in the state space; the tree consists of vertices (states) and edges (trajectories that connect the states). The vertices are connected by a so-called steering function, which calculates optimal trajectories between two states. The algorithm is first applied to a linear one dof robotic arm that is comparable to the one dof RA. After that it is used to find time-optimal trajectories for the one dof RA. Finally, simulations with the two dof RA are done.
The simulations with the one dof linear system show that the RRT* algorithm finds a trajectory very close to the time-optimal one within limited time. The time-optimal trajectory for the one dof linear system is symmetrical and consists of a maximal acceleration and a maximal deceleration curve; its duration is 0.3353 [s]. The simulations with the one dof RA are done with a numerical steering function which relies on an optimization. These simulations take much longer than those of the one dof linear system, for which an analytical steering function was used. Due to the friction in the one dof RA, the time-optimal trajectory is no longer symmetrical, but it still consists of a maximal acceleration and a maximal deceleration curve. Its duration is 0.3290 [s], so the one dof RA performs better than the arm without the parallel spring mechanism.
The steering function used for the two dof RA relies on the optimization of a bang-bang control signal with three switches. Results show that this is a relatively cheap and good way to approximate the time-optimal trajectories between states of the two dof RA. This optimization is also used to obtain the time-optimal trajectory for the two dof RA without obstacles in the workspace. The result is a 'push motion' by which the elbow follows the end effector. Since no better trajectory could be found with other optimization algorithms, it is assumed that this trajectory is at least near time-optimal. This trajectory of 0.5553 [s] is used as a reference for the simulations with the RRT* algorithm. The RRT* algorithm performs very poorly with the initial settings, so a lot of tuning was needed to obtain reasonable results. Finally, the RRT* algorithm was able to obtain a trajectory within 2.5% of the time-optimal one. The algorithm with the same settings was applied to four simulations with an obstacle at different locations. The resulting trajectories did not seem to be (near) time-optimal, but the simulation time was too short to draw conclusions about their optimality. Thus, to find the (near) time-optimal trajectories of the RA, it is necessary to improve the performance of the implemented RRT* algorithm or to run longer simulations.
Contents

Abstract

1 Introduction
1.1 Background and Motivation
1.2 Problem
1.3 Approach
1.4 Outline

2 The Resonating Arm
2.1 Introduction
2.2 The Design of the Resonating Arm
2.3 Parallel Spring Mechanism
2.4 Conclusion

3 The RRT* Algorithm
3.1 Introduction
3.2 The Original RRT* Algorithm
3.2.1 Introduction
3.2.2 Working Principles of the RRT* Algorithm
3.2.3 Design of the RRT* Algorithm
3.2.4 Details of the Implementation of the RRT* Algorithm
3.2.5 Conclusion
3.3 Steering Function
3.3.1 Introduction
3.3.2 Steering Functions used by the Authors of the RRT* Algorithm
3.3.3 Numerical Methods for Optimal Control
3.3.4 Conclusion
3.4 Conclusion

4 Simulations and Results for the One Dof Linear System
4.1 Introduction
4.2 Model of the System
4.2.1 Motor Model
4.2.2 Equations of Motion
4.3 Steering Function
4.3.1 Introduction
4.3.2 Pontryagin's Minimum Principle
4.3.3 Time-Optimal Trajectories
4.3.4 The Time-Optimal Trajectory for the One Dof Linear System
4.3.5 Conclusion
4.4 Method
4.4.1 Introduction
4.4.2 General Settings of the RRT* Algorithm
4.4.3 Tests
4.4.4 Conclusion
4.5 Results of the Simulations
4.5.1 Introduction
4.5.2 Convergence of the RRT* Algorithm
4.5.3 Effect of Cutoff Distance
4.5.4 Effect of Number of Nearest Neighbors
4.5.5 Conclusion
4.6 Conclusion

5 Simulations and Results for the One Dof Resonating Arm
5.1 Introduction
5.2 Model of the One Dof Resonating Arm
5.2.1 Friction Model
5.2.2 Motor Model
5.2.3 Equations of Motion
5.2.4 Conclusion
5.3 Steering Function
5.3.1 Introduction
5.3.2 Pontryagin's Minimum Principle
5.3.3 Time-Optimal Trajectory
5.3.4 Conclusion
5.4 Approximated Analytical Steering Function
5.4.1 Introduction
5.4.2 RRT* Algorithm
5.4.3 Method
5.4.4 Results
5.4.5 Conclusion
5.5 Numerical Steering Function
5.5.1 Introduction
5.5.2 RRT* Algorithm
5.5.3 Method
5.5.4 Results
5.5.5 Conclusion
5.6 Conclusion

6 Simulations and Results for the Two Dof Resonating Arm
6.1 Introduction
6.2 Model of System
6.2.1 Friction Model
6.2.2 Motor Model
6.2.3 Equations of Motion
6.2.4 Conclusion
6.3 Steering Function
6.3.1 Introduction
6.3.2 Numerical Steering Function
6.3.3 (Near) Time-Optimal Trajectories
6.3.4 Conclusion
6.4 Collision Checking
6.4.1 Introduction
6.4.2 Models
6.4.3 Collision Checking for a Single State in the State Space
6.4.4 Collision Checking for a Trajectory in the State Space
6.4.5 Conclusion
6.5 Standard RRT* Algorithm
6.5.1 Introduction
6.5.2 Method
6.5.3 Results
6.5.4 Conclusion
6.6 Greedy RRT* Algorithm
6.6.1 Introduction
6.6.2 Method
6.6.3 Results
6.6.4 Conclusion
6.7 Adjusting the Steering Function
6.7.1 Introduction
6.7.2 Method
6.7.3 Results
6.7.4 Conclusion
6.8 Adjusting the Distance Metric
6.8.1 Introduction
6.8.2 Method
6.8.3 Results
6.8.4 Conclusion
6.9 Final Simulation with the RRT* Algorithm with Improved Pseudometric
6.9.1 Introduction
6.9.2 Method
6.9.3 Results
6.9.4 Conclusion
6.10 Conclusion

7 Conclusion and Recommendations
7.1 Conclusion
7.2 Recommendations

A Parallel Spring Mechanism: Derivation of Relation

Bibliography
Chapter 1
Introduction
1.1 Background and Motivation
At the Delft Biorobotics Lab, research is done on smart robotic arms. This takes place within a project
that is funded by a Vidi grant of the NWO. The goal of this research project is stated as:
“..to create robot arms that have the design and the skills to perform their repetitive tasks
in a natural dynamic manner. This will make them more lightweight, more efficient, and
safer than current industrial robot arms, allowing application outside the traditional factory
environment into the domains of food handling and safe human-robot collaboration.”
The approach followed in the project funded by the Vidi grant consists of two parts. The first focuses on the optimal mechanical design of robotic arms for repetitive tasks. A big step has already been taken with the design of the Resonating Arm [34]. This robotic arm makes use of a smart spring mechanism to move efficiently between two locations. These are typically the locations of a pick-and-place movement, in which an object is picked up at one location and placed at another.
The second part of the approach is optimizing the movement of the Resonating Arm (RA) so that it performs its task optimally. The optimization criterion could be energy use or, as is the case in this thesis, the duration of the motion.
1.2 Problem
At the start of this master thesis, nothing was known about the time-optimal movement patterns of the RA. The question of what these optimal movement patterns look like becomes increasingly interesting and complicated when the RA has to avoid obstacles. Therefore it is the goal of this research to find the time-optimal joint trajectories of the RA for the pick-and-place task it was designed for. The research question that follows from this is:
What are the time-optimal trajectories of the Resonating Arm?
1.3 Approach
Algorithm
When there are no obstacles present in the workspace of the RA, the time-optimal trajectory (or the feed-forward control signals) can be calculated by means of a straightforward optimization. When obstacles that should be avoided, like the sides of a box, are present, the calculation of the time-optimal trajectories becomes more complicated: a so-called motion planning problem [23] arises. Over the last decades, a lot of research has been devoted to this subject [1, 7, 16, 21, 22, 24, 25, 27, 38].
Despite this, no methods capable of optimal motion planning that take kinematic and dynamic constraints into account existed until 2010, when the RRT* (Rapidly-exploring Random Tree Star) algorithm was introduced by Karaman and Frazzoli [18, 20]. This algorithm is asymptotically optimal and is capable of so-called kinodynamic motion planning, by which both kinematic and dynamic constraints are taken into account. Kinematic constraints ensure that there is no (self-)collision, while dynamic constraints limit the maximum velocity and acceleration. The RRT* algorithm does not use a discretization of the state space, which makes it suitable for systems with many degrees of freedom. Because no discretization is needed, a true optimal solution can be found instead of a resolution-optimal one. Next to this algorithm from the field of motion planning, some more general algorithms could be used to find time-optimal trajectories. Examples of such methods are:
• Dynamic programming [2]: a method for solving complex problems by breaking them down into
smaller sub-problems.
• Reinforcement learning [40]: a method that learns for the whole state space of a system what kind
of action should be taken in order to maximize a reward.
• Branch and bound [35]: a method that solves various optimization problems by enumerating all candidate solutions and discarding large subsets of candidates based on estimated lower and upper bounds of the quantity being optimized.
The downside of these general-purpose methods is that they would have to be altered to be useful for a motion planning problem, and it cannot be said in advance how well they would perform on such a problem. So either the RRT* algorithm or one of the methods listed above should be used. In this thesis the RRT* algorithm is chosen, since it comes from the field of motion planning and seems very promising: it has been used for manipulation planning of two six degree of freedom (dof) robotic arms. Other reasons why the RRT* algorithm is chosen are:
• The RRT* algorithm is capable of finding the real optimal solution since no discretization is needed
which would result in a resolution optimal solution.
• The RRT* algorithm is available within ROS (Robot Operating System) [36] which is also used to
control the RA. Therefore it should be relatively simple to use the RRT* for the real RA.
Steering Function
In order to work correctly, the RRT* algorithm needs a so-called steering function: a function that returns an optimal trajectory between two points in the state space of the robotic arm. The articles about the RRT* algorithm that were published before the start of this research dealt with systems and optimization criteria for which an analytical steering function existed.
In general, however, finding a steering function comes down to solving a two-point boundary value problem. For some simple systems, like a double integrator or a Dubins car [9], an analytical solution to this boundary value problem exists, but for most dynamical systems no analytical solution exists [23]. This is also the case for the RA, since it is a highly non-linear system. Therefore the main challenge in this research will be finding or constructing a steering function for the RA.
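For one of those simple systems, the double integrator (x'' = u with |u| ≤ u_max), the analytical steering function is easy to state for the rest-to-rest case: apply +u_max until the halfway point t_s = sqrt(d/u_max), then −u_max for an equal time. The sketch below (Python for illustration; the thesis simulations use MATLAB, and the function name and sampling are my own) evaluates this closed-form bang-bang trajectory:

```python
import numpy as np

def double_integrator_steer(d, u_max):
    """Analytical time-optimal rest-to-rest steering for x'' = u, |u| <= u_max.

    Bang-bang control: +u_max for the first half of the motion, -u_max for
    the second. Returns the switch time, total time and a sampled trajectory.
    """
    t_s = np.sqrt(d / u_max)          # switch time: x(t_s) = d/2
    T = 2.0 * t_s                     # total minimum time
    t = np.linspace(0.0, T, 201)
    accel = t <= t_s
    # Piecewise closed-form integration of the bang-bang acceleration profile.
    x = np.where(accel,
                 0.5 * u_max * t**2,
                 d - 0.5 * u_max * (T - t)**2)
    v = np.where(accel, u_max * t, u_max * (T - t))
    return t_s, T, t, x, v

t_s, T, t, x, v = double_integrator_steer(d=1.0, u_max=4.0)
```

With d = 1 and u_max = 4 this gives a switch at 0.5 s and a total time of 1.0 s, with the velocity peaking at the switch, which matches the symmetric acceleration/deceleration shape described in the abstract for the linear system.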
During the course of this thesis, a new article was published [15] which describes an optimization used as a steering function. This article showed that for more complex dynamical systems it is necessary to rely on a numerical way of solving the boundary value problem. This obviously costs more time than an analytical steering function. That should not be a problem, however, because the goal of this research is not to find the time-optimal trajectories on-line. Instead, off-line simulations with MATLAB [30] will be used to find them, so it is acceptable if the calculation of the trajectories takes a significant amount of time.
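To illustrate what such a numerical steering function can look like, the sketch below solves a minimum-time boundary value problem by simple shooting: the control is fixed to a bang-bang structure with one switch, the dynamics are integrated forward, and bisection finds the switch time that makes the system stop at the goal angle. The dynamics here are a hypothetical damped one dof system, not the actual RA model, and all parameter values are illustrative assumptions:

```python
def simulate(t_switch, u_max=2.0, b=0.5, dt=1e-3):
    """Forward-integrate th'' = u - b*th' (illustrative damped one dof
    system, NOT the Resonating Arm model) under a bang-bang input:
    +u_max until t_switch, then -u_max until the arm stops."""
    th = w = t = 0.0
    while t < t_switch:                 # acceleration phase
        w += (u_max - b * w) * dt
        th += w * dt
        t += dt
    while w > 0.0:                      # deceleration phase, stop at w = 0
        w += (-u_max - b * w) * dt
        th += w * dt
        t += dt
    return th, t                        # final angle and arrival time

def steer_to(th_goal, lo=0.0, hi=2.0, iters=50):
    """Shooting method: the final angle grows monotonically with the
    switch time, so bisection finds the switch that stops at th_goal."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if simulate(mid)[0] < th_goal:
            lo = mid
        else:
            hi = mid
    t_switch = 0.5 * (lo + hi)
    th_end, t_end = simulate(t_switch)
    return t_switch, th_end, t_end

t_sw, th_f, T_f = steer_to(1.0)   # steer from rest at 0 rad to rest at 1 rad
```

The arrival time `T_f` then serves as the cost of the edge, exactly the quantity a time-optimal steering function must report to the RRT* algorithm. Each evaluation requires a full forward integration, which is why such numerical steering functions are much slower than analytical ones.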
1.4 Outline
In order to answer the research question, Chapter 2 discusses the characteristics of the Resonating Arm.
In Chapter 3 it is discussed how the RRT* algorithm works and what the characteristics of the algorithm
are. Chapter 4 describes the simulations that are done with this algorithm for a one dof linear system.
The simulations with the one dof RA are discussed in Chapter 5. Chapter 6 deals with the simulations of
the two dof RA and it also includes the answer to the research question. Chapter 7 will give an overview
of the previous chapters and will discuss and conclude the results of this Master thesis. This chapter will
also give recommendations for future work.
Chapter 2
The Resonating Arm
2.1 Introduction
The time-optimal motion planning presented in this thesis is performed on the novel 'Resonating Arm'. The Resonating Arm (RA), designed by M.C. Plooij [34], is made to be energy efficient for pick-and-place tasks. This is achieved by the use of a parallel spring mechanism with a special characteristic that will be discussed later on. It has been shown that the Resonating Arm is more energy efficient with the parallel spring mechanism than without it, but its time-optimal performance has not been tested. That is the topic of this research, as described in the introduction, so the properties and the model of the RA need to be discussed. Section 2.2 explains how the RA is constructed and how it works. The parallel spring mechanism, which is part of the RA, is the topic of Section 2.3. The chapter ends with a short summary.
2.2 The Design of the Resonating Arm
To understand what the time-optimal trajectories of the Resonating Arm look like, it is important to have a good understanding of how this robotic arm works. Therefore this section deals with the design of the RA. It is a two dof robotic arm that can move in the horizontal (x-y) plane. At this moment the RA does not have an end-effector, but it is designed such that one can be added easily. Such an end-effector should be able to move in the z-direction (vertically), so that it can be positioned in a 3D workspace. The RA is designed to be energy efficient for pick-and-place tasks by means of a parallel spring mechanism, which will be discussed later. Results of M.C. Plooij [34] show that the RA saves up to 22% of energy compared to the same robotic arm without the parallel spring mechanism.
Figure 2.1: a) Photo of the Resonating Arm [34]; b) schematic drawing of the Resonating Arm [33].
Figure 2.1 includes both a photo and a schematic drawing of the RA. The figure shows that the arm consists of an upper and a lower arm. The lower arm is actuated by a DC motor in combination with a timing belt. The upper arm is connected to another DC motor by a parallel spring mechanism; the details of this mechanism are discussed in the next section. For both the lower and the upper arm a Maxon 60 W RE30 motor is used. The specifications of this motor can be found in Section 4.2.1.
2.3 Parallel Spring Mechanism
The parallel spring mechanism is what makes this robotic arm different from 'normal' two dof robotic arms. It influences the equations of motion of the Resonating Arm, which in turn affect the time-optimal trajectories. Therefore an understanding of the behavior of this mechanism is needed. Figure 2.2 shows two schematic drawings of the parallel spring mechanism. The lower arm is omitted in this figure, since it is not influenced by the spring mechanism and would only make the figure less clear. Figure 2.2a presents the principal design, which is simpler but in principle the same as the drawing in Figure 2.2b, which shows how the spring mechanism is implemented on the real RA.
The mechanism of Figure 2.2a contains a large pulley which is rigidly connected to the upper arm. This pulley is connected to a smaller pulley by a timing belt, and the two pulleys are also connected to each other by an extension spring. The spring is thus connected in parallel with the timing belt transmission, which is why this is called a parallel spring mechanism. The difference between the two drawings is that in Figure 2.2b the DC motor is not connected directly to the large pulley but to another pulley that is also used to connect the small and the large pulley. Since Figure 2.2a is more straightforward, this drawing will be used to model the system. This is possible because the inertia of the spring mechanism is neglected, as it is very small compared to that of the arm itself. The only thing that has to be taken into account is that there is an extra transmission ratio of 3:1 between the DC motor and the large pulley, which should be used later on when the equations of motion of the RA are derived.
Figure 2.2: Schematic drawings of the spring mechanism [34]: a) principal design, b) implementation on the real RA.
The design of the parallel spring mechanism has now been discussed, but its behavior has not. In order to understand the behavior, the relation between the torque and the angle of this mechanism needs to be derived. The details of this derivation are given in Appendix A. The relation between torque and angle was originally derived by M.C. Plooij [34], but it was not entirely clear how this derivation was done; therefore it is repeated in Appendix A. The resulting characteristics are given in Figure 2.3. These characteristics determine the behavior of the mechanism and are therefore important to understand.
Figure 2.3: a) Potential energy and b) torque characteristic of the spring mechanism. The potential energy characteristic shows three stable equilibrium points (green dots); at these points the torque is zero, and when no external torque is applied the mechanism will end up at one of these equilibrium points.
The potential energy of the system as a function of the angle is given in Figure 2.3a. This characteristic shows three stable equilibrium points at 0, -0.88 and 0.88 [rad] (green dots) and two unstable equilibrium points at -0.7 and 0.7 [rad] (red dots). At these points the torque is zero, as shown by the torque characteristic in Figure 2.3b.
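The equilibrium structure described above can be reproduced numerically. The torque polynomial below is only a hypothetical stand-in with zeros at 0 and at ±0.7 and ±0.88 [rad] (the real characteristic follows from the derivation in Appendix A); the classification rule, however, is general: an equilibrium is stable when the torque is restoring, i.e. when dτ/dθ < 0, which corresponds to a local minimum of the potential energy:

```python
import numpy as np

def torque(th, k=1.0):
    """Hypothetical torque characteristic (NOT the Appendix A derivation):
    a polynomial with zeros at 0 and at +/-0.7 and +/-0.88 rad, matching
    the equilibrium angles reported in the text."""
    return -k * th * (th**2 - 0.7**2) * (th**2 - 0.88**2)

# Locate equilibria as sign changes of the torque on a fine grid and
# classify them: stable if dtau/dtheta < 0 (restoring torque, i.e. a
# local minimum of the potential energy), unstable otherwise.
th = np.linspace(-1.2, 1.2, 100000)
tau = torque(th)
idx = np.where(np.sign(tau[:-1]) != np.sign(tau[1:]))[0]
equilibria = []
for i in idx:
    th_eq = 0.5 * (th[i] + th[i + 1])
    dtau = (torque(th_eq + 1e-6) - torque(th_eq - 1e-6)) / 2e-6
    equilibria.append((round(th_eq, 2), "stable" if dtau < 0 else "unstable"))
```

For this stand-in characteristic the procedure recovers three stable equilibria (0 and ±0.88 rad) and two unstable ones (±0.7 rad), the same pattern as in Figure 2.3.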
Originally, this mechanism was designed to have two flat regions in the potential energy graph, around -0.8 and 0.8 [rad]. These regions were meant to be the pick-and-place regions: since the torque would be zero there, the arm could be held still and moved within such a region virtually without using any energy. Due to a mismatch between the design and the prototype these regions are not flat, so a small torque is needed to hold the mechanism still and to move within those regions. The pick-and-place locations that will be used in this thesis are the stable equilibrium points at -0.88 and 0.88 [rad].
The reason this mechanism is suitable for pick-and-place tasks is the shape of the potential energy and torque characteristics. Besides the relatively flat curve around the pick-and-place locations, the steep dip in the middle of the potential energy characteristic is also beneficial. This steep dip corresponds to a large acceleration and a large deceleration (and thus a high velocity) when moving from one pick-and-place location to the other.
2.4 Conclusion
The Resonating Arm is a two dof arm that can move in the plane. In the future an end-effector and an extra dof will be added to the arm so that it can perform pick-and-place tasks. The RA has a novel parallel spring mechanism that actuates the upper arm. This parallel spring mechanism has a special characteristic which suits a pick-and-place task very well. The design of the RA was discussed because it determines the equations of motion of the arm, which in turn influence its time-optimal movement patterns.
Chapter 3
The RRT* Algorithm
3.1 Introduction
In the introduction it was already mentioned that the RRT* algorithm is used to find time-optimal trajectories for the Resonating Arm (RA). The RRT* algorithm is a modified version of the well-known RRT (Rapidly-exploring Random Tree) algorithm [25]. The RRT algorithm has existed for over fifteen years, and during this period it has been used extensively and numerous modifications to it have been proposed. In 2010, Karaman and Frazzoli [18] published their modified version of the RRT algorithm, which is capable of asymptotically optimal motion planning. This RRT* algorithm cannot be used 'off-the-shelf', so it has to be adjusted to work with the RA. Next to that, it has to be implemented in MATLAB [30], since MATLAB is used for the simulations. Most of the work in this thesis is the implementation of the algorithm in MATLAB. It is therefore of great importance to understand how this algorithm works, and this is consequently the subject of this chapter.
Section 3.2 explains how the RRT* algorithm works in general. The steering function is the most critical and complex part of the algorithm and is therefore discussed separately from the other details, in Section 3.3. The chapter ends with a conclusion in Section 3.4.
3.2 The Original RRT* Algorithm

3.2.1 Introduction
This section explains how the RRT* algorithm works. The algorithm finds a feasible trajectory by growing a tree structure in the search space from the start point to the goal point; the tree grows by adding new vertices. Figure 3.1 shows what such a tree structure looks like and how it grows. The start point (or root) of the tree is located at the center of the search space and the goal is not displayed. When dynamic constraints are not taken into account, the search space is the configuration space, which contains only the configuration of the system (only the angles in the case of a robotic arm with rotational joints). If dynamic constraints have to be taken into account, which is the case in this research, the state space is used as search space; it contains both the angles and the angular velocities.
Figure 3.1: Growth of the Rapidly-exploring Random Tree (RRT) with the root of the tree at the center
of the search space. During the growth process new vertices are added to the tree and these vertices are
connected with edges. [25]
The working of the RRT* algorithm will be explained further by means of a simple system. This system
is a one link robotic arm that can rotate around a joint (comparable to the one dof Resonating Arm).
The search space is now the state space, which contains both the angle and the angular velocity. The goal
is to find the time-optimal trajectory from the start state to the goal state, in other words: from the initial angle
and angular velocity to the goal angle and angular velocity.
Figure 3.2: a) State space of the one dof rotational arm with angle (θ) and angular velocity (ω). The state
space contains a tree structure from the start (xstart ) to the goal state (xgoal ). The red dots denote
the vertices of the tree and the black lines are the connections between these vertices. The blue arrows
represent the shortest trajectory along the tree from the start to the goal state. b) An extra vertex (9) is
added to the tree structure, which results in a better trajectory from the start to the goal (blue arrows).
An example of how such a trajectory could look is given in Figure 3.2a. The red dots are the vertices
of the tree and the black lines represent the connections between these vertices (edges). The blue arrows
indicate how the goal vertex can be reached from the start vertex. The numbers inside the vertices
represent the order in which they were added to the tree. So in general a tree structure is grown and at
some point the goal is added to the tree. The adding of vertices continues after the goal is found, so
that the trajectory to the goal can be improved. This is shown in Figure 3.2b, where vertex 9 is added,
which results in a better trajectory to the goal. In the next section it will be explained in more detail
how this tree structure is grown and how an optimal trajectory is obtained.
3.2.2 Working Principles of the RRT* Algorithm
Again, a one link robotic arm is used to explain how the algorithm works. For the sake of simplicity
no obstacles are included. An obstacle would simply mean that a certain region in the state space is a
forbidden region. In this region no vertices and edges are allowed.
Figure 3.3a contains a tree structure that is grown for some time from the start/root vertex (red dot
with number 0). This root vertex is connected to vertices 1, 3 and 4 and is the parent of these vertices.
Consequently, these three vertices are the child vertices of the root vertex. Every vertex has only one
parent, but can have multiple children. In this figure the costs of the connections between the vertices are
also given by the numbers next to the edges. The cost can be anything, but in this research it will
be the time that is needed to go from the parent vertex to the child vertex. For the sake of simplicity
the edges are drawn as straight lines.
Figure 3.3: Growing process of the RRT* algorithm (which is similar to that of the RRT algorithm). a)
A tree structure that has been grown in the state space from the start state (xstart ) for some time. The tree
contains vertices (red dots) that are connected by edges (black lines), for which the
costs are given by the numbers next to these edges. b) A new state (xnew ) is sampled from the state space
and will be added to the tree. c) The vertex of the tree that is nearest (xnearest ) to the newly sampled
state is selected. d) The newly sampled state is connected to the nearest vertex, and in this way the tree is
extended with a new vertex (vertex number 9) and a new edge.
The tree of Figure 3.3a is further extended by first sampling a new state (xnew ) from the state space
which is shown in Figure 3.3b. This state can be sampled randomly, but other sampling techniques
can also be used. After that, the vertex of the tree that is nearest to the new state is selected (indicated
by the blue vertex in Figure 3.3c). This vertex is called the nearest vertex. To
determine the nearest vertex, different distance metrics can be used. Next, a trajectory between
the nearest vertex and the new state has to be calculated. This is done with a so-called steering
function, which returns the optimal trajectory between two states. Optimal trajectories
can be defined in terms of minimal energy use or minimal power, but in this thesis the optimal trajectories are defined as minimal-time trajectories. The cost of a trajectory is therefore its time duration.
In Section 3.3 the steering function is discussed in detail. When this trajectory is collision free,
the new state is added to the tree, which is shown in Figure 3.3d. In this way the tree is extended, or
grown, and eventually one branch will reach the goal state. This growth process of the RRT* algorithm is
comparable to that of the basic RRT algorithm. This approach leads to feasible trajectories in the state
space, but not necessarily to optimal ones. It has even been proven that the RRT algorithm converges almost
surely to a non-optimal trajectory [20]. So an addition to the algorithm is needed to ensure optimal
trajectories. Therefore two optimization steps are performed.
First Optimization Step
The first optimization step is performed after the new vertex is connected to its nearest vertex. It is then
checked whether this new vertex can be connected to the k other nearest neighbors (in this case k = 3).
The connection that leads to the lowest cost to reach the new vertex is the one that remains, and the
others are deleted. This is illustrated in Figure 3.4. In Figure 3.4a the tree is plotted with the new vertex
connected to its nearest neighbor. The brown vertices are the three other nearest neighbors. Connections
from these vertices to the new vertex have also been tried, and the costs of these connections are plotted along
the edges. The cost to reach the new vertex through the current trajectory (vertices 0-4-9) is 14, but
the new vertex can also be reached at a cost of 11 through vertices 0-1-5-9. As a result, the edge
between vertices 5 and 9 is maintained and the other edges toward vertex 9 are deleted. This results in
the tree of Figure 3.4b.
Figure 3.4: The first optimization step of the RRT* algorithm: when the new vertex is connected to the
tree, it is checked whether this vertex can be reached at lower cost through one of the k nearest neighbors
(k = 3 in this case). Here the trajectory along vertices 0-1-5-9 results in a lower cost than trajectory
0-4-9. So the original edge between vertices 4 and 9 is deleted and a new edge between 5 and 9 is added.
Second Optimization Step
The second modification is that, when a new vertex is added to the tree, it is checked whether vertices
that are already in the tree can be reached at a lower cost through this new vertex. This is also checked
for the k nearest neighbors. In Figure 3.5a the tree is plotted again with the new vertex that
has just been added and its k nearest neighbors. For these three vertices the costs of connecting to the
new vertex are given. It can be seen that vertex 6 is currently reached at a cost of 15, while through
the new vertex it can be reached at a cost of 12 (through vertices 0-1-5-9-6). Since this is
lower than the current cost, the current edge toward vertex 6 is deleted and a new edge is added. The
result is the tree of Figure 3.5b. A consequence of adding a new edge to the tree is that the cost to
reach the rewired vertex has to be updated, as well as the costs of all its child vertices.
Figure 3.5: The second optimization step of the RRT* algorithm: vertices that are already in the tree
are connected through the new vertex if this results in a lower cost to reach these vertices. In this case
vertex 6 is rewired (the previous edge to this vertex is deleted and a new edge to this vertex is added to
the tree).
3.2.3 Design of the RRT* Algorithm
How the algorithm works in principle was explained above. The pseudocode of the RRT* algorithm
is given in Algorithm 1. It shows that the tree is initialized (lines
1-3), after which the tree-growing loop is entered (lines 4-7). In this loop a state is sampled and the Extend
function is called. Algorithm 2 contains the pseudocode of the Extend function.
This function determines the nearest vertex (line 1) and the steering function returns the optimal trajectory between the nearest vertex and the sample (line 2). If this trajectory is collision free (successful), the
tree optimization part is entered (lines 3-24). Inside this part, first the k nearest neighbors are determined
(line 4) and the sample is added to the tree (line 5). Two variables are set to keep track of the current
parent of the new vertex (line 6) and the cost to reach the new vertex through that parent (line 7). Next,
the first optimization step is performed (lines 8-15). It is checked whether the new
vertex can be reached at lower cost through one of the k nearest neighbors (lines 8-10). The current lowest
cost (line 12) and the corresponding best parent of the new vertex (line 11) are kept. When all k nearest
neighbors have been checked, the edge between the new vertex and its best parent is added to the tree (line
15). After that the second optimization step is performed, whereby it is checked for the k nearest neighbors
(except the one that is now the parent of the new vertex) whether they can be reached at
lower cost (lines 16-23). If a vertex can be reached at lower cost (lines 17-18), the current edge to this
vertex is deleted (line 20), and a new edge from the new vertex to this vertex is added (line
21). After that, the Extend function returns the new tree.
Algorithm 1 RRT*
1: V ← {xstart }; // the vertices (V) are initialized with the start state
2: E ← Ø; // the edges (E) are initialized empty
3: T ← (V, E); // the tree (T) is initialized with the edges and the vertices
4: for i = 1 to N do
5:   xnew ← Sample(i); // a new state is sampled
6:   T ← Extend(T, xnew ); // the extend function is called
7: end for
Algorithm 2 Extend
1: xnearest ← Nearest(T, xnew ); // the nearest vertex of the tree is returned
2: (success, cost) ← Steer(xnearest , xnew ); // the steering function tries to find a trajectory
3: if success then
4:   Xnear ← NearestNeighbors(T, xnew ); // the nearest neighbors of the new vertex are returned
5:   V ← V ∪ xnew ; // the new vertex is added to the vertices
6:   xmin ← xnearest ; // xmin is the current parent of the new vertex
7:   costmin ← cost; // costmin is the current cost to reach the new vertex
8:   for all xnear ∈ Xnear do
9:     (success, cost) ← Steer(xnear , xnew ); // the steering function tries to find a trajectory
10:    if success and cost < costmin then
11:      xmin ← xnear ; // xmin is changed since a better parent is found
12:      costmin ← cost; // costmin is changed since the new vertex can be reached at lower cost
13:    end if
14:  end for
15:  E ← E ∪ {xmin , xnew }; // the new edge toward the new vertex is added to the edges
16:  for all xnear ∈ Xnear \{xmin } do
17:    (success, cost) ← Steer(xnew , xnear ); // the steering function tries to find a trajectory
18:    if success and cost < cost(xnear ) then
19:      xparent ← Parent(xnear ); // xparent is the current parent of xnear
20:      E ← E\{xparent , xnear }; // the edge between xnear and its parent is deleted
21:      E ← E ∪ {xnew , xnear }; // a new edge between xnear and xnew is added to the edges
22:    end if
23:  end for
24: end if
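The Extend function can be sketched in code as follows. This is an illustrative Python version (the thesis implementation is in MATLAB), with a plain Euclidean placeholder instead of the time-optimal steering function. Unlike the compact pseudocode above, the cost comparisons here use the full cost-to-reach through a candidate parent; the propagation of updated costs to the children of a rewired vertex, mentioned earlier, is omitted for brevity.

```python
import math

class Tree:
    """Minimal tree: vertex list plus parent and cost-to-reach bookkeeping."""
    def __init__(self, x_start):
        self.vertices = [x_start]
        self.parent = {0: None}   # vertex index -> parent index
        self.cost = {0: 0.0}      # vertex index -> cost to reach from the root

def steer(x_from, x_to):
    """Placeholder steering function: always succeeds and returns the
    Euclidean distance as the edge cost (the thesis uses time instead)."""
    return True, math.dist(x_from, x_to)

def nearest(tree, x_new):
    return min(range(len(tree.vertices)),
               key=lambda i: math.dist(tree.vertices[i], x_new))

def near_neighbors(tree, x_new, k):
    order = sorted(range(len(tree.vertices)),
                   key=lambda i: math.dist(tree.vertices[i], x_new))
    return order[:k]

def extend(tree, x_new, k=3):
    """One Extend step: connect x_new to its best parent (first optimization
    step) and rewire the near neighbors through it (second step)."""
    i_nearest = nearest(tree, x_new)
    ok, edge_cost = steer(tree.vertices[i_nearest], x_new)
    if not ok:
        return tree
    near = near_neighbors(tree, x_new, k)
    tree.vertices.append(x_new)
    i_new = len(tree.vertices) - 1
    i_min = i_nearest
    cost_min = tree.cost[i_nearest] + edge_cost
    for i in near:                       # first optimization step
        ok, c = steer(tree.vertices[i], x_new)
        if ok and tree.cost[i] + c < cost_min:
            i_min, cost_min = i, tree.cost[i] + c
    tree.parent[i_new], tree.cost[i_new] = i_min, cost_min
    for i in near:                       # second optimization step (rewiring)
        if i == i_min:
            continue
        ok, c = steer(x_new, tree.vertices[i])
        if ok and cost_min + c < tree.cost[i]:
            tree.parent[i] = i_new       # note: child costs not propagated here
            tree.cost[i] = cost_min + c
    return tree
```

In a state-space planner the tuples would hold (θ, ω) and `steer` would return the time-optimal trajectory and its duration; the tree bookkeeping stays the same.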
3.2.4 Details of the Implementation of the RRT* Algorithm
In order to understand how the RRT* algorithm will be implemented for the Resonating Arm, it is
important to understand the details of the algorithm. Therefore these will be discussed in this section.
The steering function, as already mentioned, will be discussed separately in Section 3.3.
Sampling
The sampling procedure is used to obtain a sample from the search space that can be used as a new
vertex of the tree, provided it can be connected to the tree. This sample should be taken from the search space
(the sampling region). In the case of a robotic arm with rotational joints, this sampling region is the state
space of the robotic arm, bounded by the joint limits and the maximum angular velocities.
A sample can be taken from this region randomly, but a number of
sampling techniques also exist that select the samples in a more directed way [4, 5, 26, 43]. It is common
practice to use the goal state as the new sample in a certain percentage of the iterations. This should only be done
as long as the goal is not yet included in the tree.
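The goal-biasing idea can be sketched as follows; the bias percentage and the state-space bounds below are illustrative values, not taken from the thesis.

```python
import math
import random

def sample_state(goal, bounds, goal_bias=0.05, goal_in_tree=False):
    """Uniform random sample from the bounded state space. With probability
    `goal_bias` the goal state itself is returned, but only while the goal
    is not yet part of the tree."""
    if not goal_in_tree and random.random() < goal_bias:
        return goal
    return tuple(random.uniform(lo, hi) for lo, hi in bounds)

# Illustrative bounds for a one dof arm: theta in [-pi/3, pi/3] rad,
# omega in [-5, 5] rad/s (the velocity bound is an assumed value).
bounds = [(-math.pi / 3, math.pi / 3), (-5.0, 5.0)]
x = sample_state((0.5, 0.0), bounds, goal_bias=0.1)
```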
Nearest Neighbors
When a new sample is taken, the nearest neighbors of this state have to be determined. Basically, two
approaches exist to determine the k nearest neighbors. The first approach is the most common: the distance
to each vertex in the tree is calculated and the vertices are sorted based on this distance. The second
approach can be useful when the tree consists of a large number of vertices. In that case an approximate
nearest neighbor method can be used; such a method does not calculate the distance to each vertex but
uses special techniques to approximate the k nearest neighbors. The first approach is the one used most
often.
In order to calculate the distance to the vertices of the tree, a distance metric is needed. Such a distance
metric should represent the length of the trajectory that is needed to go from one state to another. A
very basic distance metric is the Euclidean distance in state space, or a scaled Euclidean distance in which
a scaling is applied to the angular velocities or between the different dof. For relatively simple systems
such a distance metric can be a good representation of the cost-to-reach, but for more complex systems
it can be very challenging to create a good distance metric. In general the performance of the algorithm
increases substantially when a better distance metric is used [24].
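A scaled Euclidean metric of this kind can be sketched as below; the weights are illustrative and would have to be tuned for the system at hand.

```python
import math

def scaled_distance(x_a, x_b, w=(1.0, 0.1)):
    """Scaled Euclidean distance between two states (theta, omega).
    The weights w trade angle error against angular-velocity error;
    these values are illustrative, not taken from the thesis."""
    return math.sqrt(sum(wi * (a - b) ** 2
                         for wi, a, b in zip(w, x_a, x_b)))
```

With w = (1, 1) this reduces to the plain Euclidean distance; lowering the velocity weight makes states with similar angles but different velocities appear closer.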
Collision Checking
A collision checker is needed to check if the trajectories returned by the steering function are collision
free. Two types of collision checks have to be performed. The first check is on self collision; this can only
occur in articulated systems. The second check is on collision with obstacles. The collision checks can
be simplified by modeling the robot and the obstacles as simple geometrical volumes. Special techniques
exist to perform collision checks efficiently.
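For the one dof system used later in this thesis, collision checking reduces to limit checking. A minimal sketch (the angular-velocity bound is an assumed value, not taken from the thesis):

```python
import math

def in_free_space(x, theta_max=math.pi / 3, omega_max=5.0):
    """Validity check for the one dof arm: a state is 'collision free' when
    the joint angle and angular velocity stay within their limits
    (theta_max follows the joint limits of Figure 4.1; omega_max is an
    illustrative value)."""
    theta, omega = x
    return abs(theta) <= theta_max and abs(omega) <= omega_max

def edge_free(trajectory):
    """An edge is valid when every intermediate state along it is valid."""
    return all(in_free_space(x) for x in trajectory)
```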
3.2.5 Conclusion
The RRT* algorithm is an asymptotically-optimal version of the basic RRT algorithm, a well-known
algorithm that has existed for over fifteen years. The RRT* algorithm grows a tree structure in the search
space and this tree is optimized given a certain cost function. The root of the tree is located at the start
point and the tree is grown toward the goal. The algorithm does not stop when a feasible trajectory to
the goal is found, but continues optimizing the tree and the trajectory to the goal.
The algorithm consists of a number of sub-functions of which the steering function is the most crucial one.
This function is responsible for returning optimal trajectories between states. Next to that a sampling
function is needed, as well as a nearest neighbor procedure and a collision checker. For the nearest
neighbor procedure a distance metric is needed that represents the length of the trajectories between
states in the search space.
3.3 Steering Function
3.3.1 Introduction
Section 3.2 explained that the RRT* algorithm needs a so-called steering function. This is a function
that returns the optimal trajectory between two points in state space. The optimal trajectory can be
defined in different ways; minimum length and minimum energy consumption are possibilities, for example. In
this research the optimal trajectories are minimal-time trajectories. The steering function depends
on the system, since the system determines what control is needed to get from one state to another. In
this section different types of steering functions will be discussed, as well as different approaches to obtain
a steering function. Which steering function is used in this thesis is not discussed here, since
that depends on the systems for which the simulations will be done, and these systems have not yet been discussed.
The inventors of the RRT* algorithm have written a number of papers in each of which the algorithm
is used for different systems. The steering functions they used for these systems will be discussed first.
Later on more general approaches following from the field of optimal control will be discussed.
3.3.2 Steering Functions used by the Authors of the RRT* Algorithm
The authors of the RRT* algorithm published a number of articles [15, 17–20, 32] in which the RRT*
algorithm is tested on different problems, varying from simple systems like a double integrator to a high-speed off-road vehicle. These articles are a good starting point for selecting an appropriate method
to obtain a steering function.
In the earlier articles of Karaman and Frazzoli about the RRT* algorithm, it is used on relatively simple
systems [17] or on systems that are not discussed in detail [18, 19]. For relatively simple systems (the Dubins
car [9], the double integrator and a flying plane) they use analytical steering functions. The steering function for the Dubins car returns the shortest path between two states. The steering function for the
double integrator is obtained from optimal control theory [6], with the use of Pontryagin's Minimum
Principle [3]. The steering function for the plane is a combination of both the Dubins vehicle (for the
horizontal motion) and the double integrator (for the vertical motion). For the systems of which the
kinematics and dynamics are not discussed, a steering function is used that minimizes the path length
in Euclidean distance. This is done so that an analytical steering function can be used. The article on
manipulation planning for two six dof robotic arms [32] also describes this. The optimization criterion
for this manipulation planning is the path length. The dynamics of the system are not even included,
because this would require a numerical steering function.
During this thesis project a new article about the RRT* algorithm was published [15]. In this article a time-optimal motion is calculated for an off-road vehicle. The steering function uses a multiple-shooting algorithm [8] with piecewise constant inputs to find optimal trajectories. This is not an ideal
steering function, since the multiple shooting algorithm does not always find a feasible trajectory; moreover,
when a trajectory is found it is not always the optimal one. The authors of this article
state that it is not certain what the effect of this imperfect steering function is on the asymptotically-optimal
behavior of the algorithm. Thus research has to be done on the effect on the convergence of the RRT*
algorithm. The results for the off-road vehicle show that for this case the RRT* algorithm still finds (near) time-optimal trajectories.
This last article shows that for more complex dynamical systems (of which the RA is one), it can be
necessary to use a numerical steering function instead of an analytical one. That is why in the next
section an overview will be given of the numerical methods for optimal control.
3.3.3 Numerical Methods for Optimal Control
For simple systems, analytical solutions to the optimal control problem exist, but for more complex
dynamical systems numerical methods are needed. Figure 3.6 gives an overview of the most common numerical methods that can be used. These methods are: dynamic programming, indirect methods
and direct methods.
Dynamic Programming: this method breaks up the problem into smaller subproblems. In this way
a feedback controller can be computed recursively using the optimality of subarcs. These subarcs follow
from a tabulation of the state space. Due to this tabulation, the method suffers from the curse of dimensionality.
Indirect Methods: in these methods, necessary conditions for optimality are derived from the
calculus of variations or from the Pontryagin Minimum Principle [3]. The result is a Boundary
Value Problem (BVP), which can be solved by different techniques. This BVP can be hard to solve
because of the strong non-linearity and instability of the underlying differential equations. However, for
simple systems, an analytical solution to this problem may exist.
Figure 3.6: Overview of numerical methods for optimal control [8].
Direct Methods: these methods transform the optimal control problem into a Nonlinear Programming (NLP)
problem, which can be solved by numerical optimization methods. The most common methods are:
• Single Shooting: in this method the control is discretized, i.e. the control is either piecewise
constant or piecewise linear. First a guess of the control parameters is made, then the equations
of motion are integrated and the end point in state space is compared to the goal point. Based on
this difference a new guess of the control parameters is made. This process continues until a set of
control inputs is found that results in an endpoint sufficiently close to the goal point. Figure 3.7
gives an example of an intermediate result obtained by a single shooting algorithm.
• Collocation: contrary to single shooting, the optimization variables in the collocation method are
a parameterization of the state trajectory. The equations of motion are implemented as equality
constraints.
• Multiple Shooting: as with single shooting the control is divided into intervals. But with multiple
shooting the equations of motion are integrated separately on each interval. The initial states
for each interval are guessed. In order to guarantee continuity between the intervals, additional
constraints are needed. There are also constraints needed to guarantee that the initial and the final
state match the prescribed initial and final state. Figure 3.8 gives an example of the intermediate
result obtained by a multiple shooting algorithm.
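The single shooting idea can be illustrated on the double integrator: roll out the dynamics under a piecewise-constant control guess and measure the terminal error that an optimizer would drive to zero. A minimal Python sketch using explicit Euler integration (the inertia value and step size are illustrative):

```python
import math

def rollout(x0, controls, dt, i_tot=1.0):
    """Integrate the double-integrator dynamics x1' = x2, x2' = u / I
    under piecewise-constant control (explicit Euler, one step per segment)."""
    x1, x2 = x0
    for u in controls:
        x1 += x2 * dt
        x2 += (u / i_tot) * dt
    return x1, x2

def terminal_error(x0, x_goal, controls, dt):
    """Residual that a single-shooting optimizer would drive to zero by
    adjusting the control parameters."""
    xf = rollout(x0, controls, dt)
    return math.hypot(xf[0] - x_goal[0], xf[1] - x_goal[1])
```

An optimizer (gradient-based or otherwise) would repeatedly evaluate `terminal_error` and update the entries of `controls` until the residual is small enough.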
Figure 3.7: During single shooting the equations of motion of a system are integrated for a specific control
input (piecewise constant in this case). The error in the terminal constraint is used to calculate a new
control input and this process continues until the terminal constraint is met [8].
Figure 3.8: The result of a multiple shooting iteration in which the equations of motion of a system
are integrated for every time interval separately, starting from a guessed initial state and for a certain control
input. Since the trajectories of the separate time intervals do not connect to each other, constraints are needed
to guarantee that they do [8].
These are in general the numerical methods that can be used to solve the optimal control problem. With
the indirect methods the result is a boundary value problem that should be solved. This problem can
also be solved by one of the direct methods, so a combined approach is also possible.
3.3.4 Conclusion
The steering function finds optimal trajectories between two states in the search space. The performance
of the RRT* algorithm relies greatly on this function. Two kinds of steering functions exist: analytical and numerical. For the analytical steering function it has been proven that the RRT* algorithm is
asymptotically optimal. For the numerical steering function this is not the case, but the results suggest
that the trajectories found are also (near) optimal [15].
The numerical steering function can be obtained by different numerical methods. These methods are:
dynamic programming, indirect methods and direct methods. Also a combined approach of an indirect
and direct method can be used.
3.4 Conclusion
The RRT* algorithm is a motion planning algorithm that is capable of finding asymptotically-optimal
trajectories. This has been proven for systems for which an analytical steering function exists. The algorithm
grows a tree-like structure that is used to create a feasible trajectory between the start and the
goal. Both the tree structure and the trajectory between the start and goal are optimized by adding more
vertices to the tree and by rewiring vertices. The algorithm relies on a so-called steering function to connect
the vertices of the tree in an optimal way.
The steering function is the most important part of the algorithm, since without a good steering function
the algorithm will fail. Both analytical and numerical steering functions exist, but normally the analytical
ones give a far better performance. Next to the steering function, a sampling function, a nearest neighbor procedure and a collision checker are also needed. The performance of the algorithm relies on all these components.
The algorithm described in this chapter is implemented in MATLAB in order to find the time-optimal
trajectories of the Resonating Arm.
Chapter 4
Simulations and Results for the One Dof Linear System
4.1 Introduction
In this chapter the RRT* algorithm will be tested on a linear one dof system. An analytical steering
function exists for this system. For a system like this, the simulations should be relatively fast. The tests
will show whether the implementation of the RRT* as used in this research works correctly. The tests
will also check the convergence of the algorithm and evaluate the effect of the cutoff distance and the
number of nearest neighbors used.
The linear one dof system that is used in this chapter is basically the Resonating Arm without the lower
arm and without the spring mechanism. This way, the system is virtually a double integrator with a
known analytical steering function [11, 17]. Chapter 5 shows the time-optimal trajectories for the real
one dof RA. The time-optimal performance of the one dof RA without the spring mechanism can be
compared to the time-optimal performance of the real one dof RA.
The RRT* algorithm used in this chapter is the original RRT* algorithm as described in the previous
chapter. Real collision checking is not necessary, since a one dof system can in no way avoid any obstacles
presented to it. The collision checking for the one dof system is merely meant to make sure that the maximum
joint angles are not exceeded.
Section 4.2 describes the model of the one dof linear system. The analytical steering function for this
system is derived in Section 4.3. Section 4.4 describes how the simulations are performed
and which simulations are done. The results are presented in Section 4.5, followed
by the conclusion of the chapter in Section 4.6.
4.2 Model of the System
As described in the previous section, the linear model of the one dof Resonating Arm is used in this
chapter. The steering function of the RRT* algorithm uses the equations of motion following from this
linear model, so the model needs to be discussed. The linear model of the one
dof RA is basically a link with certain dimensions that is actuated by a DC motor. The friction in the
system is neglected, since it would make the model non-linear. The model of the DC motor is given
below, as well as a schematic drawing of the arm and the equations of motion.
4.2.1 Motor Model
A Maxon RE30 DC motor is used for the actuation of the upper arm as was already mentioned in Section
2.2. The simplest way to model this motor is by only the torque that is exerted on the arm. More complex motor models can contain the electric circuit of the motor or even the electronics of the amplifiers
that are used to provide the voltage and current to the motor. So a trade-off has to be made between
complexity and accuracy of the model.
The choice is made to incorporate the rotor inertia and the gearbox efficiency into the motor model but
to neglect the motor electronics. Consequently, the influence of the motor on the arm is that it provides
a torque and adds extra inertia. This guarantees that the system remains linear. The motor is connected
to the arm through a gearbox so the maximum torque the motor can exert on the RA can be calculated
with: Tmax = Tmax,motor · n · η. The reflected rotor inertia (effect of the rotor inertia at the joint) can be
calculated with Ir,r = Ir · n2 . The relevant motor parameters can be found in Table 4.1. The extra gear
ratio of 3:1 that is the result of the way the motor is connected to the joint of the upper arm, is already
incorporated in the gear ratio as is listed in the table.
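As a quick check, the two formulas above can be evaluated with the values from Table 4.1 (a Python sketch):

```python
# Maximum torque at the joint and reflected rotor inertia, using the
# parameters of Table 4.1.
t_max_motor = 0.085   # maximum motor torque [Nm]
n = 54                # gear ratio (3:1 belt stage included) [-]
eta = 0.87            # gearbox efficiency [-]
i_rotor = 33.5e-7     # rotor inertia [kg m^2]

t_max = t_max_motor * n * eta      # Tmax = Tmax,motor * n * eta
i_reflected = i_rotor * n ** 2     # Ir,r = Ir * n^2

print(round(t_max, 2))             # 3.99 Nm, matching the table
print(round(i_reflected, 4))       # about 0.0098 kg m^2
```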
Parameter                          Symbol          Value       Unit
Gear ratio                         n1              54          [-]
Gearbox efficiency                 η1              0.87        [-]
Rotor inertia                      Ir,1            33.5·10^-7  [kg m^2]
Maximum motor torque               Tmax,motor,1    0.085       [Nm]
Maximum motor torque at the joint  Tmax,1          3.99        [Nm]
Table 4.1: Relevant motor parameters
4.2.2 Equations of Motion
Parameter                  Symbol    Value         Unit
Mass                       m1        0.6           [kg]
Moment of inertia at com1  I1        0.3           [kg m^2]
Length                     l1        0.4           [m]
Relative location of com   lg1       0.5           [-]
Joint limits               θ1,max    [−π/3, π/3]   [rad]
Figure 4.1: Model of the one dof linear arm with the parameters of the system given in the table.
The model of the linear one dof arm is given in Figure 4.1. Figure 4.1a shows the positive direction of the
torque and angle as well as the labels of the endpoints. The actuator torque is Tm . Figure 4.1b contains
the parameters of the model, which are also listed in the table next to it. This model is used to derive
the equations of motion. The equations of motion are quite straightforward, since the acceleration is the
torque on the link divided by the total inertia of the link at the joint. The resulting equations of motion
are given below, where x1 is θ and x2 is ω (the angular velocity).
x˙1 = x2   (4.1)
x˙2 = Tm,1 / (m1 · l1^2 · lg1^2 + I1 + Ir,1 · n1^2)   (4.2)
This model is used to obtain a steering function that can be used to calculate the time-optimal trajectories between two points in the state space. The next section deals with obtaining this steering function.
1 center of mass
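The equations of motion above can be written directly as a state-derivative function; a Python sketch with the parameter values of Figure 4.1 and Table 4.1 (the thesis implementation is in MATLAB):

```python
def one_dof_dynamics(x, torque):
    """State derivative of the linear one dof arm (Eqs. 4.1 and 4.2):
    x = (theta, omega), input torque Tm,1 at the joint."""
    m1, l1, lg1 = 0.6, 0.4, 0.5          # mass [kg], length [m], rel. com [-]
    i1, i_rotor, n = 0.3, 33.5e-7, 54    # link/rotor inertia [kg m^2], gear ratio
    i_tot = m1 * l1**2 * lg1**2 + i1 + i_rotor * n**2
    theta_dot = x[1]                     # Eq. 4.1
    omega_dot = torque / i_tot           # Eq. 4.2
    return theta_dot, omega_dot
```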
4.3 Steering Function
4.3.1 Introduction
In order to obtain a time-optimal steering function, an optimal control problem has to be solved. In
Section 3.3 it was stated that this optimal control problem can be solved by different methods. These
methods were dynamic programming, indirect methods and direct methods. For the linear system of this
chapter, which is basically a double integrator, an indirect method is used: Pontryagin's Minimum
Principle (PMP). This results in an analytical solution to the optimal control problem.
4.3.2 Pontryagin's Minimum Principle
The procedure needed to solve the optimal control problem with PMP is given in Figure 4.2. First
the statement of the problem is needed. The plant (or model of the system) was already given in
Section 4.2, but it is used here in a slightly different form:

x˙1 = x2   (4.3)
x˙2 = u / Itot   (4.4)

with u = Tm,1 and Itot = m1 · l1^2 · lg1^2 + I1 + Ir,1 · n1^2
The performance criterion is time, so the performance index is:

J = S(x(tf ), tf ) + ∫[t0, tf] V(x(t), u(t), t) dt = ∫[t0, tf] 1 dt   (4.5)
and the boundary conditions are: x(t0 ) = x0 and x(tf ) = xf with tf is free.
Step 1
The Hamiltonian is:

H(x(t), u(t), λ(t), t) = 1 + λ1(t) x2(t) + λ2(t) u(t)/Itot   (4.6)

with λ1 and λ2 the two costates.
Step 2
The Hamiltonian should be minimized w.r.t. u(t):

min_{|u|≤Tmax} ( 1 + λ1(t) x2(t) + λ2(t) u(t)/Itot )  →  u*(t) = −sign(λ2*(t)) Tmax    (4.7)
Step 3
The optimal control u*(t) should be plugged back into equation 4.6 to find the optimal Hamiltonian:

H*(x*(t), u*(t), λ*(t), t) = 1 + λ1*(t) x2*(t) − (1/Itot) λ2*(t) sign(λ2*(t)) Tmax    (4.8)
Figure 4.2: Procedure to solve an optimal control problem with Pontryagin's Minimum Principle [31].
Step 4
The differential equations that have to be solved are now:

ẋ1*(t) = ∂H/∂λ1* = x2*                                       (4.9)
ẋ2*(t) = ∂H/∂λ2* = −(Tmax/Itot) sign(λ2*(t))                 (4.10)
λ̇1*(t) = −∂H/∂x1* = 0                                        (4.11)
λ̇2*(t) = −∂H/∂x2* = −λ1*(t)                                  (4.12)
Since λ̇1* is zero, λ1* is constant and the costates are:

λ1*(t) = λ1*(0)                                              (4.13)
λ2*(t) = λ2*(0) + λ1*(0) t                                   (4.14)
These equations show that λ∗2 is a straight line. Therefore λ∗2 can only cross the zero line once, which
means the control signal u can only switch once between maximum positive and negative torque.
This is in accordance with Theorem 7.3 of the book Optimal Control Systems [31], which states that the maximum number of switches for a normal control system is (n − 1), with n the order of the system, for the situation where all the eigenvalues of the system are real. Since that is true in this case, the maximum number of switches is
indeed one. This result will be used to derive a time-optimal steering function. The procedure given in
Figure 4.2 will not be used any further, since this will lead to a control rule while a description of the
time-optimal trajectories is needed.
4.3.3 Time-Optimal Trajectories
The result of the previous section was that the maximum number of switches between maximum positive and negative torque is one. This means that there are four possible time-optimal control sequences: −, +, − + and + −. In order to find the optimal control for a specific problem, first the maximum positive and negative torque trajectories in state space have to be calculated. For a given initial state:

x1(0) = θ0                                                   (4.16)
x2(0) = ω0                                                   (4.17)
The state equations become:

x1(t) = θ0 + ω0 t + (1/(2 Itot)) u* t²                       (4.18)
x2(t) = ω0 + (u*/Itot) t                                     (4.19)
Now equation 4.19 can be rewritten and plugged into equation 4.18, which results in:

t = Itot (x2 − ω0) / u*                                      (4.20)
x1 = θ0 − Itot ω0²/(2u*) + Itot x2²/(2u*)                    (4.21)
The result of this is an equation that describes the forced trajectories for input u∗ . Since u∗ is either
+Tmax or -Tmax this gives two forced trajectories. The forced trajectories for two random initial states
can be seen in Figure 4.3a. The arrows indicate the direction of the trajectories. These trajectories show
how to get from one state to another. In Figure 4.3b the magenta trajectory shows how to get from the
left state to the right one by first following the maximum positive torque trajectory from the left state and then switching to the maximum negative torque trajectory toward the right state. For this trajectory one
switch from maximum positive to maximum negative torque is needed.
Figure 4.3: a) The maximum positive (blue) and negative (green) torque trajectories through state x1. b) The maximum positive (blue) and negative (green) torque trajectories through states x1 and x2. The trajectory from x1 to x2 is the magenta line, whereby first the maximum positive torque trajectory from x1 is followed and then the maximum negative torque trajectory of x2.
For almost all combinations of states one switch in torque is necessary, except for states that are lying on the same trajectory. When one switch is needed, there are two possible switch patterns: + − and − +. The point at which the torque has to be switched can be calculated as follows. First the maximum positive torque trajectory of the start state is set equal to the maximum negative torque trajectory of the goal state. This gives equation 4.22. Also for the other possible switch sequence the equations of the trajectories are equated, which results in equation 4.24.
For first u+ and then u− (writing I for Itot):

θ0 − (1/2) ω0² I/Tmax + (1/2) x2² I/Tmax = θ1 + (1/2) ω1² I/Tmax − (1/2) x2² I/Tmax    (4.22)

x2 = ±√( (θ1 − θ0) Tmax/I + (ω0² + ω1²)/2 )                  (4.23)
For first u− and then u+:

θ0 + (1/2) ω0² I/Tmax − (1/2) x2² I/Tmax = θ1 − (1/2) ω1² I/Tmax + (1/2) x2² I/Tmax    (4.24)

x2 = ±√( (θ0 − θ1) Tmax/I + (ω0² + ω1²)/2 )                  (4.25)
Here x2 is the angular velocity at the switch point. If the solution for x2 has an imaginary part, the trajectories do not cross each other. Since x2 can be both positive and negative, there are in theory four
trajectories possible. These trajectories can be found by plugging x2 into equation 4.20. This gives the following equations for the time length of the two segments of the + − trajectory:

t1 = (x2 − ω0) I/Tmax                                        (4.26)
t2 = (ω1 − x2) I/(−Tmax)                                     (4.27)
These equations should be solved for both the negative and the positive x2. If the result is a positive t1 and t2, then this is a feasible trajectory. If one of them (or both) is negative, then one (or two) of the forced trajectories would have to be followed in the opposite direction. Since this is not possible, it is not a feasible trajectory. The same procedure can be done for the − + trajectory, which gives a maximum of four trajectories. The one with the lowest cost (time) is selected.
4.3.4 The Time-Optimal Trajectory for the One Dof Linear System
The result obtained in the previous section was a procedure to obtain time-optimal trajectories between
two states, which is in fact the steering function that will be used in the remainder of this chapter. This
steering function can also be used directly, without the RRT* algorithm, to find the time-optimal trajectory, since no obstacles are included in this system. The analytical time-optimal trajectory for the one
dof linear robotic arm between the start and goal states, which are [-0.88, 0] and [0.88, 0] respectively, is
given in Figure 4.4. This figure shows that the torque is switched from +Tmax to -Tmax when the angle
of the link is 0 [rad]. The total time to reach the goal state is 0.3353 [s].
Figure 4.4: The analytical time-optimal trajectory for the linear one dof arm from [-0.88, 0] to [0.88, 0]
in state space.
4.3.5 Conclusion
The time-optimal control for the one dof linear robotic arm was derived with Pontryagin's Minimum Principle. From this principle it followed that the time-optimal control is ±Tmax and that this control is switched at most once between −Tmax and +Tmax. This time-optimal control is implemented in a steering function that calculates the time-optimal trajectory between two states. With this steering function the time-optimal trajectory between the start and goal state was found. This trajectory is symmetrical and has a time length of 0.3353 [s]. This result can be compared with the best trajectory found with the RRT* algorithm.
4.4 Method

4.4.1 Introduction
As explained in the introduction of this chapter, the goal of the simulations done in this chapter is to gain
knowledge about the working of the RRT* and to test its implementation possibilities. The trajectory
that should be optimized is the one between the pick and place locations. The coordinates of these
locations are [-0.88, 0] and [0.88, 0]. These are the two outer stable equilibrium points of the spring
mechanism.
4.4.2 General Settings of the RRT* Algorithm
Before the RRT* algorithm is used for the simulations, first some specific details have to be discussed.
Two modifications to the basic RRT* algorithm are made.
The first modification is the implementation of a so-called cutoff distance. The cutoff distance does not allow connection attempts between vertices that are further away from each other than this distance. This prevents the algorithm from connecting the start and goal state directly with one edge, since this is possible for a simple problem without obstacles. In that case it is not the RRT* algorithm that finds the optimal trajectory, but rather the steering function. A cutoff distance (smaller than the distance between the start and goal state) makes sure that the working of the RRT* algorithm can be evaluated.
The second modification is that the sampling procedure does not always return a random new state. For a certain percentage of the time the goal state is used as the new state. This only happens while the goal is not yet connected to the tree. This is a common addition which makes the algorithm more greedy: it focuses more on finding a trajectory to the goal than on exploring the workspace. For workspaces that are relatively easy, this addition speeds up the process of finding a trajectory to the goal. For 10% of the time, the goal state is chosen as the new state.
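The two modifications can be sketched as follows; the helper names and the rng argument are illustrative, not the thesis code.

```python
import random

GOAL_BIAS = 0.10   # fraction of samples that return the goal state
CUTOFF = 0.5       # maximum distance for a connection attempt

def sample_state(domain, goal, goal_connected, rng=random):
    """Goal-biased sampling: while the goal is not yet connected to the
    tree, return the goal state 10% of the time; otherwise draw a uniform
    sample from the rectangular sampling domain."""
    if not goal_connected and rng.random() < GOAL_BIAS:
        return goal
    (th_lo, th_hi), (om_lo, om_hi) = domain
    return (rng.uniform(th_lo, th_hi), rng.uniform(om_lo, om_hi))

def may_connect(dist):
    """Cutoff rule: reject connection attempts between vertices that are
    further apart than the cutoff distance."""
    return dist <= CUTOFF
```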
Apart from these two modifications, the distance metric that is used is also important. The metric used is the scaled Euclidean distance in state space:

dis = √( (θ1,2 − θ1,1)² + (0.1(ω1,2 − ω1,1))² )              (4.28)
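Equation 4.28 translates into a one-line function; a sketch, assuming states are (θ, ω) tuples and using the 0.1 weight from the equation:

```python
import math

def distance(a, b, w_scale=0.1):
    """Scaled Euclidean distance in state space, Eq. (4.28): the angular
    velocity difference is weighted by 0.1 so both state components
    contribute on a comparable scale."""
    return math.hypot(b[0] - a[0], w_scale * (b[1] - a[1]))
```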
Last, the sampling region is also important. The joint limits that bound the sampling region in the angle dimension are known, but limits on the angular velocity are also needed. For the sake of simplicity a rectangular sampling region is used in which the maximum and minimum angular velocity are chosen to be the same for all angles. The maximum angular velocity during the optimal trajectory was about 10.5 [rad/s], so a slightly higher limit of 12.5 [rad/s] is used. This gives a sampling region of:

Dsample = [(−π/3, π/3), (−12.5, 12.5)]                       (4.29)

4.4.3 Tests
First some preliminary tests were done to get an idea of how the algorithm works and which settings are suitable. Then the first logical test is to check whether the algorithm does what it should do and finds the time-optimal trajectory. The simulations for this simple linear system are probably relatively fast compared to those of the more complex one and two dof Resonating Arm. For the more complex systems it is more important to choose a good parameter setting, because in that way better and faster results will be obtained. This is why the influence of the cutoff distance and the number of nearest neighbors on the performance of the RRT* algorithm is tested.
Convergence of the RRT* Algorithm
The convergence of the optimal solution found is first checked by running the algorithm for a high number
of iterations. The optimal trajectory found is then compared to the analytical time-optimal trajectory.
After that, ten consecutive runs of the algorithm with the same parameter settings are done to check whether the convergence is comparable between runs.
1. Check if the RRT* algorithm finds the time-optimal trajectory

Settings                           Value
Start                              (-0.88, 0)
Goal                               (0.88, 0)
RRT* algorithm:
  # iterations                     10000
  # nearest neighbors              200
  cutoff distance                  0.5
  sample goal                      10%
  sample domain                    [(−π/3, π/3), (−12.5, 12.5)]
2. Compare convergence between individual runs

Settings                           Value
Start                              (-0.88, 0)
Goal                               (0.88, 0)
# Runs                             10
RRT* algorithm:
  # iterations                     5000
  # nearest neighbors              200
  cutoff distance                  0.5
  sample goal                      10%
  sample domain                    [(−π/3, π/3), (−12.5, 12.5)]
Influence of the Cutoff Distance
A small cutoff distance leads, especially in the beginning when the tree does not contain much vertexes,
to less connection attempts. The connection attempts between vertices further away from each other are
normally more likely to fail.This can actually be a way to speed up the algorithm. On the other hand,
the connection of vertexes that are far away can be very valuable for the optimization of the tree. Thus
there is a trade-off between these two effects. The influence of the cutoff distance is checked by doing
simulations for three different cutoff distances. For each cutoff distance ten runs of the simulation are
done to rule out the effects of randomness.
Settings                           Value
Start                              (-0.88, 0)
Goal                               (0.88, 0)
# Runs                             10
RRT* algorithm:
  # iterations                     5000
  # nearest neighbors              200
  cutoff distance                  1 / 0.5 / 0.2
  sample goal                      10%
  sample domain                    [(−π/3, π/3), (−12.5, 12.5)]
Influence of Number of Nearest Neighbors
The effect of too few nearest neighbors is a slow convergence, because the tree is not optimized due to few new connections. Too many nearest neighbors can make the iteration steps very slow. It is therefore important to investigate this effect. It is tested by running the algorithm for three different numbers of nearest neighbors: 100, 200 and 500. Again ten runs of the simulation are done to rule out the effects of randomness.
Settings                           Value
Start                              (-0.88, 0)
Goal                               (0.88, 0)
# Runs                             10
RRT* algorithm:
  # iterations                     5000
  # nearest neighbors              100 / 200 / 500
  cutoff distance                  0.5
  sample goal                      10%
  sample domain                    [(−π/3, π/3), (−12.5, 12.5)]

4.4.4 Conclusion
A cutoff distance is used in the RRT* algorithm so that the algorithm will not find the time-optimal trajectory right away. When that happens, the working of the RRT* algorithm cannot be inspected. The RRT* algorithm will be tested on three different aspects. First a test will be performed to check if the best trajectory found by the RRT* algorithm converges to the analytical time-optimal one. It is also checked whether this convergence is comparable between different runs. Next to that, tests will be performed to inspect the performance with different settings of the cutoff distance and the number of nearest neighbors.
4.5 Results of the Simulations

4.5.1 Introduction
In this section the results of the simulations are given. The order of the results is the same as that of
the different tests that were introduced in the previous section. First the results are given that show the
convergence of the best solution found by the RRT* algorithm. After that the effect of the cutoff distance
will be discussed and finally, the results with the different numbers of nearest neighbors are given.
4.5.2 Convergence of the RRT* Algorithm
The convergence of the RRT* algorithm is first inspected by evaluating how the trajectories found by the
RRT* algorithm evolve during the iterations. Figure 4.5 shows that the trajectory after 100 iterations
is quite inefficient, with forward and backward motions. The plots however show that the trajectory approaches the analytical optimal trajectory better when the number of iterations increases. The trajectory found by the RRT* algorithm after 1000 iterations is already quite similar to the analytical optimal trajectory. The trajectory that is found thus converges toward the optimal trajectory. The question that is answered next is how closely the trajectory found by the RRT* algorithm approaches the analytical optimal trajectory.
Figure 4.5: Convergence of time-optimal trajectories between the start and goal state found with the
RRT* algorithm.
This is inspected by a simulation of which the result is given in Figure 4.6. This figure shows that the
solution found by the RRT* algorithm converges to 0.3379 [s]. This is a difference of only 0.78 % with
respect to the analytical optimal solution. Thus it can be concluded that the solution found by the RRT*
algorithm converges to the optimal solution within limited time.
Figure 4.6: The time-optimal solution found by the RRT* algorithm (blue line) converges toward the
analytical time-optimal trajectory (red line).
Another important aspect of the convergence is whether it is comparable between different runs. The growth of the tree is random and therefore big differences between individual runs can occur. Figure 4.7 shows the spread for ten different runs of the algorithm. Initially the spread is quite substantial: at 100 [s] the greatest deviation from the average is 3.2%. When the solution found converges further toward the analytical optimum, this spread becomes smaller. At 900 [s] the greatest deviation from the average is only 0.5%. So the spread between different runs can be quite big initially, but it decreases quickly when the solutions converge toward the optimal solution.
Figure 4.7: Spread in convergence between ten independent runs of the RRT* algorithm with the same
settings. The left plot shows the results of the individual runs. The right plot contains boxplots that
show the spread at nine different points.
4.5.3 Effect of Cutoff Distance
The cutoff distance limits the length of the trajectories between individual states. Figure 4.8 shows the
influence of the cutoff distance on the convergence of the RRT* algorithm. These results show that a
greater cutoff distance results in a faster convergence of the best solution found by the algorithm. Thus
in this case a smaller cutoff distance has a negative effect on the convergence.
Figure 4.8: Comparison of the best trajectories found for three different cutoff distances. The left plot shows ten runs of the RRT* algorithm with a cutoff distance of 0.2. The cutoff distance that belongs to the middle plot is 0.5 and the cutoff distance used for the simulations of the right plot is 1.0.
4.5.4 Effect of Number of Nearest Neighbors
When a new vertex is added to the tree, attempts are made to connect this new vertex to the tree in a way that keeps the cost as low as possible. For the k nearest neighbors that are within the cutoff distance, it is tried which connection results in the lowest cost. When this vertex is added to the tree, it is checked whether the k nearest neighbors that are within the cutoff distance can be reached at lower cost through this new vertex. So the calculation times increase when more nearest neighbors are checked, but the tree might be optimized faster when more nearest neighbors are checked. Figure 4.9 shows the average time of the best trajectories found with three different settings of the number of nearest neighbors (100, 200 and 500). The difference between the average times of the best trajectories found for the three different settings is very small. The maximum difference is about 0.5%, which is smaller than the spread between ten runs with the same nearest-neighbor setting. The performance of the algorithm is virtually the same for these three settings. At almost all points in time the algorithm with 200 nearest neighbors performs best and the one with 500 nearest neighbors has the second best performance. The only exception is at 50 [s], where the algorithm with 500 nearest neighbors performs better. 200 nearest neighbors is therefore probably the best setting to use for this system with the current settings of the RRT* algorithm.
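The two nearest-neighbor passes just described can be sketched as follows. This is a minimal illustration, assuming states are hashable tuples, vertices maps each state to a (parent, cost-to-come) pair, and dist and steer are supplied callables (steer returning the trajectory time or None when no connection exists); none of these names come from the thesis code.

```python
def choose_parent_and_rewire(vertices, x_new, k, cutoff, dist, steer):
    """Pass 1: attach x_new via the neighbour giving the lowest cost-to-come.
    Pass 2 (rewire): reroute neighbours through x_new when that is cheaper."""
    near = sorted(vertices, key=lambda v: dist(v, x_new))[:k]
    near = [v for v in near if dist(v, x_new) <= cutoff]
    parent, best = None, float("inf")
    for v in near:                                  # pass 1: choose parent
        t = steer(v, x_new)
        if t is not None and vertices[v][1] + t < best:
            parent, best = v, vertices[v][1] + t
    if parent is None:
        return False                                # x_new could not be connected
    vertices[x_new] = (parent, best)
    for v in near:                                  # pass 2: rewire neighbours
        t = steer(x_new, v)
        if t is not None and best + t < vertices[v][1]:
            vertices[v] = (x_new, best + t)
    return True
```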
Figure 4.9: Comparison of the average time of the best trajectories at six points in time during the
simulations. This is done for three different numbers of nearest neighbors: 100, 200 and 500.
4.5.5 Conclusion
The results in this section showed that the RRT* algorithm is indeed capable of finding time-optimal trajectories. The best solution that is found converges to the analytical time-optimal solution. Due to the random selection of new vertices there is a spread between the results of individual runs. This spread is initially in the range of a few percent, but decreases when the simulation time increases and the best trajectories found come closer to the analytical time-optimal trajectory.
The efficiency of the algorithm depends on the cutoff distance. If the algorithm is allowed to take greater steps, the solution converges faster toward the optimal solution and thus fewer iterations are needed. This is the case for this system in a workspace in which no obstacles are present. If there are obstacles, too big a cutoff distance would probably also have a negative effect. The number of nearest neighbors does not have much influence on the convergence of the best solution found by the algorithm.
4.6 Conclusion
The steering function that was derived in Section 4.3 resulted in a bang-bang control with at most one switch. This steering function was used 'stand-alone' to obtain the analytical time-optimal trajectory for the one dof linear robotic arm. This resulted in a symmetrical trajectory in which the robotic arm rotated with maximal acceleration until 0 [rad] was reached. After that the arm was decelerated maximally. This resulted in a trajectory of 0.3353 [s].
The aforementioned steering function was also used within the RRT* algorithm, with the only difference being an implemented cutoff distance. Without this cutoff distance the algorithm would find the time-optimal trajectory immediately, without optimizing the tree structure that is maintained by the algorithm. Since testing the RRT* algorithm is the goal of this chapter, it is of course necessary to check how the tree is optimized and how the trajectory is optimized.
The trajectories found using the RRT* algorithm converged toward the optimal one within a couple of minutes. The performance of the algorithm is certainly good enough to proceed to the simulations with the (non-linear) one dof Resonating Arm. The code of the algorithm itself was not even optimized, so the calculation time can still be improved. However, the algorithm is probably not fast enough to find the optimal trajectories online, since in that case the calculation time would be one second at maximum.
Chapter 5
Simulations and Results for the One Dof Resonating Arm

5.1 Introduction
The results of the previous chapter were good enough to proceed with the simulations for the (non-linear)
one dof Resonating Arm. The goal of this chapter is to obtain the time-optimal trajectories for this system, so that they can be compared with the system without the spring mechanism. Another goal is to
find a way to obtain a steering function that gives good results for this simple non-linear system in order
to use this steering function for the more complex two dof Resonating Arm.
The original RRT* algorithm without collision checking (except for the maximum joint angle) is used in
this chapter, but it is modified to work with the steering functions that are used in this chapter. It will
be explained later on what these modifications are.
In Section 5.2 the model of the one dof Resonating Arm is described. In Section 5.3 an attempt is made to derive an analytical steering function. Since no analytical steering function could be derived, an approximated analytical steering function is used for the simulations in Section 5.4. Another approach, in which a numerical steering function is used for the simulations, is followed in Section 5.5. This chapter ends with a conclusion in Section 5.6, where the results of the two different steering functions are compared.
5.2 Model of the One Dof Resonating Arm
In this section the model of the system is discussed, since it is used to obtain the steering function for this system. The one dof Resonating Arm is non-linear due to the parallel spring mechanism which is incorporated in the model of the one dof RA. In addition, friction is included, which makes the system even more non-linear. In this section the friction and the motor model are discussed first. After that, the model of the arm will be discussed and at the end of this section the equations of motion that follow from these models will be given.
5.2.1 Friction Model
In order to have a realistic model of the Resonating Arm, friction should also be included. The types of friction that are modeled are Coulomb friction and viscous friction. The total friction torque is the sum of these two types of friction. The Coulomb friction (Tf,c) and viscous friction (Tf,v) were determined by means of system identification performed by M.C. Plooij [33]. The total friction torque is given by:

Tf,total = Tf,c + Tf,v                                       (5.1)

Tf,c = −(Ts + Tm)            if θ̇ = 0 and |Ts + Tm| ≤ fc
     = −sign(Ts + Tm) · fc   if θ̇ = 0 and |Ts + Tm| > fc     (5.2)
     = −sign(θ̇) · fc         if θ̇ ≠ 0

Tf,v = −fv · θ̇                                               (5.3)

with:
Ts = torque of the spring mechanism
Tm = torque of the actuator
This shows that the Coulomb friction is opposite to the direction of movement when the velocity of the joint is non-zero. When the velocity is zero, the Coulomb friction is equal in magnitude and opposite in direction to the sum of the torques of the actuator and the spring mechanism. Thereby the magnitude of the Coulomb friction is at most fc. The viscous friction is opposite to the direction of movement and depends linearly on the speed. The values of the friction coefficients for the joint are given in Table 5.1.
Joint   Friction (Coefficient)   Value
1       fc,1                     0.4793 [Nm]
        fv,1                     0.0003 [Nm · s/rad]

Table 5.1: Friction coefficients of the Resonating Arm determined by means of system identification.
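Equations (5.1)-(5.3) translate directly into code. A sketch follows, with the identified coefficients of Table 5.1 as defaults; the function name is illustrative.

```python
import math

def friction_torque(theta_dot, T_s, T_m, f_c=0.4793, f_v=0.0003):
    """Total friction torque, Eqs. (5.1)-(5.3): Coulomb plus viscous."""
    if theta_dot == 0.0:
        applied = T_s + T_m
        if abs(applied) <= f_c:
            T_fc = -applied                       # sticking: cancels the applied torque
        else:
            T_fc = -math.copysign(f_c, applied)   # about to move: full Coulomb torque
    else:
        T_fc = -math.copysign(f_c, theta_dot)     # moving: opposes the motion
    T_fv = -f_v * theta_dot                       # viscous part, Eq. (5.3)
    return T_fc + T_fv
```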
5.2.2 Motor Model
For the actuation of the one dof Resonating Arm a Maxon RE30 motor is used as was already mentioned
in Section 2.2. The model of this motor is already given in the previous chapter in Table 4.1.
5.2.3 Equations of Motion
The model of the RA with one dof is given in Figure 5.1. This figure contains the positive direction of the torque and the angle, and the parameters of the model, which are also listed in the table. The torque (T1) is the sum of the torques of the spring mechanism, the actuator and the friction.
This model was used by M.C. Plooij [33] to derive the equations of motion. These equations were not derived as part of this thesis, but are taken over from the work of Plooij [33]. For the sake of completeness these equations are given below. The variables x1 and x2 are the angle (θ1) and the angular velocity (ω1) of the system respectively.

ẋ1 = x2                                                      (5.4)
ẋ2 = (Ts + Tm,1 + Tf,1) / (m1·l1²·lg1² + I1 + Ir1·n1²)       (5.5)

These equations of motion seem to be linear, but this is not the case, since the torque of the spring mechanism Ts is dependent on the angle θ1 and the friction torque is dependent on θ1 as well as on ω1 and Tm,1.
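As an illustration of how these equations of motion can be integrated numerically, a forward-Euler sketch is given below. The spring law Ts(θ) is not restated in this section, so spring_torque and friction_torque are assumed callables; all names are illustrative.

```python
def simulate(x0, T_m, t_end, dt, I_tot, spring_torque, friction_torque):
    """Forward-Euler integration of Eqs. (5.4)-(5.5) from state x0 = (theta, omega)
    under a constant actuator torque T_m."""
    theta, omega = x0
    t = 0.0
    while t < t_end:
        T_s = spring_torque(theta)
        T_f = friction_torque(omega, T_s, T_m)
        alpha = (T_s + T_m + T_f) / I_tot    # Eq. (5.5)
        theta += omega * dt                  # Eq. (5.4)
        omega += alpha * dt
        t += dt
    return theta, omega
```

With the spring and friction terms set to zero and unit inertia, this reduces to the double integrator of Chapter 4, which gives an easy sanity check.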
5.2.4 Conclusion
The model of the one dof Resonating Arm was discussed in this section. It was explained how the motor
and friction are modeled and how the arm itself is modeled. The inertia of this system is the same as
that of the one dof linear robotic arm, but the total equations of motion are quite different due to the
spring mechanism.
Parameter                        Symbol    Value
Mass                             m1        0.6 [kg]
Mass moment of inertia at com    I1        0.3 [kgm²]
Length                           l1        0.4 [m]
Relative location of com         lg1       0.5 [-]
Maximum range                    θmax,1    [−π/3, π/3] [rad]

Figure 5.1: Kinematic model of the 1 dof Resonating Arm with the parameters of the model given in the table.
5.3 Steering Function

5.3.1 Introduction
The equations of motion of the previous section are used to derive a steering function. The approach of the previous chapter resulted in an analytical steering function. This section follows the same (indirect) approach with Pontryagin's Minimum Principle.
5.3.2 Pontryagin's Minimum Principle
The same steps as in the previous chapter are taken to solve the optimal control problem with Pontryagin's Minimum Principle (PMP). The problem description is the same as in Section 4.3.2 except for the model of the system. The system is simplified by assuming that x2 ≠ 0, because in that way the friction torque is only dependent on x2. The system now becomes:

ẋ1 = x2                                                      (5.6)
ẋ2 = (u + F(x1(t), x2(t))) / Itot                            (5.7)

with u = Tm,1, F(x1(t), x2(t)) = Ts + Tf and Itot = m1·l1²·lg1² + I1 + Ir1·n1²
Step 1
The Hamiltonian is:

H(x(t), u(t), λ(t), t) = 1 + λ1(t) x2(t) + λ2(t) (u(t) + F(x1(t), x2(t))) / Itot    (5.8)

with λ1 and λ2 the two costates.
with λ1 and λ2 two costates
Step 2
The Hamiltonian should be minimized w.r.t. x(t):
min (1 + λ1 (t)x2 (t) + λ2 (t)
|u|≤Tmax
u(t) + F (x1 (t), x2 (t))
) → u∗ (t) = −sign(λ∗2 (t))Tmax
Itot
38
(5.9)
Step 3
The optimal control u*(t) should be plugged back into equation 5.8 to find the optimal Hamiltonian:

H*(x*(t), u*(t), λ*(t), t) = 1 + λ1*(t) x2*(t) + λ2*(t) (−sign(λ2*(t)) Tmax + F(x1*(t), x2*(t))) / Itot    (5.10)
Step 4
The differential equations that have to be solved are now:

ẋ1*(t) = ∂H/∂λ1* = x2*                                                       (5.11)
ẋ2*(t) = ∂H/∂λ2* = (−sign(λ2*(t)) Tmax + F(x1*(t), x2*(t))) / Itot           (5.12)
λ̇1*(t) = −∂H/∂x1* = −(∂F/∂x1*(t)) λ2*(t)/Itot                                (5.13)
λ̇2*(t) = −∂H/∂x2* = −λ1*(t) − (∂F/∂x2*(t)) λ2*(t)/Itot                       (5.14)
These costate equations cannot be solved directly as was the case for the linear one dof system. So the boundary conditions are needed to solve this optimal control problem. For free final time and fixed final state these boundary conditions are:

x(t0) = x0                                                                   (5.15)
x(tf) = xf                                                                   (5.16)
0 = H* + ∂S/∂t = 1 + λ1(tf) x2(tf) + λ2(tf) (F(x1(tf), x2(tf)) − sign(λ2(tf)) Tmax) / Itot    (5.17)
Inspection of the costate equations and boundary conditions now shows that the optimal control is of the bang-bang type, because no singular control can exist. For singular control λ2 should be zero for some time interval. This implies that λ̇2 = 0. If λ2 = 0, then λ̇1 = 0, and for λ̇2 to be zero λ1 also has to be zero. If the costates are both zero and their derivatives as well, they cannot change to another value anymore, so λ1(tf) and λ2(tf) are both zero. If this is plugged into the boundary condition, this yields 0 = 1. Thus singular control is not possible for this system.
In order to solve this optimal control problem the state equations together with the costate equations should be solved while satisfying the boundary conditions. This cannot be done analytically because of the sign function that appears in the state equation. Therefore this problem should be solved numerically with an iteration scheme like that given in Algorithm 3.

Algorithm 3 Iteration scheme to solve the Boundary Value Problem that arises from the PMP formulation [31]
1: Assume an initial value for λ*(0)
2: Compute the costate(s) λ*(t)
3: Use the costate(s) to evaluate the control u*(t)
4: Use the control u*(t) to integrate the equations of motion
5: Monitor x*(t) and find if there is a tf such that x*(tf) is equal to the fixed final state; otherwise return to 1
This optimization scheme can be used to find the optimal control, but it is not very intuitive. Since it was concluded that the optimal control must be of the bang-bang type, the switch time(s) of the bang-bang controller can also be optimized. This comes down to the same thing, but is more intuitive. Since the maximum actuator torque is about two times the maximum torque of the spring mechanism, it is not necessary to move forward and backward more than once to reach any state within the state space of the one dof Resonating Arm. So only one switch time and the final time have to be optimized. This approach will be indicated as the numerical steering function.
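A minimal sketch of this switch-time optimization is given below: a coarse grid search over the switch time and the final time for the + then − sequence, integrating the dynamics with forward Euler. A full version would also try − then + and refine the grid; the function name, the grid search itself, and the tolerance are illustrative, not the thesis implementation.

```python
import itertools

def numerical_steer(x0, x_goal, accel, T_max, t_sw_grid, t_f_grid,
                    dt=1e-3, tol=0.05):
    """For every (t_switch, t_final) pair, apply +T_max before the switch
    and -T_max after it, integrate the dynamics, and keep the fastest pair
    whose end state lands within tol of the goal.
    accel(theta, omega, u) must return the angular acceleration."""
    best = None
    for t_sw, t_f in itertools.product(t_sw_grid, t_f_grid):
        if t_sw > t_f:
            continue
        th, om, t = x0[0], x0[1], 0.0
        while t < t_f:
            u = T_max if t < t_sw else -T_max   # bang-bang with one switch
            th += om * dt
            om += accel(th, om, u) * dt
            t += dt
        err = ((th - x_goal[0]) ** 2 + (om - x_goal[1]) ** 2) ** 0.5
        if err < tol and (best is None or t_f < best[1]):
            best = (t_sw, t_f)
    return best
```

For the double integrator (accel simply returning u), the rest-to-rest motion from (0, 0) to (1, 0) with T_max = 1 is recovered with the switch at 1 [s] and a final time of 2 [s], as expected from the analytical result of Chapter 4.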
Next to this numerical approach to obtain a steering function, an analytical one can also be followed, because this is probably much faster. This can be done by approximating the equations of motion by assuming that the friction torque and the torque of the spring mechanism are constant for short trajectories. This is a kind of linearization of the equations of motion, so the final trajectory that is obtained should be fed to a trajectory tracker to convert the trajectory that was found into one that can be used for the one dof Resonating Arm.
Both these approaches will be implemented and the results will be compared to learn which approach is better. In the next section, the time-optimal trajectory for this system will first be obtained with the numerical steering function.
5.3.3 Time-Optimal Trajectory
Similar to the one dof linear system, the time-optimal trajectory for the workspace without obstacles can be found by using the steering function 'stand-alone'. This is done with the numerical steering function, which will be discussed in detail in Section 5.5. The time-optimal trajectory from [-0.88, 0] to [0.88, 0] is given in Figure 5.2. The time of this trajectory is 0.3290 [s]. What stands out is that the trajectory is not symmetric, as it was for the linear system. The point at which the control is switched from maximal positive to maximal negative torque is around 0.1 [rad], after 0.1832 [s]. The reason for this is the friction, which always acts opposite to the direction of motion.
Figure 5.2: Time-optimal trajectory in state space for the one dof RA.
5.3.4 Conclusion
This section showed that it is not possible to derive an analytical steering function for the one dof Resonating Arm. However, it was found with Pontryagin's Minimum Principle that the time-optimal control is of the bang-bang type. The number of switches did not follow from PMP, but it was concluded that only one switch is needed, since the maximum actuator torque is about two times as big as the maximum torque of the spring mechanism.
Given that the time-optimal control is of the bang-bang type with one switch, two approaches to obtain a steering function were proposed: an approximated analytical steering function and a numerical steering function. These two steering functions will be discussed in the next sections.
The numerical steering function was already used in this section to obtain the time-optimal trajectory
between the start and goal state. The trajectory that was found was not symmetrical due to the friction
in the system. The duration of the trajectory was 0.3290 [s]. This is faster than the time-optimal trajectory for the one dof linear system (0.3353 [s]), which is the RA without the friction and the spring mechanism. So the spring mechanism results in faster movements, even though friction is not included in the model of the one dof linear arm.
5.4
Approximated Analytical Steering Function
5.4.1
Introduction
In the previous section it was briefly explained what is meant by an approximated analytical steering function. Basically, this is an analytical steering function that uses approximated equations of motion. The steering function does not match the real system perfectly, but the trajectories found become better as the distances between the vertices become smaller. After the RRT* algorithm has finished its iterations, the best trajectory found between the start and goal state is fed to a trajectory controller, which translates it into a trajectory that can be followed by the real system.
The benefit of this approach is that an analytical steering function can be used, which is in principle
much faster than a numerical one. The downside is that the trajectory tracker should be good enough,
because a slight deviation from the trajectory that was found can result in a collision with an obstacle.
This problem can be overcome by increasing the safety margins around obstacles.
5.4.2
RRT* Algorithm
There are two modifications to the RRT* algorithm used in this section. The first modification is, of course, the steering function: a different system is used in this chapter, so a different steering function is needed. In addition, because an approximated analytical steering function is used, a trajectory tracker is needed as well. This tracker translates the trajectory found with the RRT* algorithm into a trajectory that can be executed by the one dof Resonating Arm. Strictly speaking this is a postprocessing step, so it is debatable whether the tracker is really part of the RRT* algorithm; in this thesis it is treated as if it is.
Steering Function
The approach that is discussed here is basically the same approach as that for the one dof linear system.
The forced trajectories of the two states are set equal and in this way the switch point is determined.
The problem is that the forced trajectories for this system cannot be calculated analytically. In order to
deal with that, the equations of motion for this system are simplified to:
ẋ1 = x2    (5.18)
ẋ2 = (F + u) / Itot    (5.19)
Here F (the sum of the spring and friction torques) is a constant torque instead of a function of the angle and angular velocity. This torque is thus assumed to be constant during the forced trajectories. It is calculated beforehand for the given start and goal state and then plugged into the equations of the forced trajectories. This constant F is determined for both the start (F1) and goal (F2) state and is used in the corresponding equations of the forced trajectories.
So with the initial conditions given as:
x1(0) = θ0    (5.20)
x2(0) = ω0    (5.21)
The state equations can be solved:
x1(t) = θ0 + ω0·t + (1/2)·((F + u*) / Itot)·t²    (5.22)
x2(t) = ω0 + ((F + u*) / Itot)·t    (5.23)
Now Equation 5.23 can be rewritten for t and substituted into Equation 5.22, so that the equations can be solved for x2. When maximum positive torque is applied first and is then switched to maximum negative torque, the resulting equations become:
θ0 − (ω0²/2)·Itot/(F1 + Tmax) + (x2²/2)·Itot/(F1 + Tmax) = θ1 − (ω1²/2)·Itot/(F2 − Tmax) + (x2²/2)·Itot/(F2 − Tmax)    (5.24)

x2 = ±√[ (2(θ1 − θ0)(F1 + Tmax)(F2 − Tmax) − ω1²(F1 + Tmax)Itot + ω0²(F2 − Tmax)Itot) / (Itot(−2Tmax − F1 + F2)) ]    (5.25)
For first −Tmax and then +Tmax the equations are:

θ0 − (ω0²/2)·Itot/(F1 − Tmax) + (x2²/2)·Itot/(F1 − Tmax) = θ1 − (ω1²/2)·Itot/(F2 + Tmax) + (x2²/2)·Itot/(F2 + Tmax)    (5.26)

x2 = ±√[ (2(θ1 − θ0)(F1 − Tmax)(F2 + Tmax) − ω1²(F1 − Tmax)Itot + ω0²(F2 + Tmax)Itot) / (Itot(2Tmax − F1 + F2)) ]    (5.27)
The velocity at the point where it is switched (x2 ) can now be used in the same manner as was done in
Section 4.3. With these equations the time-optimal trajectories between two states can be calculated.
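As a sketch, the switch-velocity formulas of Equations 5.25 and 5.27 can be evaluated as follows. The function name and the sign convention (positive switch velocity for the + − sequence, negative for − +) are illustrative assumptions, not taken from the thesis implementation:

```python
import math

def switch_velocity(theta0, omega0, theta1, omega1, F1, F2, I, Tmax):
    """Candidate switch velocities x2 from Eq. 5.25 (+ then -) and
    Eq. 5.27 (- then +); returns a list of (sequence, x2) pairs."""
    candidates = []
    # Sequence +Tmax then -Tmax (Eq. 5.25)
    num = (2 * (theta1 - theta0) * (F1 + Tmax) * (F2 - Tmax)
           - omega1**2 * (F1 + Tmax) * I + omega0**2 * (F2 - Tmax) * I)
    den = I * (-2 * Tmax - F1 + F2)
    if den != 0 and num / den >= 0:
        candidates.append(('+-', math.sqrt(num / den)))
    # Sequence -Tmax then +Tmax (Eq. 5.27)
    num = (2 * (theta1 - theta0) * (F1 - Tmax) * (F2 + Tmax)
           - omega1**2 * (F1 - Tmax) * I + omega0**2 * (F2 + Tmax) * I)
    den = I * (2 * Tmax - F1 + F2)
    if den != 0 and num / den >= 0:
        candidates.append(('-+', -math.sqrt(num / den)))
    return candidates
```

As a sanity check, with F1 = F2 = 0 this reduces to the familiar double-integrator result: moving from rest at −1 [rad] to rest at +1 [rad] with I = Tmax = 1 yields a single + − candidate with switch velocity √2.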
The problem now is that the optimal trajectory from the start vertex of the tree to the goal vertex of
the tree that is returned by the RRT* algorithm is not directly applicable for the real system. Therefore
a trajectory controller is needed to convert the trajectory that was found to a trajectory that can be
used for the real system. The quality of the trajectory depends both on the trajectory controller and the
quality of the trajectory that was found with the RRT* algorithm.
Trajectory Controller
The trajectory controller that is implemented is a very basic one. It is a PD-controller that calculates the differences in angle and velocity and multiplies these by gains to obtain the torque that should be exerted. The set points for this controller are the vertices of the best trajectory found. The controller switches to a new set point when the link of the Resonating Arm has passed the angle of the current set point or when the time between two set points in the original trajectory is exceeded. The gains of the proportional and derivative action were tuned until satisfactory performance was obtained: Kp = 5 and Kd = 50.
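The set-point logic described above can be sketched as follows. The function name, the `direction` argument (sign of the motion, used for the "passed the angle" test) and the saturation limit are illustrative assumptions, not the thesis code:

```python
def track(state, setpoints, i, t, direction, Kp=5.0, Kd=50.0, Tmax=3.99):
    """One step of the PD set-point tracker described in the text.

    state = (theta, omega); setpoints[i] = (theta_sp, omega_sp, t_sp).
    Advance to the next set point when the arm has passed the current
    set-point angle (in the direction of motion) or its time is exceeded.
    Returns the torque, saturated at a placeholder actuator limit Tmax,
    and the (possibly advanced) set-point index."""
    theta, omega = state
    theta_sp, omega_sp, t_sp = setpoints[i]
    if i < len(setpoints) - 1 and (direction * (theta - theta_sp) >= 0 or t >= t_sp):
        i += 1
        theta_sp, omega_sp, t_sp = setpoints[i]
    u = Kp * (theta_sp - theta) + Kd * (omega_sp - omega)
    return max(-Tmax, min(Tmax, u)), i
```

The default gains match the tuned values quoted in the text; everything else (set-point format, switching test) is a plausible reading of the description.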
5.4.3
Method
First, some initial tests were done to test the algorithm and the trajectory controller and to get a feeling for the parameter settings. These results were used for the simulation that checks the performance of the algorithm with the approximated analytical steering function. This simulation was done for only one run instead of ten, as in the previous chapter, because the simulation time has increased strongly and the results of the previous chapter show that the convergence of individual runs is approximately the same. The details of this simulation can be found below.
Settings                      Value
Start                         (−0.88, 0)
Goal                          (0.88, 0)
RRT* algorithm
  # iterations                100000
  # nearest neighbors         500
  cutoff distance             0.1
  sample goal                 10%
  sample domain               [(−π/3, π/3), (−12.5, 12.5)]
Trajectory controller
  Kp                          5
  Kd                          50

5.4.4
Results
Only one simulation was done with the approximated analytical steering function; the result can be found in Figure 5.3. Figure 5.3a shows the trajectory that was found with this steering function and trajectory controller. The magenta trajectory is the one returned by the trajectory controller, which used the blue dots as set points. It is quite similar to the optimal (green) one, except that the peak is a bit lower. On close inspection, however, the magenta curve just right of the top is not very smooth. This has a negative effect on the duration of the trajectory, which is 0.3420 [s]. This is 3.95 % slower than the optimal trajectory.

Figure 5.3: The trajectory found with the RRT* algorithm with approximated analytical steering function. A) The trajectory found (magenta) compared with the optimal one (green). The blue dots are the set points that were used with the trajectory controller. B) The time of the best trajectory found with the RRT* algorithm, before the trajectory controller. The red line is the optimal time of the trajectory.

Figure 5.3b shows the time of the best trajectory found during the tree-growing process. The cost of this trajectory comes very close to that of the optimal one; the remaining higher cost is caused by the mismatch between the approximated steering function and the real system. The calculation time was about 12 hours, which is quite substantial.
5.4.5
Conclusion
An approximated analytical steering function was obtained by holding the friction and spring torque constant during the two segments of the bang-bang control. In this way a fast steering function was obtained that approximates the real trajectories better as the cutoff distance becomes smaller. This small cutoff distance, however, limits the performance, as was concluded in the previous chapter. The performance also depends on the quality of the trajectory controller that converts the obtained trajectory into one that can actually be executed by the one dof Resonating Arm.

The duration of the trajectory obtained after 12 hours of simulation was within 4 % of the real time-optimal solution. This performance can probably be improved slightly by using more set points and a better tuned trajectory controller. However, this approach is probably not the best when the goal is to obtain solutions that are really close to the optimum. It can, however, be a good approach when the goal is to find good trajectories instead of optimal ones.
5.5
Numerical Steering Function
5.5.1
Introduction
The idea of using a numerical steering function for the one dof Resonating Arm was introduced in Section
5.3. At first this idea was rejected, because a numerical steering function would be much slower than an
analytical one. However, there is no other way to obtain a steering function that uses the real equations
of motion instead of linearized or approximated ones. Besides, in the latest publication about the RRT*
algorithm [15] a numerical steering function was used as well. For these reasons, this approach was chosen
to be tested.
5.5.2
RRT* Algorithm
The RRT* algorithm is basically the same as the one used in the previous chapter, but two modifications are needed. First, another steering function is used, in this case a numerical one. A consequence is that the steering function relies on optimization tolerances, which implies that the trajectory found does not exactly reach the goal state; there will be a small offset, depending on the tolerances and the optimization criteria. Therefore a rewiring procedure is needed. What this procedure does exactly will be discussed later on.
Steering Function
The numerical steering function relies on optimizing the durations of the two segments of the bang-bang control signal. Because two possible control sequences exist (+ − and − +), two separate optimizations have to be done.

The optimization can be done by a general-purpose optimization routine. The design variables of this optimization are the durations of the two segments of the bang-bang control (t1 and t2). The objective function takes these two as input and integrates the equations of motion for these two time lengths, starting at the start state. The cost function is a weighted sum of the scaled Euclidean distance between the end state of the resulting trajectory (xend) and the goal state, plus the total time of the trajectory:

Fval = w1 ||xgoal − xend|| + w2 (t1 + t2)    (5.28)

This optimization should be run for both control sequences (+ − and − +). A trajectory is feasible if the scaled Euclidean distance between its end state and the goal is smaller than a certain limit. If both switch sequences result in a feasible trajectory, the one with the lowest total time is selected. Since the length of a time segment can converge to zero during the optimization, it is also possible to obtain a bang-bang control without a switch. This is, however, not very likely to occur.
The downside of using this optimization as a steering function is that an optimization algorithm cannot calculate a trajectory that reaches the goal state exactly, so there is a certain error in it. The cost of the trajectory may also deviate slightly from the optimal cost to get from one state to another. Finally, there is the risk that not the global optimum is found, but a local optimum or no optimum at all. For this (relatively) simple problem, however, this should not be an issue.
Rewire
The problem of trajectories that do not exactly reach the goal vertex was already addressed briefly above. The problem occurs when vertices that are already in the tree are connected through new edges at lower cost (the second optimization step, as explained in Section 3.2.2). Since the trajectory does not reach the goal vertex exactly, there is a gap between this trajectory and the trajectories from this vertex to its child vertices: the trajectories no longer connect.

This problem can be solved by applying the controls that belong to the trajectories between the vertex and its child vertices again, but this time starting from the new state of the vertex. Since the difference between the previous and the new state of the vertex is very small, it does not influence the trajectories much. This rewiring process should then also be applied to the trajectories between the child vertices and their child vertices, and so on, until the ends of the branches are reached.
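The cascading re-application of stored controls can be sketched as follows, assuming a simple tree of dicts with 'state', 'controls' and 'children' keys and a `propagate` function that re-integrates a stored control sequence from a (new) initial state; all names are illustrative, not the thesis data structures:

```python
from collections import deque

def repair_subtree(vertex, propagate):
    """Re-apply each child's stored controls from its parent's new state,
    breadth-first down to the leaves, as described in the text."""
    queue = deque([vertex])
    while queue:
        parent = queue.popleft()
        for child in parent['children']:
            # same control sequence, new starting state
            child['state'] = propagate(parent['state'], child['controls'])
            queue.append(child)
```

After a vertex is rewired to a new parent (and thus gets a slightly different state), calling `repair_subtree` on it closes the gaps in all trajectories below it.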
5.5.3
Method
For this steering function, some initial tests were also done with the RRT* algorithm to test the algorithm and get an idea of how long a simulation takes. Then the real simulation was done, which is used to judge the performance of the algorithm in terms of quality of the solution versus calculation time. The details of the simulation are:
Settings                      Value
Start                         (−0.88, 0)
Goal                          (0.88, 0)
RRT* algorithm
  # iterations                10000
  # nearest neighbors         200
  cutoff distance             0.5
  sample goal                 10%
  sample domain               [(−π/3, π/3), (−12.5, 12.5)]

5.5.4
Results
The result of the simulation is given in Figure 5.4. Figure 5.4a shows the trajectory found with the RRT* algorithm (magenta), which is almost identical to the optimal trajectory. The duration of the trajectory found with the RRT* algorithm is 0.3316 [s], which is only 0.78 % above that of the optimal trajectory.

Figure 5.4b shows that the convergence is good. The only downside is the calculation time: the simulation took about 100 hours, which is too long to be practical. This is caused by the fact that the convergence becomes slower as the cost of the trajectory gets closer to the optimum. When the goal is to be within 2 % of the optimum, the calculation time is around 21 hours, and when the goal is to be within 5 % of the optimum, a simulation of only approximately 80 minutes is needed.
5.5.5
Conclusion
The numerical steering function for the one dof Resonating Arm relies on the optimization of a bang-bang controller with one switch. The optimization for the two possible switch sequences is done separately and the best result is used for the trajectory between the two vertices for which it was called.
Figure 5.4: The trajectory found with the RRT* algorithm with numerical steering function. A) The
trajectory found (magenta) compared with the optimal one (green). B) The blue line is the time of the
best trajectory found with the RRT* algorithm. The red line is the optimal time of the trajectory.
Simulations showed that this steering function is much slower than an analytical one, since the simulation took over four days. The quality of the solution, however, is quite good, since its cost is within 1 % of the optimal cost. The computational cost to get within 2 % of the optimum is 21 hours, which would be acceptable.
5.6
Conclusion
The one dof Resonating Arm has non-linear equations of motion, which makes it impossible to obtain an analytical steering function. However, it follows from the application of Pontryagin's Minimum Principle that the time-optimal control is of the bang-bang type. The PMP did not reveal how many switches are needed at most, but from inspection of the properties of the system it was concluded that at most one switch is needed to reach all states within the state space of the system. This follows from the fact that the maximum actuator torque is about twice as large as the maximum torque exerted by the spring mechanism.
This result was used to construct two steering functions, of which the results are compared. The first steering function approximates the equations of motion of the system. In that way an analytical steering function is obtained, which resembles the actual dynamics of the system better as the length of the trajectories becomes smaller. Since there is an error in the trajectory returned by this steering function, a trajectory controller is necessary to convert the trajectory into one that can be executed by the one dof RA. The second steering function is a numerical one that relies on an optimization of the durations of the two segments of the bang-bang control. The trajectories returned by this steering function have a final state that is not exactly the goal state. This problem is solved by rewiring the tree.
The numerical steering function was used ‘stand-alone’ to obtain the time-optimal trajectory between the start and the goal state. The duration of this trajectory is 0.3290 [s], which is faster than the time-optimal movement for the one dof linear arm of Chapter 4. The properties of that arm are the same as those of the one dof RA, with the only differences being that it has no spring mechanism and that friction is not modeled. Thus it can be concluded, at least for the start and goal point used in this thesis, that the spring mechanism gives the RA an advantage in terms of speed.
The optimal trajectory was compared with simulations of the RRT* algorithm with both steering functions. The best trajectory found with the approximated analytical steering function was about 4 % longer than the optimal one, which is not too bad. The numerical steering function found a trajectory whose duration was only 0.78 % more than that of the optimal trajectory, which is excellent. The downside is that this took about 100 hours of simulation, while the simulation with the approximated analytical steering function took only twelve hours. So it seems that the approximated analytical steering function is faster, but inspection of Figure 5.4b shows that after twelve hours with the numerical steering function the best trajectory found is 2.6 % from the optimum. This is already better than the result of the approximated analytical steering function. Thus it can be concluded that the numerical steering function performs better overall, and therefore this approach will also be used for the two dof RA.
For the two dof RA, a more complex optimization is probably needed for the steering function, and the search space is 4D instead of 2D. This will probably result in much longer calculation times. To tackle this problem, the code of the algorithm should be optimized and converted to C code where possible. In this way it should be possible to use a numerical steering function without excessive calculation times.
Chapter 6
Simulations and Results for the Two Dof Resonating Arm
6.1
Introduction
The results of the previous chapter showed that it is possible to find time-optimal trajectories for systems for which no analytical steering function exists. This approach is therefore continued to find the time-optimal trajectories for the two dof Resonating Arm. In this chapter the answer to the research question “What are the time-optimal trajectories of the Resonating Arm?” will be given.

In this chapter simulations with obstacles will be done, to check how the time-optimal trajectories are influenced by these obstacles. Due to these obstacles it is no longer possible to find the time-optimal trajectories with a (simple) optimization algorithm. This is where the RRT* algorithm comes in. The trajectories that are found with obstacles in the workspace can therefore no longer be compared to a reference trajectory. It is thus necessary to know that the RRT* algorithm with its current settings will find the time-optimal trajectories. This can be verified by first doing the simulation without obstacles, so that the resulting trajectory can be compared with the time-optimal trajectory found with a straightforward optimization. When the RRT* algorithm performs well enough and finds the time-optimal trajectory, it will be applied to the problem of an obstacle present in the workspace.
The outline of this chapter is as follows. First, in Section 6.2 the model of the two dof RA is described. The corresponding steering function is explained in Section 6.3. The way in which the collision checking is performed is explained in Section 6.4. In Section 6.5.1 the first simulations are done with the original RRT* algorithm. Since these simulations are not satisfactory, the algorithm is adjusted and in Section 6.6 new simulations are done. These simulations show that the performance of the algorithm is limited by the steering function, so in Section 6.7 different steering functions are compared. This indicates that the distance metric was the problem. That is why in Section 6.8 a better distance metric is constructed. With these modifications to the algorithm, simulations with and without an obstacle are done in Section 6.9. This chapter ends with a conclusion in Section 6.10.
6.2
Model of System
This section presents the model of the two dof Resonating Arm. Both friction and the motor dynamics are included in this model. The model of the RA is needed for the steering function, and is therefore important to discuss.
6.2.1
Friction Model
The friction model was already given in the previous chapter, in Section 5.2.1. There, only the values of the friction for the first dof were given. In Table 6.1 the friction parameters of both dof are given.

Joint   Friction coefficient   Value
1       fc,1                   0.4793   [Nm]
        fv,1                   0.0003   [Nm·s/rad]
2       fc,2                   0.6000   [Nm]
        fv,2                   0.0006   [Nm·s/rad]

Table 6.1: Friction coefficients of the Resonating Arm determined by means of system identification.

6.2.2
Motor Model

Two Maxon RE30 DC motors are used for the actuation of the RA. The rotor inertia of these two motors is taken into account in the equations of motion of the two dof RA. Next to that, the motors are considered as torque sources of which the maximum output is Tmax,motor. The relevant motor parameters are given in Table 6.2.
Motor   Parameter               Symbol         Value
1       Gear ratio              n1             54           [-]
        Gearbox efficiency      η1             0.87         [-]
        Rotor inertia           Ir,1           33.5·10⁻⁷    [kg m²]
        Maximum motor torque    Tmax,motor,1   0.085        [Nm]
        Maximum torque          Tmax,1         3.99         [Nm]
2       Gear ratio              n2             198          [-]
        Gearbox efficiency      η2             0.87         [-]
        Rotor inertia           Ir,2           33.5·10⁻⁷    [kg m²]
        Maximum motor torque    Tmax,motor,2   0.085        [Nm]
        Maximum torque          Tmax,2         14.64        [Nm]

Table 6.2: Relevant motor parameters
6.2.3
Equations of Motion
The model of the Resonating Arm is given in Figure 6.1. This figure contains the positive directions of the torques and angles; the parameters of the model are listed in the accompanying table. Torque T1 is the sum of the torques of the spring mechanism (Ts), the actuator (Tm,1) and the friction (Tf,1). Torque T2 is the sum of the torques of the actuator (Tm,2) and the friction (Tf,2) and acts on the second link.

This model was used by M.C. Plooij [33] to derive the equations of motion. These equations were not derived as part of this thesis, but taken over from the work of Plooij [33]. For the sake of completeness they are given below. The variables x1 and x2 are the angles of the arm, θ1 and θ2 respectively. The angular velocities are x3 and x4 for the first and second joint respectively.
ẋ1 = x3    (6.1)
ẋ2 = x4    (6.2)
ẋ3 = [ I2,tot·(Ts + Tf,1 + Tm,1 − c1·sin(x1 − x2)·x4²) − c1·cos(x1 − x2)·(Tf,2 + Tm,2 + c1·sin(x1 − x2)·x3²) ] / [ I2,tot·(I1,tot + c2) − c1²·cos²(x1 − x2) ]    (6.3)
ẋ4 = [ (I1,tot + c2)·(Tf,2 + Tm,2 + c1·sin(x1 − x2)·x3²) − c1·cos(x1 − x2)·(Ts + Tf,1 + Tm,1 − c1·sin(x1 − x2)·x4²) ] / [ I2,tot·(I1,tot + c2) − c1²·cos²(x1 − x2) ]    (6.4)
Parameter                        Symbol    Value
Mass link 1                      m1        0.6                      [kg]
Mass link 2                      m2        0.6                      [kg]
Mass moment of inertia link 1    I1        0.3                      [kgm²]
Mass moment of inertia link 2    I2        0.3                      [kgm²]
Length link 1                    l1        0.4                      [m]
Length link 2                    l2        0.4                      [m]
Relative location com link 1     lg1       0.5                      [-]
Relative location com link 2     lg2       0.5                      [-]
Joint limits dof 1               θ1,max    [−π/3, π/3]              [rad]
Joint limits dof 2               θ2,max    [−130π/180, 130π/180]    [rad]

Figure 6.1: Model of the two dof Resonating Arm with parameters
with constants:

I1,tot = I1 + Ir,1·n1² + l1²·lg1²·m1        Total inertia of link one at joint one    (6.5)
I2,tot = I2 + Ir,2·n2² + l2²·lg2²·m2        Total inertia of link two at joint two    (6.6)
c1 = l1·l2·lg2·m2                           Coupling term    (6.7)
c2 = l1²·m2                                 Coupling term    (6.8)
The equations of motion of the RA are thus highly non-linear, which is caused by the coupling between the two degrees of freedom. These equations of motion will be used in the steering function to find the time-optimal trajectories between two states of the RA.
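As a sketch, Equations (6.1)–(6.4) with the constants (6.5)–(6.8) and the parameter values from the tables above can be transcribed directly; the function and variable names are illustrative, and the torque inputs are left as arguments:

```python
import math

# Parameter values taken from the tables above (illustrative transcription).
m1, m2 = 0.6, 0.6
l1, l2, lg1, lg2 = 0.4, 0.4, 0.5, 0.5
I1, I2 = 0.3, 0.3
Ir1, Ir2 = 33.5e-7, 33.5e-7
n1, n2 = 54.0, 198.0

I1_tot = I1 + Ir1 * n1**2 + l1**2 * lg1**2 * m1   # Eq. (6.5)
I2_tot = I2 + Ir2 * n2**2 + l2**2 * lg2**2 * m2   # Eq. (6.6)
c1 = l1 * l2 * lg2 * m2                           # Eq. (6.7)
c2 = l1**2 * m2                                   # Eq. (6.8)

def xdot(x, Ts, Tf1, Tm1, Tf2, Tm2):
    """State derivative of the two dof RA, Eqs. (6.1)-(6.4)."""
    x1, x2, x3, x4 = x
    s, c = math.sin(x1 - x2), math.cos(x1 - x2)
    T1 = Ts + Tf1 + Tm1 - c1 * s * x4**2   # generalized torque, joint 1
    T2 = Tf2 + Tm2 + c1 * s * x3**2        # generalized torque, joint 2
    det = I2_tot * (I1_tot + c2) - c1**2 * c**2
    return [x3,
            x4,
            (I2_tot * T1 - c1 * c * T2) / det,
            ((I1_tot + c2) * T2 - c1 * c * T1) / det]
```

At rest with zero torques the state derivative is zero, and a positive torque on the first joint yields a positive angular acceleration of the first link, as expected.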
6.2.4
Conclusion
The model of the Resonating Arm and the resulting equations of motion were given in this section. The equations of motion are quite complex due to the coupling of the two dof. It is notable that the maximum torque of the actuator of the lower arm (second dof) is more than 14 [Nm], which is quite a lot compared to that of the first dof, which is only about 4 [Nm]. The lower arm will therefore probably be able to accelerate fast and move at high speeds, but that will become clear later. The equations of motion will be used for the steering function that is derived in the next section.
6.3
Steering Function
6.3.1
Introduction
The previous section showed that the equations of motion are quite complex. Pontryagin's Minimum Principle can only be used to conclude that the time-optimal control is of the form u = sign(x) (with x a complex expression). This can be concluded because the system is linear in the control. This makes it look like the control is of the bang-bang type, but x can also be zero. If that is the case for some time interval, the time-optimal control for that interval is a singular arc. The time-optimal control for the two dof Resonating Arm is thus a combination of bang-bang segments and singular arcs. The exact time-optimal control depends entirely on the start and goal state of the motion. Since no analytical steering function is possible, a numerical steering function has to be used.
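The structure of this conclusion can be sketched as follows; this is the standard PMP argument for systems that are linear in the control, with f0, B and λ generic placeholders rather than the thesis's notation:

```latex
% Time-optimal Hamiltonian for \dot{x} = f_0(x) + B(x)\,u,\ |u_i| \le T_{max,i}:
H(x,\lambda,u) = 1 + \lambda^{T} f_0(x) + \lambda^{T} B(x)\,u
% Minimizing H over u gives, per actuator,
u_i^{*} = -T_{max,i}\,\operatorname{sign}\!\bigl((\lambda^{T} B(x))_i\bigr)
% which is bang-bang, except where the switching function
% (\lambda^{T} B(x))_i vanishes over an interval: a singular arc.
```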
6.3.2
Numerical Steering Function
It is not obvious how the optimal control problem can be solved numerically. In Section 3.3.3, three numerical methods to solve an optimal control problem were discussed. The indirect approach with the PMP has already been rejected, so dynamic programming and a direct method remain. The problem with these methods is that they do not use any knowledge of the system, which makes them very slow.
A much better approach is to use bang-bang control theory. The time-optimal control for systems that are linear in the input consists of bang segments together with singular arcs. Much research has been done on time-optimal control for two dof manipulators [12, 28, 39, 42], and this research suggests that bang-bang control for such systems leads to (near) time-optimal control. Therefore it was chosen to use the optimization of a bang-bang control input as the steering function. This is a relatively cheap optimization compared to a more general approach. Sirisena [39] states that singular control can be seen as bang-bang control with infinitely many switches. So when the time-optimal trajectory of the Resonating Arm contains a singular arc, this singular arc will be approximated by bang-bang segments. This approximation becomes better as the number of bang-bang segments increases.
The research of Willigenburg et al. [41] shows that the chance that time-optimal bang-bang control consists of more than n−1 switches (with n the number of states of the system) is very small. Since the number of states of the model of the RA is four, it was chosen to optimize a bang-bang signal with three switches.
Figure 6.2: Actuator torques of a typical bang-bang control sequence with three switches. sw1..3 are the switch times and t1..4 the lengths of the time intervals between the switches.
An example of a switching pattern is given in Figure 6.2. This figure shows that three switches (sw1..3) result in four time intervals (t1..4). The lengths of these time intervals as well as the switching sequence should be optimized. Next to that, the initial values of the torques should also be taken into account.
There are multiple ways in which this optimization problem can be implemented, and some tests were done to check which is best. These tests showed that the best way is to optimize the lengths of the four time intervals for every switch sequence and every set of initial torques independently, and then use the best solution.

There are four possible sets of initial torques (++, −+, +−, −−) and eight possible switch sequences (111, 112, 121, 211, 122, 212, 221, 222), where 1 denotes a switch in Tm,1 and 2 a switch in Tm,2. So a total of 32 optimizations has to be done. MATLAB's function fmincon is used for the optimization. The cost function is comparable to the one used in Section 5.5, with xend the end state of the integrated trajectory:

Fval = w1 ||xgoal − xend|| + w2 (t1 + t2 + t3 + t4)    (6.9)

For all 32 results it is checked whether the trajectory reaches the goal closely enough. This is done with a cutoff distance: of the trajectories whose end point is within the cutoff distance from the goal, the one with the lowest time is selected. This is how the steering function is implemented.
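The enumeration of the 32 candidate optimization problems described above can be sketched as follows; the names are illustrative, not the thesis code:

```python
from itertools import product

def candidate_problems():
    """Yield (initial_torque_signs, switch_sequence) pairs: 4 x 8 = 32."""
    initial = list(product((+1, -1), repeat=2))    # ++, +-, -+, --
    sequences = list(product((1, 2), repeat=3))    # 111 ... 222
    return [(s0, seq) for s0 in initial for seq in sequences]

def torque_signs(s0, seq):
    """Signs of (Tm1, Tm2) over the four intervals t1..t4, flipping the
    actuator named by the switch sequence at each of the three switches."""
    signs = [list(s0)]
    for joint in seq:
        nxt = signs[-1][:]
        nxt[joint - 1] *= -1
        signs.append(nxt)
    return signs
```

Each of the 32 pairs would then be handed to a separate four-variable optimization of the interval lengths, after which the feasible result with the lowest total time is kept.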
6.3.3
(Near) Time-Optimal Trajectories
To test the steering function, it is used to obtain a trajectory from the start state (xstart = [−0.88, 0, 0, 0]) to the goal state (xgoal = [0.88, 0, 0, 0]) in the workspace without obstacles. This trajectory is compared to one obtained with GPOPS [37], a MATLAB package for solving multiple-phase optimal control problems.

When the steering function is used, two different trajectories are obtained that have almost the same cost. The better of the two is discussed first. In Figure 6.3 the states and control with respect to time are given for the best trajectory. The total time of this trajectory is 0.5553 seconds. The corresponding movement of the arm in the workspace is given in Figure 6.4. This figure shows that the end effector leads the elbow in the rotational movement; this motion will therefore be labeled the ‘push-motion’.
Figure 6.3: States and control with respect to time for the (near) time-optimal movement from the start
to the goal state.
Figure 6.4: (Near) time-optimal movement of the RA in workspace. This motion is labeled as the ‘push-motion’ since the end effector leads the elbow in the rotational movement.

The second motion is almost as good as the ‘push-motion’: the duration of this trajectory is 0.5726 [s]. The states and control with respect to time are given in Figure 6.5. Figure 6.6 shows the corresponding motion of the arm in the workspace. This motion is labeled the ‘pull-motion’, because the elbow leads the end effector in the rotational motion.
Figure 6.5: The states and the control with respect to time for the near time-optimal movement from
the start to the goal state.
The MATLAB package GPOPS [37] was also used to obtain the time-optimal control from the start to the goal state. The solution found with GPOPS had not completely converged, but it took 19 minutes to get the result of Figure 6.7. This figure shows almost the same result as Figure 6.5. The control signal found with GPOPS is a bit more spiky, since it had not fully converged, but otherwise it is the same. The duration of this trajectory is 0.5729 [s], which is practically the same as that of the ‘pull-motion’.
Since GPOPS uses polynomials to describe the control input, it is difficult for the algorithm
to find a bang-bang control. So if the optimal control contained singular arcs, GPOPS would have
found them, since it can approximate them more easily. Therefore it is likely that the
time-optimal control from the start to the goal state is indeed of the bang-bang type and is indeed the
‘push-motion’ that is given in Figure 6.4.
Figure 6.6: Near time-optimal movement of the RA in workspace. This motion is labeled as the ‘pull-motion’ since the elbow leads the end effector in the rotational movement.
Figure 6.7: The states and the control with respect to time for the near time-optimal movement from
the start to the goal state found with GPOPS [37].
6.3.4 Conclusion
Due to the complex equations of motion it is difficult to obtain a steering function that returns time-optimal trajectories. It follows from the Pontryagin Minimum Principle that the time-optimal control
is a combination of bang-bang segments and singular arcs. In order to solve this time-optimal control
problem it is necessary to rely on a numerical method. Options for this numerical method are dynamic
programming and the direct methods mentioned in Section 3.3.3. These methods are capable of finding
a solution, but would need too much time because they do not use any knowledge of the system. It is
therefore a better approach to optimize a bang-bang control signal: according to research on bang-bang
control for two dof manipulators, this is a relatively cheap optimization that results in relatively good
trajectories.
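To make the bang-bang idea concrete, the sketch below (Python, not part of the thesis code) computes the textbook time-optimal bang-bang control for a 1-dof double integrator, a much simpler stand-in for the RA dynamics; the function names and the Euler integration step are illustrative assumptions.

```python
from math import sqrt

def bang_bang_rest_to_rest(d, u_max):
    """Time-optimal rest-to-rest motion of a double integrator with |u| <= u_max:
    full acceleration until the halfway point, then full braking.
    Returns (switching time, total time)."""
    t_switch = sqrt(d / u_max)
    return t_switch, 2.0 * t_switch

def simulate(d, u_max, h=1e-4):
    """Integrate the double integrator under the bang-bang control (Euler)."""
    t_switch, t_total = bang_bang_rest_to_rest(d, u_max)
    pos = vel = t = 0.0
    while t < t_total - 1e-12:
        u = u_max if t < t_switch else -u_max  # single switch at t_switch
        vel += u * h
        pos += vel * h
        t += h
    return pos, vel
```

For the RA, with its coupled nonlinear dynamics, no such closed form exists and the switching times have to be found numerically, which is exactly what the bang-bang optimization described above does.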
The steering function that follows from the bang-bang optimization is used ‘stand-alone’ to calculate
the (near) time-optimal trajectory from the start to the goal state of the RA for the situation in which
no obstacles are present. This results in two trajectories, labeled the ‘push-motion’ and the
‘pull-motion’. The ‘push-motion’ is the better of the two, with a duration of 0.5553 [s]. Since it is not
clear how close this trajectory is to the optimal one, GPOPS is used to find the time-optimal trajectory.
The solution returned by GPOPS is very similar to the ‘pull-motion’, which is explained by the fact that
the GPOPS solution had not yet converged completely. Since GPOPS can approximate singular arcs
better than bang-bang segments, it is not likely that a trajectory with singular arcs exists
that is better than the ‘push-motion’. It can therefore at least be concluded that the ‘push-motion’
has a high probability of being very close to the time-optimal motion. The ‘push-motion’ as
described in this section will therefore be considered the time-optimal trajectory and will be used as the
reference for later comparisons.
6.4 Collision Checking

6.4.1 Introduction
Collision checking has to handle two kinds of collisions: self-collision and collision with obstacles.
Self-collision is already handled by checking whether the arm stays within certain joint angles. Checking
for collisions with obstacles is more difficult, since the obstacles are defined in the workspace while the
collision checking is done in the state space (or actually the configuration space, since the obstacles are
independent of the velocities). One option to overcome this problem is to map the obstacles to the state
space. This is very difficult, since simply shaped obstacles in the workspace transform to rather complex
shapes in the state space. Another option is to do the collision checking in the workspace. In this way
the states (or configurations) have to be converted back to an orientation of the RA in the workspace.
This is a relatively simple calculation for which only the kinematics of the system are needed, but it has
to be done for all states on a trajectory in state space.
6.4.2 Models
In order to perform the collision check, an obstacle model and an arm model are needed. These
models can be made as detailed as desired, but more detail results in longer collision-checking times. In this
research very simple models are used: the obstacle is modeled as a circle and the RA is modeled as two
line segments. In this way the obstacle can be described by the location of its center and its radius.
6.4.3 Collision Checking for a Single State in the State Space
The collision checking is done for the second link (lower arm) of the Resonating Arm, since the upper arm
cannot move around an obstacle. The state of the RA determines the begin point and end point of the
second link of the RA, which can be calculated with a simple kinematic relation. The base of the
RA is at [0,0], the x-axis points to the front and the y-axis to the right, which results in the following
kinematic relations:

x1 = l1 cos(θ1)                            (6.10)
y1 = l1 sin(θ1)                            (6.11)
x2 = l1 cos(θ1) + l2 cos(θ1 + θ2)          (6.12)
y2 = l1 sin(θ1) + l2 sin(θ1 + θ2)          (6.13)
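The kinematic relations (6.10)-(6.13) can be sketched in Python as follows; the link lengths are an assumption derived from Eq. (6.20) later in this section (l1 + l2 = 0.8 m and l2 = 0.4 m, hence l1 = 0.4 m):

```python
from math import cos, sin

# Assumed link lengths: Eq. (6.20) implies l1 + l2 = 0.8 m and l2 = 0.4 m.
L1, L2 = 0.4, 0.4

def forward_kinematics(theta1, theta2):
    """Endpoints of both links for joint angles theta1, theta2 (Eqs. 6.10-6.13)."""
    x1 = L1 * cos(theta1)                  # elbow position, Eq. (6.10)
    y1 = L1 * sin(theta1)                  # Eq. (6.11)
    x2 = x1 + L2 * cos(theta1 + theta2)    # end-effector position, Eq. (6.12)
    y2 = y1 + L2 * sin(theta1 + theta2)    # Eq. (6.13)
    return (x1, y1), (x2, y2)
```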
Figure 6.8: Intersection between a line and a circle. a) Point P is the point on the line through P1 and P2
that is closest to the center of the circle (P3). If d1 is greater than r, there is no collision. If d1 is
smaller than r and P is between P1 and P2, there is a collision. If P is not between P1 and P2, the
situation of b) applies. b) When P2 is the point on the line segment between P1 and P2 that is closest
to the center of the circle (P3), there is no collision if d2 is greater than r.
Where x1 and y1 are the coordinates of the endpoint of the first link (upper arm) and x2 and y2 those
of the endpoint of the second link. Figure 6.8 is used to describe how the collision checking works.
Figure 6.8a shows the obstacle, with center P3 and radius r. The lower arm is the line segment between
P1 (x1 , y1 ) and P2 (x2 , y2 ). Point P is the point on the line through P1 and P2 that is closest to P3 .
Since P is on this line it can be expressed as P = P1 + c(P2 − P1 ), where c is a variable that expresses
the location of the point on the line. The line between P3 and P is perpendicular to the line through P1
and P2 , so the dot product between these two lines should be zero:
two lines should be zero:
(P3 − P ) · (P2 − P1 )
=
0
(6.14)
(P3 − P1 − c(P2 − P1 )) · (P2 − P1 )
=
0
(6.15)
This can be rewritten to:
c =
(x3 − x1 )(x2 − x1 ) + (y3 − y1 )(y2 − y1 )
(x2 − x1 )(x2 − x1 ) + (y2 − y1 )(y2 − y1 )
(6.16)
Then the location of point P (xp , yp ) and the distance dis1 from point P to the center of the circle can
be calculated:

xp = x1 + c(x2 − x1 )                      (6.17)
yp = y1 + c(y2 − y1 )                      (6.18)
dis1 = sqrt((xp − x3 )² + (yp − y3 )²)     (6.19)
When dis1 is greater than radius r, there is no collision, because then the closest point on the line through
P1 and P2 is outside the circle.
When 0 ≤ c ≤ 1, point P is on the line segment between P1 and P2 ; otherwise it lies outside this
segment. When P is between P1 and P2 and dis1 is smaller than radius r, there is a collision.
When c > 1, the situation of Figure 6.8b applies. The closest point on the arm to the center of the circle
then has to be point P2 , because point P1 is attached to the upper arm. It has to be checked whether the
distance between the center and P2 is greater than the radius of the circle. If that is the case, there is
no collision; otherwise, there is.
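This case distinction, together with Eqs. (6.14)-(6.19), can be sketched as follows (Python, an illustrative re-implementation rather than the thesis code; the handling of c < 0 is an assumption, since the text only discusses c > 1):

```python
from math import hypot

def segment_circle_collision(p1, p2, p3, r):
    """Check the lower arm (segment p1-p2) against a circular obstacle with
    center p3 and radius r, following Eqs. (6.14)-(6.19) and Figure 6.8."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Eq. (6.16): projection parameter of the obstacle center on the line p1-p2.
    c = ((x3 - x1) * (x2 - x1) + (y3 - y1) * (y2 - y1)) / \
        ((x2 - x1) ** 2 + (y2 - y1) ** 2)
    # Closest point P on the infinite line (Eqs. 6.17-6.18) and distance (6.19).
    xp, yp = x1 + c * (x2 - x1), y1 + c * (y2 - y1)
    if hypot(xp - x3, yp - y3) > r:
        return False           # closest point on the line lies outside the circle
    if 0 <= c <= 1:
        return True            # P lies on the segment and inside the circle
    if c > 1:                  # Figure 6.8b: P2 is the closest point of the arm
        return hypot(x2 - x3, y2 - y3) <= r
    # c < 0: P1 closest (an assumed extension; P1 is attached to the upper arm).
    return hypot(x1 - x3, y1 - y3) <= r
```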
This is the procedure used to check for a collision between the arm and the obstacle for a given
configuration of the arm. The next section explains how this is done for a trajectory in state space.
6.4.4 Collision Checking for a Trajectory in the State Space
A trajectory in state space cannot be checked for collisions directly, since the collision checker described
above can only check single states, not lines in state space. The trajectory has to be sampled to obtain
a list of states that can be checked for collision. A safety margin around the obstacle is needed, because
between two samples the arm can in principle collide while it is collision-free at the sampled instants.
This safety region depends on the maximum velocity of the arm, the sampling time and the length of
the robotic arm. The maximum velocity of the endpoint of the arm is:
varm,max = ω1,max · (l1 + l2 ) + ω2,max · l2 = 15 · 0.8 + 22 · 0.4 = 20.8 [m/s]    (6.20)
The sample time for collision checking is set to ts = 0.001 [s]. The safety region should then be
xsafety = varm,max · ts = 0.0208 [m]. This safety region is increased to 0.05 [m], so that the minimal
distance between the obstacle and the arm is 0.03 [m].
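The safety margin calculation can be reproduced directly (Python; l1 = 0.4 m is an assumption consistent with Eq. (6.20), and `inflated_radius` is a hypothetical helper name, not from the thesis):

```python
# Worst-case end-effector speed (Eq. 6.20) and the resulting safety margin
# for the collision-checker sample time ts = 0.001 s.
w1_max, w2_max = 15.0, 22.0   # [rad/s] maximum joint velocities
l1, l2 = 0.4, 0.4             # [m] link lengths (l1 + l2 = 0.8 m per Eq. 6.20)
ts = 0.001                    # [s] sample time along a trajectory

v_arm_max = w1_max * (l1 + l2) + w2_max * l2   # worst-case endpoint speed, 20.8 m/s
x_safety = v_arm_max * ts                      # 0.0208 m, rounded up to 0.05 m

def inflated_radius(r_obstacle, margin=0.05):
    """Obstacle radius used by the collision checker, including the margin
    (hypothetical helper for illustration)."""
    return r_obstacle + margin
```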
6.4.5 Conclusion
For the collision checking the arm is modeled as a line and the obstacle as a circle. In this way the
collision checking is simple and fast. The safety margin around the obstacle is 5 [cm], which is a
reasonable margin. The collision checking only has to be done for the lower arm.
6.5 Standard RRT* Algorithm

6.5.1 Introduction
The RRT* algorithm as it was used in the previous chapter is used here to find the time-optimal trajectories for the two dof Resonating Arm. The steering function that will be used is already discussed
in Section 6.3. Parts of the code were converted to C-code in order to increase the speed. This should
make the algorithm faster so that trajectories can be found within reasonable time. In this section the
RRT* algorithm will be used to find the optimal trajectory between the start [-0.88, 0, 0, 0] and goal
state [0.88, 0, 0, 0]. The trajectory found will be compared to the time-optimal one obtained in Section
6.3.
6.5.2 Method
The RRT* algorithm used here is basically the same as the one used in the previous chapter. The parts
that are changed are the distance metric, the sampling domain and the steering function. The distance
metric is the Euclidean distance in which the angular velocities are scaled:
dis = sqrt((θ1,2 − θ1,1 )² + (θ2,2 − θ2,1 )² + (0.1(ω1,2 − ω1,1 ))² + (0.1(ω2,2 − ω2,1 ))²)    (6.21)
The sampling region is [(−π/3, π/3), (−130π/180, 130π/180), (−15, 15), (−20, 20)] and the steering
function was already given in Section 6.3.
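The distance metric of Eq. (6.21) and the hyper-rectangular sampling region can be sketched as follows (Python; the function names are illustrative):

```python
import random
from math import sqrt, pi

def distance(s_a, s_b, w=0.1):
    """Scaled Euclidean distance of Eq. (6.21) between two states
    [theta1, theta2, omega1, omega2]; w = 0.1 scales the velocities."""
    return sqrt((s_b[0] - s_a[0]) ** 2 + (s_b[1] - s_a[1]) ** 2
                + (w * (s_b[2] - s_a[2])) ** 2 + (w * (s_b[3] - s_a[3])) ** 2)

# Hyper-rectangular sampling region used by the planner.
BOUNDS = [(-pi / 3, pi / 3), (-130 * pi / 180, 130 * pi / 180),
          (-15.0, 15.0), (-20.0, 20.0)]

def sample_state(rng=random):
    """Draw a uniform random state from the sampling region."""
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]
```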
It was already described that the distance metric does not represent the actual duration of a trajectory
very well. Therefore it is not clear whether the chance of a successful connection is greater when a newly
sampled point is connected to its nearest neighbor or to some other random vertex. It is doubtful whether
it is beneficial to first try to connect a newly sampled point in the state space only to the nearest vertex
and, when that is successful, to also try to connect it at lower cost through the other k nearest neighbors.
Therefore two approaches will be compared:
1. The algorithm first tries to connect the newly sampled point in the state space to the nearest
vertex. When this connection is possible, it also tries to connect the new point through the k other
nearest neighbors. When the connection with the nearest vertex is not possible, a new sample is
taken from the state space.
2. The algorithm tries to connect the newly sampled point in the state space to the k nearest neighbors
right away. In this way new samples are connected to the tree more often, but when an unreachable
sample is selected, k connection attempts that are doomed to fail are processed, which costs a lot
of calculation time.
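The two strategies can be sketched in one extension routine (Python; a simplified stand-in for the actual RRT* implementation, with `knn` and `try_connect` as hypothetical stubs supplied by the caller):

```python
def extend(tree, x_new, knn, try_connect, nearest_first=True):
    """One extension step of RRT*, sketching the two connection strategies.

    tree          -- list of (state, cost-to-come) vertices
    x_new         -- newly sampled state
    knn           -- knn(tree, x_new): k nearest vertices, nearest first
    try_connect   -- steering-function stub: trajectory cost, or None on failure
    nearest_first -- True for approach 1, False for approach 2
    """
    candidates = knn(tree, x_new)
    if nearest_first:
        # Approach 1: require a successful connection from the single nearest
        # vertex before spending steering calls on the other neighbours.
        if try_connect(candidates[0][0], x_new) is None:
            return None        # resample instead of trying the other neighbours
    # Approach 2 (and the second phase of approach 1): pick the parent among
    # the k nearest neighbours that gives the lowest cost-to-come.
    best = None
    for state, cost_to_come in candidates:
        c = try_connect(state, x_new)
        if c is not None and (best is None or cost_to_come + c < best[1]):
            best = (x_new, cost_to_come + c)
    return best
```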
These two approaches are compared for a relatively short simulation time. The one that performs best
is used for a longer simulation run. The simulations are done without obstacles, so that the results can
be compared to those of Section 6.3.3.
Standard RRT* Algorithm - connect first to nearest vertex

Settings                 Value
Start                    [-0.88, 0, 0, 0]
Goal                     [0.88, 0, 0, 0]
RRT* algorithm
  simulation time        1 [hour]
  # nearest neighbors    50
  cutoff distance        1
  sample goal            10%
Standard RRT* Algorithm - connect to k nearest neighbors

Settings                 Value
Start                    [-0.88, 0, 0, 0]
Goal                     [0.88, 0, 0, 0]
RRT* algorithm
  simulation time        1 [hour]
  # nearest neighbors    50
  cutoff distance        1
  sample goal            10%
Longer simulation for the best performing of the two approaches

Settings                 Value
Start                    [-0.88, 0, 0, 0]
Goal                     [0.88, 0, 0, 0]
RRT* algorithm
  # iterations           5000
  # nearest neighbors    50
  cutoff distance        1
  sample goal            10%

6.5.3 Results
In Figure 6.9 the tree created by a simulation of one hour with the first approach is plotted. In Figure
6.10 the tree created with the second approach is plotted. With the second approach a trajectory between
the start and the goal is found (the green line). This trajectory does not look very good, since it contains
a lot of loops for both dof. The duration of this trajectory is 0.9019 [s], which is indeed far from the
optimum.
Figure 6.9: State space plots of both dof independently with the tree structure grown by the standard
RRT* algorithm. The blue dots are the vertices of the tree and the red lines are the edges that connect
the vertices. These are the results of a simulation of 1 hour with the standard RRT* algorithm in which
the newly sampled state is first connected to the nearest vertex.
The fact that a trajectory to the goal is found with the second approach and not with the first is in
itself not enough to conclude that the second approach is better. Closer inspection of the trees shows
that with the first approach 119 vertices are added in one hour, while with the second approach 135
vertices are added. This indeed shows that the second approach is better, so this approach will be used
for the simulation of 5000 iterations.
The results of the simulation with 5000 iterations of the RRT* algorithm are shown in Figure 6.11.
Figure 6.11a shows the trajectory found after 5000 iterations, and Figure 6.11b shows how the best
trajectory found by the algorithm improves during the growth of the tree. The trajectory of Figure
6.11a is not very good; it contains a lot of loops, and its cost, 0.9364 [s], is also quite high
Figure 6.10: State space plots of both dof independently with the tree structure grown by the standard
RRT* algorithm. The blue dots are the vertices of the tree and the red lines are the edges that connect
the vertices. These are the results of a simulation of 1 hour with the standard RRT* algorithm in which
the newly sampled state is connected to the k nearest neighbors.
Figure 6.11: a) State space plot of both dof decoupled in one plot. The trajectories of both dof show
a lot of loops, so the trajectory found is not very good. b) Convergence of the best trajectory found
over time.
compared to the optimal one. Figure 6.11b tells the same story: it shows that the best trajectory found
is only improved twice, which is very little for the given running time of the algorithm when compared
to the results of Section 5.5. It is remarkable that the trajectory to the goal was lost twice. This
happened through the rewiring process: one of the vertices on the trajectory to the goal was connected
at lower cost through a new vertex, but due to the rewiring the end point of the trajectory was no longer
close enough to the goal. The number of nodes in the tree after 5000 iterations (32 hours) is only 1018,
which is quite few, so it is no surprise that the trajectory found is not very good.
6.5.4 Conclusion
It is concluded that the standard RRT* algorithm with the numerical steering function described in
Section 6.3 does not perform well enough: no time-optimal trajectories can be found by it within
reasonable time. Therefore the algorithm should be improved before it is used for simulations with an
obstacle in the workspace.

The implementation of the RRT* algorithm for the one dof Resonating Arm was good enough, and it
is basically the same as the one used here. The reason that the current implementation is too slow is
that the state space of the two dof system is 4D instead of 2D for the one dof system, so many more
vertices are needed to cover this space well. Another reason is that the steering function for the two dof
system is about a factor of six slower than that of the one dof system. Since the steering function is
already implemented in a fast way, not much gain in speed can be expected from tuning it, so other
techniques have to be used to improve the RRT* algorithm for the two dof Resonating Arm.
6.6 Greedy RRT* Algorithm

6.6.1 Introduction
The conclusion of the previous section was that the RRT* algorithm in its current implementation is not
fast enough to perform the simulations within reasonable time. In this section an approach is followed
to overcome this problem. The basic RRT* algorithm is a general motion planning algorithm, so it can
deal with many kinds of systems, varying from vehicles driving in a workspace to high dof robotic arms
working in household environments. The motion planning problem for which it is used in this thesis is
relatively simple, since the RA has only two dof and the workspace contains only one obstacle. This
fact should be exploited to make the RRT* algorithm more suitable for the problem dealt with in this
thesis.
This is done by exploiting the relatively open workspace. The normal RRT* algorithm tries to connect
a newly sampled state to the nearest vertex in the tree, and tries to optimize the trajectory toward this
new vertex by also trying connections from the k other nearest neighbors. When a newly sampled state
can be connected directly to the start vertex, the trajectory toward this new vertex is already optimal,
so there is no need to check whether it can be reached from other vertices at lower cost. This is only
beneficial when the workspace is relatively open, so that there is a good chance that the connection to
the start vertex is successful.
Another way to exploit the relatively open workspace is to try to connect to the goal state from every
new vertex. In this way extra emphasis is placed on the optimization of the trajectory to the goal, and
not only on the optimization of all the vertices in the tree. This extra emphasis is called greedy, which
is why this section is named ‘Greedy RRT* Algorithm’.
6.6.2 Method
The modification of the standard RRT* algorithm was already discussed in the introduction of this
section. In short, the algorithm first tries to connect the new sample directly to the start vertex. When
that is possible, it skips the first optimization step and goes directly to the second optimization step.
As part of this second optimization step it is checked whether the goal state can be reached through the
new vertex. Because of this it is no longer needed to use the goal as the new sample for a certain
percentage of the time; as a result, a cutoff distance is not needed either. The distance metric and
sampling region are the same as those in the previous section.
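The greedy modification can be sketched as follows (Python; an illustrative simplification of the planner, with `steer` and `knn` as hypothetical stubs and vertices stored as (state, cost-to-come) pairs):

```python
def greedy_extend(tree, x_new, start, goal, steer, knn):
    """One iteration of the greedy modification (a sketch, not the thesis code).
    steer(a, b) returns the cost of a trajectory from a to b, or None on failure."""
    c0 = steer(start[0], x_new)
    if c0 is not None:
        # Reached straight from the start vertex: the trajectory to x_new is
        # already optimal, so the search for a cheaper parent can be skipped.
        new_vertex = (x_new, c0)
    else:
        # Fall back on the standard parent search among the k nearest vertices.
        best = None
        for state, cost in knn(tree, x_new):
            c = steer(state, x_new)
            if c is not None and (best is None or cost + c < best):
                best = cost + c
        if best is None:
            return None, None              # sample could not be connected at all
        new_vertex = (x_new, best)
    tree.append(new_vertex)
    # The greedy part: always try to reach the goal from the new vertex, so the
    # trajectory to the goal is re-optimised on every successful extension.
    cg = steer(x_new, goal)
    goal_cost = new_vertex[1] + cg if cg is not None else None
    return new_vertex, goal_cost
```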
The simulation in this section tests whether the performance of the greedy RRT* algorithm is good
enough for the simulations with obstacles. Only one simulation is done, and it should not take too long,
since a very long simulation is not necessary to evaluate the performance of the algorithm.
Test the performance of the greedy RRT* algorithm

Settings                 Value
Start                    [-0.88, 0, 0, 0]
Goal                     [0.88, 0, 0, 0]
RRT* algorithm
  # iterations           1000
  # nearest neighbors    50

6.6.3 Results
The simulation was run for 1000 iterations; this took about 22 hours. The cost of the trajectory found
is given in Figure 6.12. The left plot shows the convergence over the whole simulation time and the
right plot zooms in on the first 6000 seconds. The right plot shows that after a trajectory to the goal
was found, it was improved four times, but during the last 20 hours no improvement was made. This is
because the chance of sampling a state that can improve the trajectory becomes increasingly small as
the trajectory gets closer to the optimal one.
Figure 6.12: Cost of the best trajectory found so far. Left: the whole simulation time; right: zoomed in
on the first 6000 seconds.
The final trajectory found has a cost of 0.606 [s], which is still 9.13% away from the optimal trajectory.
This is still not good enough. It is, however, an improvement over the standard algorithm used in the
previous section, which after 22 hours returned a trajectory of 0.997 [s].
To inspect the performance of the algorithm and the effects that hamper it, a record was kept of all
calls to the steering function and whether the function returned a feasible trajectory. In total 46547
calls to the steering function were made and only 1900 of them returned a feasible trajectory. So the
steering function returns a feasible trajectory only 4.08% of the time. This is very low and is probably
the reason that the algorithm still does not function very well.
A record was also kept of how many times vertices in the tree are optimized by a connection from the
newly added vertex. In total this was tried 6962 times, and in 136 cases it was successful. This is 1.95%,
which seems very low, but considering that the steering function only succeeds 4.08% of the time, 1.95%
is actually quite good. If the total number of calls to the steering function (46547) is compared to the
number of calls used to optimize vertices in the tree (6962), it is clear that not much time is spent on
optimization of the tree. About four times more connection attempts are made while trying to connect
new vertices to the tree in an optimal way than are used to optimize vertices in the tree. This is probably
caused by the fact that many of the newly sampled states are not reachable at all given the limits on
torque and angle. The reason for this is that the sampling region is a hyper-rectangle in the 4D space,
whereas the real region of states that can be reached given the joint limits probably has an odd, irregular
shape.
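The bookkeeping and the reported percentages can be reproduced with a small counter (Python; the class is a hypothetical illustration, while the call counts are the ones reported above):

```python
class SteerStats:
    """Simple bookkeeping of steering-function calls, mirroring the record
    kept during the greedy RRT* run (a hypothetical helper class)."""
    def __init__(self, calls=0, successes=0):
        self.calls, self.successes = calls, successes

    def record(self, success):
        self.calls += 1
        self.successes += int(success)

    @property
    def success_rate(self):
        """Percentage of calls that returned a feasible trajectory."""
        return 100.0 * self.successes / self.calls

all_calls = SteerStats(calls=46547, successes=1900)   # every steering call
rewiring  = SteerStats(calls=6962,  successes=136)    # tree-optimization calls
```

With these counts, `all_calls.success_rate` evaluates to about 4.08% and `rewiring.success_rate` to about 1.95%, matching the figures reported above.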
6.6.4 Conclusion
The first conclusion is that exploiting the relatively open workspace is successful, since it results in
better convergence of the RRT* algorithm. Secondly, the performance of the RRT* algorithm is still
not good enough, since it takes 22 hours to get within 10% of the optimal trajectory. Since the greedy
RRT* performs better than the standard RRT* for the problem dealt with in this thesis, the greedy
RRT* will be used in the simulations in the remainder of this chapter.

The main reason for the bad performance of the algorithm is that the steering function only returns a
feasible trajectory 4.08% of the time, which is obviously too low. One reason for the steering function
to fail so often is that states are sampled that are not reachable at all given the limits on the maximum
angles of the Resonating Arm. Next to that, the steering function itself also seems to be part of the
problem.
6.7 Adjusting the Steering Function

6.7.1 Introduction
The RRT* algorithm with the adjustment described in the previous section still does not perform well
enough to find the time-optimal trajectories of the Resonating Arm around obstacles. The main reason
for this bad performance is that the steering function fails too often. Part of that problem is caused by
the fact that the sampling region contains states that are unreachable. It is quite difficult to obtain an
accurate sampling region and to describe it in such a way that random samples can be taken from it.
It therefore seems more cost-effective to improve the steering function, which will be the focus of this
section.
The steering function used in the previous sections will be compared to other steering functions to see
if there are better alternatives. The focus of this thesis is not on numerical optimal control algorithms,
so the steering function must be relatively simple to implement. The steering functions that will be
compared are numerical optimizations in which either a piecewise constant or a piecewise linear control
signal is optimized to obtain a feasible trajectory. A number of other approaches were also tried, but
these were discarded after some quick tests:
• Genetic Algorithm with mixed integer programming [30]: used to obtain a bang-bang steering
function with a greater number of switches. The solution did not converge quickly enough.
• GPOPS [37]: uses polynomials to describe the control input and is therefore not well suited for
bang-bang control signals.
• DYNOPT [29]: a set of MATLAB functions for the determination of optimal control trajectories.
It only works with ODEs and not with DAEs.
• ACADO Toolkit [14]: a software environment and algorithm collection for automatic control and
dynamic optimization. It could not deal with the sign functions that result from the friction model.
6.7.2 Method
This section describes the three tests that are done. In the first test six steering functions are compared.
The second test checks whether the best-performing steering function of the first test returns trajectories
that are close to the time-optimal ones. The third test checks which percentage of the optimal trajectories
crosses the joint limits.
Comparing Six Different Steering Functions
The steering functions that will be compared are actually three different optimization approaches; for
each of them two different settings are used, which makes six steering functions in total. As already
mentioned in the introduction of this section, the steering function derived in Section 6.3 will be used,
as well as two optimizations in which a piecewise constant and a piecewise linear control signal,
respectively, are optimized. For the piecewise constant control ten control intervals are used and for
the piecewise linear control ten control points. This results in 21 design variables for the optimization,
since there are two actuators and an extra design variable is used to scale the control intervals/distance
between the control points. 21 design variables may seem like a lot, but enough control intervals/points
are needed for the control to be approximated well enough.
The different steering functions that will be compared are the following:
• Bang-bang with three switches as in Section 6.3.
• Bang-bang with three switches as in Section 6.3, but now with the GlobalSearch function of
MATLAB [30] for all of the 32 optimizations.
• Piecewise constant control signal with ten control intervals whose length is scaled with an extra
design variable. The GlobalSearch function of MATLAB is used with a step size of 0.01 [s] for the
integration of the equations of motion.
• Piecewise constant control signal with ten control intervals whose length is scaled with an extra
design variable. The GlobalSearch function of MATLAB is used with a step size of 0.001 [s] for
the integration of the equations of motion.
• Piecewise linear control signal with ten control points where the time between the points is scaled
with an extra design variable. The GlobalSearch function of MATLAB is used with a step size of
0.01 [s] for the integration of the equations of motion.
• Piecewise linear control signal with ten control points where the time between the points is scaled
with an extra design variable. The GlobalSearch function of MATLAB is used with a step size of
0.001 [s] for the integration of the equations of motion.
The steering functions will be compared for a hundred different sets of start and goal states. These sets
are generated by forward integration of the equations of motion with random controls, so it is guaranteed
that a feasible trajectory exists for every set. For all six steering functions MATLAB's constrained
optimization function fmincon is used as the optimization algorithm.
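The piecewise-constant parameterization and the cost function Fval = ||xgoal − x(T)|| + 5t from the settings tables can be sketched as follows (Python; a 1-dof double integrator replaces the RA dynamics, a simple random search replaces fmincon with GlobalSearch, and only one actuator is used, so this sketch has 11 instead of 21 design variables):

```python
import random

U_MAX = 1.0    # stand-in torque bound (Tmax in the tables)
N = 10         # number of control intervals

def simulate_pwc(u, dt, h=0.005):
    """Piecewise-constant control: u[i] is held for dt seconds. The plant is a
    1-dof double integrator, a stand-in for the RA equations of motion."""
    pos = vel = 0.0
    for ui in u:
        for _ in range(max(1, round(dt / h))):
            vel += ui * h
            pos += vel * h
    return pos, vel

def fval(design, x_goal=(1.0, 0.0)):
    """Cost from the tables: Fval = ||x_goal - x(T)|| + 5 t, with t = dt."""
    *u, dt = design
    pos, vel = simulate_pwc(u, dt)
    return ((x_goal[0] - pos) ** 2 + (x_goal[1] - vel) ** 2) ** 0.5 + 5.0 * dt

def optimize(iters=300, seed=1):
    """Random search standing in for fmincon with GlobalSearch; the design
    vector is N control values plus the interval length dt, as in the tables."""
    rng = random.Random(seed)
    best = [0.0] * N + [0.1]               # initial guess from the tables
    best_f = fval(best)
    for _ in range(iters):
        cand = [max(-U_MAX, min(U_MAX, v + rng.gauss(0, 0.2))) for v in best[:N]]
        cand.append(max(0.01, min(0.2, best[N] + rng.gauss(0, 0.01))))
        f = fval(cand)
        if f < best_f:                     # keep only improvements
            best, best_f = cand, f
    return best, best_f
```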
1. Bang-Bang with three switches (BB)

Settings                 Value
number of runs           100
optimizations per run    32 (8x4)
optimization             fmincon
design variables         t1 , t2 , t3 , t4
cost function            Fval = ||xgoal − xstart || + 0.1(t1 + t2 + t3 + t4 )
bounds                   [0-1], [0-1], [0-1], [0-1]
initial guess            0.1, 0.1, 0.1, 0.1
h                        0.01
algorithm                active-set
TolFun                   10^-4
2. Bang-Bang with three switches and GlobalSearch (BB-GS)

Settings                 Value
number of runs           100
optimizations per run    32 (8x4)
optimization             fmincon with GlobalSearch
design variables         t1 , t2 , t3 , t4
cost function            Fval = ||xgoal − xstart || + 0.1(t1 + t2 + t3 + t4 )
bounds                   [0-1], [0-1], [0-1], [0-1]
initial guess            0.1, 0.1, 0.1, 0.1
h                        0.01
algorithm                active-set
TolFun                   10^-4
3. Piecewise Constant control input with h = 0.01 (PWC-1)

Settings                 Value
number of runs           100
optimization             fmincon with GlobalSearch
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.01
algorithm                active-set
4. Piecewise Constant control input with h = 0.001 (PWC-2)

Settings                 Value
number of runs           100
optimization             fmincon with GlobalSearch
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.001
algorithm                active-set
5. Piecewise Linear control input with h = 0.01 (PWL-1)

Settings                 Value
number of runs           100
optimization             fmincon with GlobalSearch
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.01
algorithm                active-set
6. Piecewise Linear control input with h = 0.001 (PWL-2)

Settings                 Value
number of runs           100
optimization             fmincon with GlobalSearch
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.001
algorithm                active-set
Optimality of the Trajectories Found by the Steering Function

As announced, the optimality of the trajectories of the best performing steering function will be checked.
This is done by comparing the results of the simulations described above with new simulations. In these
new simulations again a piecewise constant and a piecewise linear control profile with ten intervals/points
are optimized, but now a multistart approach is followed with 1000 start points. This is done for the
first 10 sets of start and goal states of the simulations described above. After that, a simulation is done
with the better performing of the piecewise constant and linear controls, using a multistart with 5000
start points, for the first two sets of start and goal states.
1. Piecewise constant control input with h = 0.001 (PWC-MS1)

Settings                 Value
number of runs           10
optimization             fmincon with MultiStart
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.001
algorithm                active-set
startpoints              1000
2. Piecewise linear control input with h = 0.001 (PWL-MS1)

Settings                 Value
number of runs           10
optimization             fmincon with MultiStart
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.001
algorithm                active-set
startpoints              1000
3. Piecewise constant/linear control input with h = 0.001

Settings                 Value
number of runs           10
optimization             fmincon with MultiStart
design variables         u1,1...10 , u2,1...10 , t
cost function            Fval = ||xgoal − xstart || + 5t
bounds                   [-Tmax,1 , +Tmax,1 ], [-Tmax,2 , +Tmax,2 ], [0.01, 0.2]
initial guess            0, 0, 0.1
h                        0.001
algorithm                active-set
startpoints              5000
Percentage of Trajectories that Cross Joint Limits
The simulations described above were done to find a good steering function and to check how close the trajectories of that steering function are to the optimal ones. These simulations were done without checking for joint limits. That is why the simulations described below are done. For these simulations 100 sets of start and goal states are created by forward integration of the equations of motion with random controls. This is done for four different maximum time lengths of this forward integration. In that way it is checked which percentage of the trajectories fails because a joint limit is hit, and it is also checked what the effect of shorter distances between the start and goal states is.
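The generation of such start/goal sets can be sketched as follows. This is a hedged sketch under simplifying assumptions: a one dof double integrator stands in for the arm's equations of motion, JOINT_LIM and T_MAX are illustrative values, and the joint limit is enforced by clipping rather than by the real arm kinematics.

```python
import numpy as np

H = 0.01               # integration step size
T_MAX = 2.0            # illustrative torque bound
JOINT_LIM = np.pi / 3  # illustrative joint limit
N_INTERVALS = 20       # random-control intervals per trajectory

def random_pair(t_max_total, rng):
    """Forward-integrate random torques; return a (start, goal) state pair."""
    x = np.array([rng.uniform(-JOINT_LIM, JOINT_LIM), 0.0])
    start = x.copy()
    for _ in range(N_INTERVALS):
        u = rng.uniform(-T_MAX, T_MAX)
        t_int = rng.uniform(0.0, t_max_total / N_INTERVALS)
        for _ in range(int(t_int / H)):
            x = x + H * np.array([x[1], u])              # Euler step
            x[0] = np.clip(x[0], -JOINT_LIM, JOINT_LIM)  # stay inside limits
    return start, x.copy()

rng = np.random.default_rng(1)
pairs = [random_pair(2.0, rng) for _ in range(100)]  # 100 sets, 2 [s] max
```

Repeating this with a smaller total time budget (1, 0.5 and 0.2 [s]) yields the start/goal sets that lie progressively closer together in time.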
1. Bang-bang with three switches (BB) with check for joint limits

Settings                                 Value
maximum time lengths of start-goal sets  2 / 1 / 0.5 / 0.2 [s]
number of runs                           100 for all four sets
optimizations per run                    8x4 = 32
optimization                             fmincon
design variables                         t1, t2, t3, t4
cost function                            Fval = ||xgoal − xstart|| + 0.1(t1 + t2 + t3 + t4)
bounds                                   [0, 1], [0, 1], [0, 1], [0, 1]
initial guess                            0.1, 0.1, 0.1, 0.1
h                                        0.01
algorithm                                active-set
TolFun                                   10^-4
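The bang-bang steering function of this table can be sketched as follows. This is a hedged sketch, not the thesis implementation: a one dof double integrator replaces the arm dynamics (so each constant-torque phase can be integrated exactly instead of with step h), fmincon is replaced by scipy's SLSQP, and T_MAX and the start/goal states are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

T_MAX = 2.0  # illustrative torque bound

def simulate_bangbang(durations, x0):
    """Apply u = +T, -T, +T, -T for durations t1..t4 (three switches)."""
    pos, vel = x0
    sign = 1.0
    for t in durations:
        u = sign * T_MAX
        pos += vel * t + 0.5 * u * t * t  # exact for constant torque
        vel += u * t
        sign = -sign
    return np.array([pos, vel])

def cost(durations, x_start, x_goal):
    """Fval = ||xgoal - xend|| + 0.1 (t1 + t2 + t3 + t4)."""
    x_end = simulate_bangbang(durations, x_start)
    return np.linalg.norm(x_goal - x_end) + 0.1 * np.sum(durations)

x_start = np.array([0.0, 0.0])
x_goal = np.array([0.5, 0.0])  # illustrative rest-to-rest motion of 0.5 rad
res = minimize(cost, x0=[0.1, 0.1, 0.1, 0.1], args=(x_start, x_goal),
               method="SLSQP", bounds=[(0.0, 1.0)] * 4)
```

Only the four switch durations are optimized, which is why this steering function is so much cheaper than the piecewise control profiles with 21 design variables.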
6.7.3
Results
Comparing Six Different Steering Functions
The results of the first test are given in Figure 6.13. This figure shows three bar charts with the average results of 100 runs of the different steering functions. The leftmost chart contains the average time of the trajectories found by the steering functions. The middle chart contains the success percentage of the steering functions and the right chart shows the average calculation time that was needed by the algorithm.
Figure 6.13: Different steering functions compared on the basis of the average time of the trajectories found, the percentage of the time that a trajectory is found, and the calculation time. BB = bang-bang control, BB-GS = bang-bang with global search, PWC-1 = piecewise constant with h=0.01, PWC-2 = piecewise constant with h=0.001, PWL-1 = piecewise linear with h=0.01, PWL-2 = piecewise linear with h=0.001.
The left chart shows that the best trajectories were found with the bang-bang steering functions. The one with GlobalSearch was slightly better. The piecewise constant returned significantly worse trajectories and the piecewise linear did even worse. The smaller step size (h = 0.001 instead of h = 0.01) for the integration of the equations of motion resulted in worse behavior.
The middle chart shows that both bang-bang steering functions found a trajectory in 100% of the cases. With only 100 sets of states it cannot be concluded whether bang-bang control with three switches is enough to reach every state when joint limits are neglected, but this result does suggest that. The other steering functions also perform quite well, with a success rate of at least 94%. The differences between the success percentages are too small for further conclusions.
The chart of the calculation time shows a huge difference between the steering functions. The piecewise constant and linear with an integration step size of 0.001 [s] show a calculation time of almost 400 [s] per set of states. This is about a factor 8 more than the same steering functions with an integration step size of 0.01 [s]. The bang-bang without GlobalSearch has an average calculation time of only about 3 [s] and with GlobalSearch around 93 [s]. So there is quite some difference between the average calculation times.
In general it can be concluded that the bang-bang steering function without GlobalSearch performs best, since it finds nearly the best trajectories at the lowest calculation cost. This is the steering function that will be used for the next two tests. Another general conclusion is that an integration step size of 0.01 [s] is better for both the piecewise constant and the piecewise linear steering function.
Optimality of the Trajectories found by the Steering Function
Figure 6.14 shows the results where the trajectory found by the bang-bang steering function is compared with two other algorithms that used a multi-start approach with 1000 and 5000 start points. In the left bar chart the bang-bang steering function is compared with both the piecewise constant (PWC-MS1) and piecewise linear (PWL-MS1) multi-start steering functions with 1000 start points. The left chart shows ten runs, and only at the second run does the piecewise constant (PWC-MS1) steering function find a slightly better trajectory, while the computational cost of the multi-start algorithm is a factor 3660 greater.
Figure 6.14: a) Optimal trajectory found by the bang-bang steering function compared for 10 runs with the piecewise constant (PWC-MS1) and piecewise linear (PWL-MS1) multi-start optimization approaches with 1000 start points. b) Optimal trajectory found by the bang-bang steering function compared with the piecewise constant multi-start with 1000 start points (PWC-MS1) and the piecewise constant multi-start with 5000 start points (PWC-MS2) for two runs.
In the right chart the first two runs are plotted again, but the piecewise linear (PWL-MS1) is left out and the piecewise constant multi-start with 5000 start points (PWC-MS2) is plotted next to it. For the first run the trajectories have almost the same cost as that of the bang-bang steering function (BB); for the second run both piecewise constant multi-start steering functions are better, but the 5000 start point multi-start is only 3% better, while the computational cost is a factor 1.22e4 greater.
Thus it can be concluded that the trajectories returned by the bang-bang steering function are normally very close to the time-optimal trajectories, and again it is shown that the bang-bang steering function is computationally very efficient.
Percentage of Trajectories that Cross Joint Limits
With the previous tests it was shown that the bang-bang steering function is actually quite good, since it returned (near) time-optimal trajectories and it always found a trajectory if crossing of the joint limits was not prohibited. In the previous section it was found that a feasible trajectory was returned only 4.08% of the time, so this should at least partly be caused by crossing of the joint limits.
Figure 6.15: Bang-bang steering function success percentage for 100 runs, with different maximum times between the sets of vertices between which the trajectories are calculated.
It was tested what the success percentage is when crossing of the joint limits is not allowed. The states between which the trajectories should be returned were generated by forward integration of the equations of motion. This integration was done for 20 time intervals with random continuous control for each interval. The length of each interval was also varied. The maximum duration of the time intervals was set to four different values, which correspond to the four bars in Figure 6.15. The maximum durations of the 20 time intervals were: 0.1, 0.05, 0.025 and 0.01 [s].
Figure 6.15 shows that for a maximum time duration of 2 [s], the success percentage is only 14%. For 1 [s] this is 18%, which is still quite low. For a maximum of 0.5 [s] it is 41%, which is much better. For a maximum of 0.2 [s] it is 82%. This shows that the closer the states are (in time), the better the chance of a successful trajectory between them.
6.7.4
Conclusion
The tests performed in this section showed that the bang-bang steering function as described in Section 6.3 is quite good. It finds better trajectories than the other steering functions and does this at least
109 times faster. It was shown that the bang-bang steering function finds (near) time-optimal trajectories.
The last test, however, showed that the (near) time-optimal trajectories are likely to cross one of the joint limits. For two random states in the state space the chance of a successful trajectory is only 14%. When the distance in time between two states is smaller, the chance of a successful trajectory increases. So the implementation of a cutoff distance is needed to increase the success percentage, but then a distance metric is needed that gives a good representation of the actual distance in time between two states.
6.8
Adjusting the Distance Metric
6.8.1
Introduction
In the previous section it was concluded that a better distance metric is needed so that the success percentage of the steering function can be increased. This is needed because otherwise a lot of connection attempts are done that are doomed to fail, and the connection attempts that fail cost the same amount of time as the ones that succeed. Thus in this section the focus will lie on finding a better distance metric.
Figure 6.16: Trajectory in configuration space between two states that are very close in terms of euclidean
distance.
The distance metric that was used so far is the euclidean distance in state space. This is the simplest distance metric, but it is also quite bad, since for two points in state space that are very close to each other according to this metric, a long trajectory can be needed to get from one to the other. Figure 6.16 shows such a situation, where the distance is only 0.05, but a trajectory of 0.4871 [s] is needed. Thus the euclidean distance is not a good distance metric. This is a well-known problem [13], but there is no simple solution to it. It is even stated by LaValle and Kuffner [24] that:
“In theory, there exists a perfect metric (or pseudometric due to asymmetry) ... This
is the optimal cost (for any criterion, such as time, energy, etc.) to get from one state to
another. Unfortunately, computing the ideal metric is as difficult as solving the original
planning problem.”
Thus a pseudometric is needed, since the cost-to-go function that is needed is not symmetric and therefore does not meet the formal requirements for a metric. The problem of finding such a good pseudometric is quite challenging, but an improvement of the pseudometric does lead to substantial performance benefits [24]. The goal in this section will be to find a better, but relatively simple, pseudometric.
6.8.2
Method
In order to simplify the problem of finding a good pseudometric, first a one dof model is considered. This model contains a link that can rotate around a joint (comparable to the one dof Resonating Arm). In Figure 6.17a the time-optimal trajectory between S1 and S2 for this model is plotted in state space. In Figure 6.17b a state (S1) is located. This state can be reached from the whole state space, but as was shown in Figure 6.16, from some states a long trajectory with forward and backward motions is needed. The states from which S1 can be reached without forward and backward motion are the states within the blue dashed region. This blue dashed region is bounded by the boundaries of the state space and by the maximum acceleration and maximum deceleration curves of the system. The states within this region have relatively low-cost trajectories toward S1. The states with negative velocity need to move backward and forward to reach S1, so this in general costs more time. The states with positive velocity that are to the right of the blue dashed region need to move forward-backward-forward to reach S1, so this will normally cost even more time. A remark has to be made that the velocity of the state is also of great importance, since this determines the time that is needed to slow down and reverse the direction of motion.
Figure 6.17: a) Time-optimal trajectory in state space between state S1 and state S2. b) Maximum positive and negative torque lines through state S1. These lines, together with the limits of the state space, create a blue dashed region from where S1 can be reached with only forward motion. The red shaded region contains the states that can be reached from S1 with only forward motion.
The same reasoning can be applied to the states that can be reached from state S1. To this case the red shaded region applies. This region is also bounded by the maximum acceleration and deceleration curves of the system. So a good pseudometric would take these curves into account. The problem now is that for the two dof case the state space is 4D and the steepness of these curves depends on the other dof, since the two dof influence each other's inertia. So these curves become complex 4D surfaces for the two dof case. This is comparable to the problem of defining an accurate sampling domain, which was discussed in Section 6.6. So it is very difficult to define a good pseudometric analytically. A solution could be to learn a pseudometric with some learning algorithm. This, however, is not that straightforward, since the trajectory from one state to another is normally different from the return trajectory. So in that case an algorithm that can deal with that should be used, or multiple pseudometrics should be learned for different cases. So this is not a good approach either.
The simplified approach that is followed in this research is that the pseudometrics for the dof are decoupled. In that way a pseudometric can be created for each dof individually and the total pseudometric can be the sum of these two. Figure 6.17b is used again as the situation for one dof. The steepness of the curve is determined by the other dof, so it could be chosen to use the steepest, the flattest, or the average curve. Since a curved line as a border is difficult to use, it is decided to use a vertical line instead. For the case where it is the goal to reach S1, the states to the right of the state get a penalty. When it is the goal to reach a certain state from S1, the states to the left of this state get a penalty. When the state S1 has a negative velocity, this is precisely the other way around. When the velocity is zero, no penalty is given to any state.
This penalty is given on top of the euclidean distance. The penalty is implemented in two ways:
• Delete all the nodes that are at the wrong side of the state (infinitely high penalty).
• A combined penalty based on the angle with the vertical line through the state and on the velocity of the corresponding state. Figure 6.18 shows how this pseudometric works. The greater the angle α, the greater the penalty, varying from 1.0 to a certain maximum. The velocity ∆ω is scaled with the maximum possible velocity and results in a penalty varying between 1.0 and a certain maximum. The product of these two penalties is the total penalty that is given to a certain state. This penalty value is multiplied with the scaled euclidean distance and this gives the new distance.
Figure 6.18: The ‘complex’ penalty pseudometric uses the angle α and the angular velocity ∆ω as measures for the penalty when the distance from S2 to S1 should be calculated. This penalty is multiplied with the euclidean distance to come to a distance. The greater α and ∆ω are, the longer the trajectory that has to be followed to reach S1 and thus the greater the penalty.
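The two penalty pseudometrics can be sketched as follows for one dof, with a state given as (angle, angular velocity). This is a hedged sketch of one possible reading of the description above: the exact penalty curves are not specified in the text, so the linear interpolation between 1.0 and the maximum penalty, as well as the values of W_SCALE, OMEGA_MAX and the maximum penalties, are assumptions.

```python
import numpy as np

W_SCALE = 0.01       # scaling of the angular velocity in the distance
OMEGA_MAX = 15.0     # assumed maximum possible angular velocity [rad/s]
MAX_PEN_ANGLE = 3.0  # assumed maximum penalty on the angle
MAX_PEN_OMEGA = 3.0  # assumed maximum penalty on the velocity

def simple_filter(s_goal, candidates):
    """'Simple' penalty: delete candidates on the wrong side of the goal."""
    theta_g, omega_g = s_goal
    if omega_g > 0:    # moving in positive direction: right side is wrong
        return [s for s in candidates if s[0] <= theta_g]
    if omega_g < 0:    # mirrored for negative velocity
        return [s for s in candidates if s[0] >= theta_g]
    return list(candidates)  # zero velocity: no penalty

def complex_distance(s_from, s_to):
    """'Complex' penalty: scaled euclidean distance times two penalties."""
    d_theta = s_to[0] - s_from[0]
    d_omega = (s_to[1] - s_from[1]) * W_SCALE
    eucl = np.hypot(d_theta, d_omega)
    # penalty on the angle alpha with the vertical line through s_to
    alpha = abs(np.arctan2(abs(d_theta), d_omega))  # in [0, pi]
    pen_angle = 1.0 + (MAX_PEN_ANGLE - 1.0) * alpha / np.pi
    # penalty on the velocity difference, scaled by the maximum velocity
    ratio = min(1.0, abs(s_to[1] - s_from[1]) / (2.0 * OMEGA_MAX))
    pen_omega = 1.0 + (MAX_PEN_OMEGA - 1.0) * ratio
    return eucl * pen_angle * pen_omega
```

For the two dof arm the total pseudometric would be the sum of two such one dof terms, as described above.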
Now there are three different (pseudo)metrics that can be used: the euclidean distance, the pseudometric with the ‘simple’ penalty, and the pseudometric with the ‘complex’ penalty, in which the angle to the goal state and the scaled angular velocity are used. The (pseudo)metrics will be tested in the following way:
1. A tree structure with 2000 vertices is created.
2. 20 random states (that are reachable) are selected.
3. For each of these 20 states the 50 nearest neighbors are returned by the different (pseudo)metrics.
4. For each of these 50 NN the steering function is called to return the optimal trajectory from these 50 NN to the corresponding state.
5. These results are stored and will be used to compare the different distance metrics.
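The test procedure above can be sketched as follows. This is a hedged sketch: `metric` and `steer` are placeholders for the (pseudo)metric under test and for the bang-bang steering function, which is assumed to report whether a trajectory was found and at what cost.

```python
import numpy as np

def evaluate_metric(vertices, states, metric, steer, k=50):
    """Steer from the k nearest vertices (under `metric`) to each state."""
    successes, costs = 0, []
    for s in states:
        d = np.array([metric(v, s) for v in vertices])
        for idx in np.argsort(d)[:k]:           # k nearest neighbors
            ok, cost = steer(vertices[idx], s)  # steering attempt
            if ok:
                successes += 1
                costs.append(cost)
    success_rate = successes / (len(states) * k)
    avg_cost = float(np.mean(costs)) if costs else float("inf")
    return success_rate, avg_cost
```

With a tree of 2000 vertices, 20 states and k = 50 this reproduces the 20 x 50 steering attempts per (pseudo)metric described above.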
Three tests will be done:
1. The euclidean distance will be tested for six different scaling factors to check which gives the best result. These scaling factors are used to scale the angular velocity w.r.t. the angle. The scaling factors that are used are: 0.01, 0.05, 0.1, 0.2, 0.5, 1. This test is done since the scaled euclidean distance is also used in the two pseudometrics. In this way the best performing scaling can be used for the other two pseudometrics.
2. The complex penalty pseudometric is tested for six different settings of the maximum penalties. These settings are 1.5/1.5, 2.0/2.0, 2.5/2.5, 3.0/3.0, 2.0/1.5, and 1.5/2.0. These numbers are the maximum penalty factors for the angle and for the angular velocity.
3. The best performing scaling of the euclidean distance, the best performing complex penalty pseudometric and the simple penalty pseudometric are compared, to check which (pseudo)metric can best be used.
6.8.3
Results
The result of the euclidean distance with different scaling factors is given in Figure 6.19. This figure shows that the percentage of success increases as the scaling of the angular velocity becomes smaller. The average cost of the trajectories also seems to decrease as the scaling of the angular velocity becomes smaller, but the lowest cost trajectories were found with 0.05 as scaling factor. So it is hard to decide whether a scaling factor of 0.01 or 0.05 is better, since both the percentage of success and the cost of the trajectories are of importance. The scaling factor of 0.01 is chosen to be used for further simulations.
Figure 6.19: a) Average percentage of success of the trajectories between 20 different states and their 50
nearest neighbors for different scaling factors of the angular velocity. b) Average cost of these trajectories
for different scaling factors of the angular velocity.
The results of the simulations with the complex penalty pseudometric can be found in Figure 6.20. These two bar charts show that the best setting for this metric is with both maximum penalties set to three. The bar charts show that the pseudometric becomes better if the penalties are increased. The higher the penalties are, the more the ‘complex’ penalty pseudometric approaches the ‘simple’ penalty pseudometric. The best result of the pseudometric of Figure 6.20 is compared with the ‘simple’ penalty pseudometric and the euclidean distance in Figure 6.21. This figure shows that the ‘simple’ penalty pseudometric is by far the best, since it has both a higher success percentage and a lower average cost of the trajectories found. Therefore this pseudometric will be used in the remainder of this thesis.
Figure 6.20: a) Average percentage of success of the trajectories between 20 different states and their 50 nearest neighbors for the ‘complex’ penalty pseudometric with different penalty factors. The first number denotes the penalty on the angle with the goal state and the second the penalty on the angular velocity. b) Average cost of these trajectories with different penalty factors.
Figure 6.21: a) The euclidean distance and the ‘simple’ and ‘complex’ penalty functions are compared. The ‘simple’ penalty function, which is labeled as ‘cutoff’ in the plot, performs the best. Its success percentage is 79.3%, while that of the euclidean distance is 77.9% and that of the ‘complex’ penalty function (labeled as ‘penalty’) is only 77.6%. b) The ‘simple’ penalty function also performs the best in terms of average time of the trajectories.
6.8.4
Conclusion
The goal of this section was to find a distance metric that represents the cost-to-go from one state to another better than the euclidean distance. Due to the asymmetry of such a representation, not a formal metric but a pseudometric is needed. The problem of finding the ideal pseudometric is as difficult as solving the original planning problem. Therefore a good but relatively simple pseudometric is needed.
In this section the two dof were decoupled to simplify the problem. Two pseudometrics were invented that penalize states that are at the ‘wrong’ side of the state for which the nearest neighbors are sought. The ‘simple’ penalty pseudometric simply deletes these states and uses the euclidean distance for the others. The ‘complex’ penalty pseudometric penalizes the states on the basis of the angle with the vertical line through the state and on the angular velocity. This penalty is multiplied with the euclidean distance.
Simulations with different scalings of the euclidean distance showed that a scaling factor of 0.01 gives the highest percentage of success and a scaling factor of 0.05 gives the best trajectories. The first was chosen to be used. Simulations of the complex penalty pseudometric with different settings showed that the pseudometric performs better when the penalties are higher. Simulations that compared the euclidean distance and both penalty pseudometrics showed that the simple penalty pseudometric performed better than the two other metrics. Therefore this pseudometric will be used in the remainder of this thesis.
6.9
Final Simulation with the RRT* Algorithm with Improved Pseudometric
6.9.1
Introduction
This section contains the simulations with the RRT* algorithm which was improved and tailored for the two dof Resonating Arm. Some small adjustments to the algorithm will be tested and this will result in the final version of the RRT* algorithm used in this thesis, since no time is left to further improve the algorithm.
This final version of the RRT* algorithm is first tested without the obstacle to check its performance. After that, simulations are done with the obstacle present in the workspace. The location of the obstacle will be varied to investigate the effect of the obstacle on the time-optimal trajectories.
6.9.2
Method
The previous simulations with the RRT* algorithm were done in Section 6.6 with the greedy RRT* algorithm. In the previous section a new pseudometric was introduced that should improve the performance of the (greedy) RRT* algorithm. Therefore first a simulation will be done to compare the results with the new pseudometric to those of Section 6.6.
After that, simulations will be done to test the effect of a smaller sampling region and the effect of a cutoff distance. A last simulation without obstacle will be done to check the performance of the final algorithm by comparing the trajectory that is found with the time-optimal one from Section 6.3.3.
The final simulations are done for four different locations of the obstacle. These simulations are used to
investigate the effect of the obstacle on the time-optimal trajectories.
The Performance of the RRT* Algorithm with the Improved Pseudometric
This simulation will be done to compare the performance of the RRT* algorithm with the improved pseudometric to that of the euclidean distance metric. This will be done by comparing the success percentages of both simulations. The results of this simulation will also be used to compare it with the next two simulations, where the sample region is adjusted and where a cutoff distance is implemented.
Settings               Value
Start                  (-0.88, 0, 0, 0)
Goal                   (0.88, 0, 0, 0)
RRT* algorithm
  # iterations         5000
  # nearest neighbors  50
  sample domain        [(−π/3, π/3), (−130π/180, 130π/180), (−15, 15), (−20, 20)]
Effect of a Smaller Sampling Region on the Performance of the RRT* Algorithm
The sampling region is limited by the maximum angles, which are a property of the system that cannot be changed. The sampling region is also limited by the maximum angular velocities of both joints. These velocities were determined by applying random controls to check what the maximum velocities are that can be reached with the given joint limits. These maximum velocities (ω1,max = ±15 [rad/s], ω2,max = ±20 [rad/s]) are much higher than the maximum velocities that are reached during the time-optimal motion. Therefore the angular velocity limits of the sampling region are set to ω1,max = ±9 [rad/s] and ω2,max = ±11 [rad/s]. The simulation that is done is described in detail below.
Settings               Value
Start                  (-0.88, 0, 0, 0)
Goal                   (0.88, 0, 0, 0)
RRT* algorithm
  # iterations         5000
  # nearest neighbors  50
  sample domain        [(−π/3, π/3), (−130π/180, 130π/180), (−9, 9), (−11, 11)]
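Sampling a random state from this reduced 4D domain can be sketched as follows; the domain bounds are taken from the table above.

```python
import numpy as np

# Reduced sampling domain: (theta1, theta2, omega1, omega2)
DOMAIN = [(-np.pi / 3, np.pi / 3),
          (-130 * np.pi / 180, 130 * np.pi / 180),
          (-9.0, 9.0),
          (-11.0, 11.0)]

def sample_state(rng):
    """Draw one uniform random state from the 4D sampling domain."""
    return np.array([rng.uniform(lo, hi) for lo, hi in DOMAIN])

rng = np.random.default_rng(0)
state = sample_state(rng)
```

Shrinking the velocity bounds in this way does not change the sampler itself; it only concentrates the samples in the reachable part of the state space.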
Effect of a Cutoff Distance on the Performance of the RRT* Algorithm
It was concluded in Section 6.8.4 that a cutoff distance is needed to improve the success percentage of the steering function. Therefore a cutoff distance is implemented here. The results of this simulation will be compared with those of the previous simulation. The sampling region is the same as in the previous simulation. The details of the simulation are:
Settings               Value
Start                  (-0.88, 0, 0, 0)
Goal                   (0.88, 0, 0, 0)
RRT* algorithm
  # iterations         5000
  # nearest neighbors  50
  sample domain        [(−π/3, π/3), (−130π/180, 130π/180), (−9, 9), (−11, 11)]
  cutoff distance      1.0
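The effect of the cutoff distance on the algorithm can be sketched as follows: before the steering function is called, neighbors farther away than the cutoff (under the pseudometric) are discarded, so that connection attempts that are likely to fail are skipped. The function below is a minimal sketch of that filtering step.

```python
CUTOFF = 1.0  # cutoff distance from the table above

def connection_candidates(neighbors, distances, cutoff=CUTOFF):
    """Keep only the neighbors within the cutoff (pseudometric) distance."""
    return [n for n, d in zip(neighbors, distances) if d <= cutoff]
```

Since every skipped steering call saves a full trajectory optimization, this filter is where the large drop in calculation time observed in the results comes from.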
Finding a (Near) Time-Optimal Trajectory
Now that the algorithm is adjusted to its final form, it is necessary to check whether the RRT* algorithm with its current settings is capable of finding trajectories that are close to the time-optimal one. Therefore the result of this simulation will be compared to the time-optimal trajectory that was found in Section 6.3.3. In order to find a good trajectory, the number of nearest neighbors is increased from 50 to 100. In this way the algorithm optimizes the vertices of the tree better, at the cost of a longer computation time. The details of the simulation are given in the table below.
Settings               Value
Start                  (-0.88, 0, 0, 0)
Goal                   (0.88, 0, 0, 0)
RRT* algorithm
  # iterations         5000
  # nearest neighbors  100
  sample domain        [(−π/3, π/3), (−130π/180, 130π/180), (−9, 9), (−11, 11)]
  cutoff distance      1.0
Find the (Near) Time-Optimal Trajectories Around an Obstacle
Finding the time-optimal trajectories around obstacles is the main challenge of this thesis. The simulations that will be done to obtain these trajectories are described here. It is, however, the question whether the simulations will result in (near) time-optimal trajectories, since the calculation time is limiting the convergence of the solution.
The obstacle is a circle which is described by the position of its center and its radius. The radius is the same for all the simulations and is 0.2 [m] + 0.05 [m] (safety region) = 0.25 [m]. This obstacle is positioned at four different locations to check its effect on the trajectory that the arm describes. These locations are: [0.8, 0], [0.9, 0], [0.8, 0.2] and [0.8, -0.2].
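Checking whether the arm hits this circular obstacle can be sketched as follows. This is a hedged sketch: each link is treated as a line segment and a collision is reported when a segment comes within the safety-inflated radius of the obstacle center; the link lengths L1 and L2 are illustrative values, not the real dimensions of the Resonating Arm.

```python
import numpy as np

L1, L2 = 0.5, 0.5    # illustrative link lengths [m]
RADIUS = 0.2 + 0.05  # obstacle radius plus safety region [m]

def point_segment_dist(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    ab, ap = b - a, p - a
    t = np.clip(np.dot(ap, ab) / np.dot(ab, ab), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def arm_collides(theta1, theta2, center):
    """True when either link of the planar 2-link arm hits the obstacle."""
    base = np.zeros(2)
    elbow = L1 * np.array([np.cos(theta1), np.sin(theta1)])
    tip = elbow + L2 * np.array([np.cos(theta1 + theta2),
                                 np.sin(theta1 + theta2)])
    c = np.asarray(center, dtype=float)
    return (point_segment_dist(c, base, elbow) < RADIUS or
            point_segment_dist(c, elbow, tip) < RADIUS)
```

A check of this kind would be evaluated along every candidate trajectory, so that edges whose motion sweeps through the obstacle are rejected.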
The settings of the algorithm are the same as for the previous simulation, except for the sampling region. The time-optimal motions for the system with the obstacle in the workspace are not known. Therefore the maximum angular velocities are set a bit higher, because it cannot be said beforehand that the maximum angular velocity for the workspace with obstacle will be equal to or lower than that of the situation without the obstacle. The maximum negative angular velocity of the upper arm is set lower, because the upper arm does not have to move backward very fast. This results in a non-symmetric sampling region, but this is no problem. This could also have been done for the earlier simulations, but for the sake of simplicity a symmetric sampling region was used in those. The resulting details of the four simulations are given below.
Settings               Value
Start                  (-0.88, 0, 0, 0)
Goal                   (0.88, 0, 0, 0)
Obstacle
  position             [0.8, 0] / [0.9, 0] / [0.8, 0.2] / [0.8, -0.2]
  radius               0.25 [m]
RRT* algorithm
  # iterations         5000
  # nearest neighbors  100
  sample domain        [(−π/3, π/3), (−130π/180, 130π/180), (−5, 12), (−14, 14)]
  cutoff distance      1.0
6.9.3
Results
The Performance of the RRT* Algorithm with the Improved Pseudometric
The cost of the best trajectory found during the simulation with the new pseudometric is given in Figure
6.22. This result is compared with that of the greedy RRT* algorithm with the euclidean distance metric
of Section 6.6. The best trajectory found is comparable, 0.606 [s] with the old metric and 0.6196 [s] with
the new pseudometric. This does not say much, because the algorithm uses random samples, so with one
lucky sample a much better trajectory can be found. The calculation time is also comparable 22 hours
with the old metric compared to 17 hours with the new. The difference however is the success percentage
of the connection attempts. This is 6.00% with the new pseudometric and 4.08% with the old metric. So
this is an improvement of about 50% which is quite substantial, but 6.00% is still very low.
Figure 6.22: Best trajectory found over time with the RRT* algorithm with improved pseudometric.
Effect of a Smaller Sampling Region on the Performance of the RRT* Algorithm
The second simulation was done with a smaller sample region. In that way less states should be sampled
that are not reachable, so that the success percentage should increase. The result of this indeed shows a
higher success percentage of 18.5% which is three times better than the result of the previous simulation.
In Figure 6.23 the cost of the best trajectory found over time is given. The final trajectory found is
comparable to that of the previous simulation. The same applies to the calculation time of 16 hours.
Figure 6.23: Best trajectory found over time with the RRT* algorithm with improved pseudometric and
adjusted sample region.
Effect of a Cutoff Distance on the Performance of the RRT* Algorithm
The next adjustment was the implementation of a cutoff distance. With the new pseudometric this should
increase the success rate of the steering function. That is indeed the case, since the success percentage is
27.3%. The calculation time of 2.5 hour is also much lower than that of the previous simulation. The cost
of the best trajectory found over time is given in Figure 6.24. The cost of the final trajectory is 0.5934
[s] which is better than all previous simulations. So the RRT* algorithm with these settings performs
much better than the previous implementations. Therefore the next simulations will be done with these
settings.
Figure 6.24: Best trajectory found over time with the RRT* algorithm with improved pseudometric,
adjusted sample region and cutoff distance of 1.
Finding a (Near) Time-Optimal Trajectory
The last simulation without obstacle is done to check how good the performance of the current implementation of the RRT* algorithm is. The number of nearest neighbors is increased from 50 to 100. This gives the result as given in Figure 6.25. The cost of the final trajectory is 0.5682 [s], which is 2.32% more than the (near) time-optimal one found in Section 6.3. The calculation time is increased from 2.5 to almost 43 hours and the success percentage has decreased to 22.1%. A final trajectory within 2.5% of the optimal trajectory with 43 hours of calculation is acceptable. Therefore these are the settings that are used for the simulations with the obstacle.
Figure 6.25: Best trajectory found over time with the RRT* algorithm with improved pseudometric, adjusted sample region and cutoff distance of 1. Simulation with 5000 iterations and 100 nearest neighbors.
Find the (Near) Time-Optimal Trajectories Around an Obstacle
In the method it was mentioned that simulations were done for four different locations of the obstacle. The first simulation, of which the result is given in Figures 6.26 and 6.27, is done with an obstacle in front of the RA. The distance between the base of the RA and the obstacle is 0.55 [m]. The states w.r.t. time and the control w.r.t. time are given in Figure 6.26. The control (Figure 6.26b) contains eleven switches in total and the result of this is quite a spiky angular velocity profile (Figure 6.26a) for both dof.
Therefore the best trajectory found is probably not time-optimal and not even near time-optimal. This is also suggested by the trajectory in configuration space, which is shown in Figure 6.27a, since this trajectory is not smooth. The trajectory in the workspace, which is given in Figure 6.27b, does show that the RA moves very closely around the obstacle, so that would suggest that this trajectory is not that bad. Figure 6.27b shows that the RA performs a ‘push-motion’, while Figure 6.27a shows that a ‘pull-motion’ is also possible, since the trajectory can also go along the bottom of the obstacle. The time of the best trajectory found is 0.9209 [s], which is much longer than the time-optimal trajectory of 0.5553 [s].
Figure 6.26: a) States of the RA against time for the best trajectory found around an obstacle located in
front of the RA at a distance of 0.55 [m]. b) Torque profile of the two actuators for the trajectory around
the obstacle.
Figure 6.27: a) The trajectory in the configuration space which goes around the obstacle (black dots).
b) The corresponding movement in the workspace with the RA that is connected to its base at [0,0] and
consists of the two blue links. The RA moves from left to right around the obstacle (black dots).
For the second simulation the obstacle was again placed in front of the RA, but at a distance of 0.65 [m]. Figure 6.28b shows the control with seven switches. This results in a smoother angular velocity profile, which can be seen in Figure 6.28a. The corresponding trajectories in configuration space and workspace are given in Figure 6.29. The trajectory in configuration space is much smoother than that of the previous simulation, which suggests that the cost of this trajectory is closer to that of the time-optimal trajectory. The time of this trajectory is 0.6416 [s], which is about 15.5% higher than that of the time-optimal trajectory without the obstacle. Thus this trajectory is probably near time-optimal and it can be concluded that this obstacle location has less influence on the time-optimal trajectory than the obstacle location of the previous simulation.
Figure 6.28: a) States of the RA against time for the best trajectory found around an obstacle located in
front of the RA at a distance of 0.65 [m]. b) Torque profile of the two actuators for the trajectory around
the obstacle.
Figure 6.29: a) The trajectory in the configuration space which goes around the obstacle (black dots).
b) The corresponding movement of the RA in the workspace.
For the third simulation the obstacle is shifted to the right by 0.2 [m] with respect to the location of the obstacle in the first simulation. The control signals and states w.r.t. time are given in Figure 6.30. The control has six switches and the angular velocity profiles are relatively smooth. The best motion found is again the ‘push-motion’, which is expected, because the obstacle location makes the ‘pull-motion’ impossible. Figure 6.31a shows this, since there is no way to go along the bottom of the obstacle. The trajectory in configuration space passes quite loosely around the obstacle, so the trajectory is probably not near time-optimal. The time length of the trajectory is 0.7513 [s].
Figure 6.30: a) States of the RA against time for the best trajectory found around an obstacle located
to the right of the RA. b) Torque profile of the two actuators for the trajectory around the obstacle.
Figure 6.31: a) The trajectory in the configuration space which goes around the obstacle (black dots).
b) The corresponding movement of the RA in the workspace.
The last simulation is with the obstacle shifted to the left by 0.2 [m] compared to the first simulation with the obstacle. Figure 6.32a shows that the angular velocity of the first dof (ω1) crosses the zero line three times, which means that the upper arm moves forward and backward. This is in general not a very fast movement, so the trajectory is probably not near time-optimal. This is also illustrated by Figure 6.33. Figure 6.33a shows a curve where the trajectory goes around the obstacle, which does not look optimal. The workspace plot of Figure 6.33b shows that the RA does not move very tightly around the obstacle, which is another indication that the motion is not near time-optimal.
Figure 6.32: a) States of the RA against time for the best trajectory found around an obstacle located
to the left of the RA. b) Torque profile of the two actuators for the trajectory around the obstacle.
Figure 6.33: a) The trajectory in the configuration space which goes around the obstacle (black dots).
b) The corresponding movement of the RA in the workspace.
6.9.4 Conclusion
The simulation with the new pseudometric gave a better result than with the Euclidean distance as metric. The percentage of successful trajectories found by the steering function increased to 6%, which is almost 50% better than the previous success percentage of 4.08%.
The knowledge of the time-optimal trajectory for the RA without obstacles was used to adjust the sampling region. This resulted in a success percentage of 18.5%, an increase of more than 300%. This suggests that the success percentage can be increased even further with a better sampling region, because the current sampling region is still a hyperrectangle, while the true reachable region is an abstract 4D shape.
Another improvement is the implementation of a cutoff distance. This results in fewer connection attempts between states that are far away from each other. Such pairs of states are more likely to result in trajectories with self-collision, so the cutoff improves the success percentage to 27.3%, again an increase of almost 50%.
With the modifications discussed above, a simulation with 5000 iterations and 100 nearest neighbors was done to test the performance of the algorithm. This results in a simulation of 43 hours, which is long but feasible. The best trajectory found has a cost of 0.5682 [s], which is only 2.32% more than the cost of the time-optimal trajectory. So the RRT* algorithm with the current settings is able to find a considerably good solution within reasonable time.
The simulations with obstacles seem to result in trajectories that are not very close to the time-optimal trajectories. It is, however, difficult to tell how close they are, since the time-optimal trajectories are not known. To have more certainty about the quality of the trajectories, longer simulations are needed. What can be concluded from the trajectories is that for obstacles in front of or to the right of the RA the ‘push-motion’ is faster. For obstacles to the left of the RA, at a certain point the ‘pull-motion’ becomes the faster one. It can also be concluded that a bang-bang controller with at most eleven switches is sufficient to move around the obstacles used in this thesis. The time-optimal trajectories for the RA seem to contain singular arcs when the RA has to move around obstacles.
6.10 Conclusion
The two dof Resonating Arm is a complex dynamical system due to the coupling of the two dof. The steering function used in this chapter relies on the optimization of a bang-bang controller with three switches. This approach was chosen instead of a more general approach, in which the control signal consists of a number of points, functions or time intervals that are optimized, because the optimization of a bang-bang controller with three switches is relatively cheap and literature indicates that it could give good results.
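The structure of such an optimization can be sketched. The thesis implementation is in MATLAB and uses the full RA dynamics; the Python sketch below substitutes a double integrator and a fixed initial control sign (both assumptions) to show the idea: the three switch times parameterize a bang-bang signal, a rollout turns them into a terminal state, and the terminal-state error is the objective an optimizer would minimize.

```python
import math

def bang_bang_control(t, switches, u_max=1.0):
    """Bang-bang control with three switch times: the control starts at
    +u_max and flips sign at each switch time (an assumption; the real
    steering function also has to choose the initial sign)."""
    sign = 1.0
    for ts in switches:
        if t >= ts:
            sign = -sign
    return sign * u_max

def simulate(switches, T, x0=(0.0, 0.0), dt=1e-4):
    """Forward-Euler rollout of a double integrator, a stand-in for the
    far more complex RA dynamics."""
    x, v = x0
    t = 0.0
    while t < T:
        u = bang_bang_control(t, switches)
        x += v * dt
        v += u * dt
        t += dt
    return x, v

def terminal_error(switches, T, goal=(1.0, 0.0)):
    """Objective a steering-function optimizer would minimize: distance
    between the rolled-out terminal state and the goal state."""
    x, v = simulate(switches, T)
    return math.hypot(x - goal[0], v - goal[1])

# With a single effective switch at t = 1 (the remaining switches pushed to
# the horizon), the system accelerates for 1 s and decelerates for 1 s,
# ending near x = 1 with v = 0, so the error is close to zero.
err = terminal_error([1.0, 2.0, 2.0], T=2.0)
```

An outer optimizer (in the thesis, over the switch times and the trajectory duration) would repeatedly evaluate such an objective.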
The steering function was used to calculate the (near) time-optimal trajectories for the two dof RA. This
resulted in two distinct motion patterns of which one is slightly better. The best motion costs 0.5553 [s]
and is called the ‘push-motion’ since the lower arm is pushed in front of the upper arm. The ‘pull-motion’
took 0.5726 [s]. It cannot be guaranteed that the ‘push-motion’ is the fastest motion, but since no better
result could be found with GPOPS it is very likely that this motion is indeed the optimal one.
Simulations showed that the performance of the basic RRT* algorithm (as implemented in this thesis) is very poor for the two dof RA. This was caused by the relatively slow steering function and the 4D state space in which the trajectory has to be found. Therefore the algorithm was adjusted and a more greedy variant was implemented. This variant tries to connect new samples directly to the start and goal vertex. In this way the quality of the trajectory between start and goal can be improved faster when the workspace is relatively ‘open’ (no obstacles). This resulted in a faster convergence of the optimal trajectory. However, the algorithm was still too slow to be useful. Further inspection showed that the steering function finds a feasible trajectory in only 4.08% of the cases. This is very low and probably the reason for the poor performance.
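The greedy connection step can be illustrated with a deliberately simplified stand-in: a 2D point with straight-line ‘steering’ and Euclidean cost instead of the RA's dynamics and time cost (all assumptions). Every new sample is connected directly to start and goal, and the best two-segment path is kept.

```python
import math, random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_connect(start, goal, n_samples, seed=0):
    """For every random sample, greedily evaluate the two-segment path
    start -> sample -> goal. The thesis variant additionally maintains
    the ordinary RRT* tree; this sketch keeps only the greedy step."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_samples):
        s = (rng.random(), rng.random())          # sample in the unit square
        cost = dist(start, s) + dist(s, goal)     # time cost in the thesis
        best = min(best, cost)
    return best

start, goal = (0.0, 0.0), (1.0, 1.0)
best = greedy_connect(start, goal, 200)
# best approaches the straight-line cost dist(start, goal) = sqrt(2)
```

In an open workspace this directly improves the start-to-goal trajectory with every sample, which is why convergence speeds up.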
The bang-bang steering function was compared with other steering functions, in which a control signal with ten piecewise linear or piecewise constant control intervals was optimized. These tests showed that the bang-bang steering function performs very well. It is much faster than the other two approaches and the trajectory it finds is almost always better. Simulations showed a 100 percent success rate of the bang-bang steering function when the joint limits are not checked. When the joint limits are taken into account, this success rate drops enormously. Simulations show that the success rate is higher when the trajectories are shorter. A distance metric is therefore needed that better represents the time distance between states, so that better nearest neighbors can be selected.
The problem of finding a good distance metric was shown to be very complex. Two new distance metrics were created that decouple the two dof. Both penalize states that are on the ‘wrong side’ of the query state. Simulations showed that the simpler penalty function, which deletes states on the ‘wrong side’ from the list of nearest neighbors, works best. With this distance metric the final simulations with the obstacle are done.
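The exact pseudometric is not reproduced here, but its general shape can be sketched. In the hypothetical criterion below, a candidate is on the ‘wrong side’ when one of its angular velocities points away from the query angle; the thesis metric differs in detail. The cutoff distance is folded into the same selection step.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def wrong_side(state, query):
    """Hypothetical per-dof test on states (q1, q2, w1, w2): a candidate
    whose velocity moves it away from the query angle is on the
    'wrong side'. The thesis pseudometric differs in detail."""
    for q_s, w_s, q_q in ((state[0], state[2], query[0]),
                          (state[1], state[3], query[1])):
        if (q_q - q_s) * w_s < 0.0:   # moving away from the query angle
            return True
    return False

def nearest_neighbors(query, candidates, k, cutoff):
    """Select up to k nearest neighbors, deleting candidates that lie
    beyond the cutoff distance or on the 'wrong side' of the query."""
    kept = [c for c in candidates
            if euclidean(c, query) <= cutoff and not wrong_side(c, query)]
    kept.sort(key=lambda c: euclidean(c, query))
    return kept[:k]

query = (1.0, 1.0, 0.0, 0.0)
candidates = [
    (0.5, 0.5, 1.0, 1.0),    # moving toward the query: kept
    (0.5, 0.5, -1.0, 1.0),   # dof 1 moves away: wrong side, deleted
    (9.0, 9.0, 1.0, 1.0),    # beyond the cutoff: deleted
]
nn = nearest_neighbors(query, candidates, k=5, cutoff=2.0)
```

Filtering before the expensive steering call is what raises the success percentage: the steering function is only asked for connections that are likely to be short and feasible.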
Simulations without obstacles showed that the new pseudometric performs much better than the Euclidean distance, especially when the sample region is adjusted and a cutoff distance is set. As a result, the steering function returned a feasible trajectory in 27.3% of the cases, an increase of 570% compared to the results of Section 6.6. With these adjustments the RRT* algorithm was able to find a trajectory within 2.5% of the time-optimal trajectory within 43 hours.
The algorithm with the same settings was also used for the simulations with obstacles. The obstacle was placed at four different locations in the workspace. The trajectories found did not seem to be near time-optimal, but this is hard to conclude since the time-optimal trajectories are not known. The optimality of the trajectories found can be verified by running the same simulations for much longer times. The trajectories found with the RRT* algorithm did show that a finite number of switches is enough to move around obstacles. It was also shown that the location of the obstacle determines whether a ‘push-motion’ or ‘pull-motion’ is better.
The RRT* algorithm used for the final simulations with obstacles was too slow to return (near) time-optimal trajectories within the available time. It is, however, possible to obtain (near) time-optimal trajectories when the simulation time is long enough. Another way to obtain them is by increasing the performance so that the (near) time-optimal trajectories are found within reasonable time. The performance can be increased by generating better samples and by increasing the success percentage of the steering function. Better samples can be obtained by creating a better sampling region and using more advanced sampling techniques that do not take purely random samples. The success percentage of the steering function can be increased by using an optimization algorithm that can deal with path constraints. In that way joint limits can be incorporated as path constraints and no separate checking for self-collision is needed anymore. A requirement is that this optimization algorithm is at least as fast as the currently implemented one.
Chapter 7
Conclusion and Recommendations
The Resonating Arm is an energy efficient robotic arm which is designed for pick-and-place tasks. It is
the goal of this thesis to find the time-optimal trajectories between the pick and place locations. The
research question is: “What are the time-optimal trajectories of the Resonating Arm?”. The time-optimal
trajectories for both the one and two dof RA can be found with a straightforward optimization, but when
obstacles are present in the workspace another approach is needed. The Rapidly-exploring Random Tree
Star (RRT*) algorithm was used in this research to find these time-optimal trajectories.
7.1 Conclusion
In Chapter 4 simulations were done for a linear one dof robotic arm that is comparable to the one dof RA without the spring mechanism and friction. These simulations showed that the RRT* algorithm is capable of finding time-optimal trajectories. For this system an analytical steering function exists, which gives results quite fast. However, the simulation times are still much longer than the duration of the optimal trajectory between start and goal. The RRT* algorithm is therefore probably not fast enough to find time-optimal trajectories in real-time for a fast robotic arm. The time-optimal trajectory for this linear robotic arm showed a symmetric curve in the state space, as expected: the trajectory consists of a maximum acceleration phase followed by a maximum deceleration phase. The time of this trajectory was 0.3353 [s].
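For such a linear arm the analytic steering result is the classic double-integrator one: a rest-to-rest move of distance d under acceleration bound u_max takes t* = 2·√(d/u_max), with the switch exactly halfway. The sketch below uses illustrative numbers, since the arm's actual limits are not restated here.

```python
import math

def min_time(d, u_max):
    """Minimum time for a rest-to-rest move of distance d under the
    acceleration bound u_max: t* = 2 * sqrt(d / u_max)."""
    return 2.0 * math.sqrt(d / u_max)

def state(d, u_max, t):
    """Analytic state (x, v) along the time-optimal trajectory: maximum
    acceleration up to t*/2, maximum deceleration afterwards, giving the
    symmetric curve in the state space."""
    T = min_time(d, u_max)
    if t <= T / 2.0:
        return 0.5 * u_max * t * t, u_max * t
    s = t - T / 2.0
    v_peak = u_max * T / 2.0
    return d / 2.0 + v_peak * s - 0.5 * u_max * s * s, v_peak - u_max * s

T = min_time(1.0, 1.0)             # 2.0 s for d = 1 m, u_max = 1 m/s^2
x_end, v_end = state(1.0, 1.0, T)  # (1.0, 0.0): the goal is reached at rest
```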
The simulations with the one dof Resonating Arm were more challenging. Two steering functions were proposed and tested: an approximated analytical steering function and a numerical steering function. The numerical steering function consisted of a bang-bang control that was optimized. This steering function performed better than the approximated analytical one, so a numerical steering function was also used for the two dof RA. The time-optimal trajectory was no longer symmetrical due to the friction in the system, but as for the linear system, the arm is first accelerated with the maximum torque and then decelerated with the maximum negative torque. The time of this trajectory is 0.3290 [s], which is faster than that of the system without the spring mechanism. Thus the spring mechanism improved the system in terms of speed.
For the two dof RA a numerical steering function was used that optimized a bang-bang control with three switches. The steering function was also used ‘standalone’ to find the time-optimal trajectory for the workspace without obstacles. This resulted in a motion in which the end-effector leads the elbow, labeled the ‘push-motion’. The time of this trajectory is 0.5553 [s]. A slightly slower motion was also found, in which the elbow leads the end-effector; this motion was labeled the ‘pull-motion’. It cannot be proven that the ‘push-motion’ of 0.5553 [s] is the time-optimal motion, but since no better motion could be found with GPOPS [37] it was assumed that this is the time-optimal motion, or at least a near time-optimal one.
The basic RRT* algorithm with the numerical bang-bang steering function was tested for the workspace without obstacles. This resulted in very poor performance. In order to improve the performance, the RRT* algorithm was made more greedy, so that it tries to connect every new state directly to the start state and the goal state. This gave a slightly better result, but close inspection showed that the steering function returned a feasible trajectory in only 4.08% of the cases. In order to find a steering function with a higher success percentage, different steering functions were compared. Next to the bang-bang steering function, two numerical steering functions were tested that optimize a piecewise constant or piecewise linear control signal. These tests showed that the bang-bang steering function did not always return the time-optimal trajectory between two states, but it did perform much better than the other two steering functions. It was found that the poor performance of the steering function was the result of time-optimal trajectories that cross one of the joint limits. The further apart two states are, the greater the chance that the time-optimal trajectory crosses one of the joint limits.
To overcome this problem, nearest neighbors that can be reached by short trajectories should be selected. This can be done by using a better distance metric, since the Euclidean distance does not represent the cost-to-go between two states very well. Two pseudometrics were created and tested. One of these metrics performed significantly better than the Euclidean distance, so this pseudometric was used for the rest of the simulations.
The RRT* algorithm with the new pseudometric was used for simulations. This resulted in a higher success percentage of 6%. This was still too low, so a better sampling region and a cutoff distance were implemented, which raised the success percentage to 27.3%. With these settings of the RRT* algorithm, a simulation without obstacles was done first. This resulted in a trajectory that was only 2.3% longer than the time-optimal trajectory. This was considered to be good enough, so the same settings were used for the simulations with the obstacle.
Four simulations were done, each representing a different location of the obstacle. These simulations resulted in trajectories around the obstacle, but the trajectories found did not seem to be (near) time-optimal. This is probably caused by the fact that the control signals of the time-optimal trajectories contain singular arcs, and many bang-bang segments are needed to approximate these singular arcs.
To conclude, the research question was only partially answered, since time-optimal trajectories were only found for the one and two dof RA in an obstacle-free workspace. The time-optimal trajectories for workspaces with obstacles can be found with the modified RRT* algorithm presented in this thesis, but a longer simulation time is needed than was available during this thesis.
7.2 Recommendations
More research is needed to find the time-optimal trajectories of the Resonating Arm around obstacles. The RRT* algorithm as implemented during this thesis was not able to find the time-optimal trajectories within reasonable time. That would be a reason to use a more general algorithm like dynamic programming, reinforcement learning or branch and bound. However, the performance of the RRT* algorithm can be improved so that it is able to find time-optimal trajectories within a reasonable amount of time.
The RRT* algorithm can be improved by using a better steering function. Such a steering function would
rely on a numerical method and would ideally be able to deal with path constraints and be faster than the
current implementation of the steering function. A multiple shooting algorithm could be a good option
to use for the steering function.
The algorithm can also be improved by using a more accurate sampling region. The sampling region used in this thesis was a hyperrectangle, which includes states that are not reachable by the system. A more accurate sampling region can be obtained by applying random controls during a simulation and storing the states that are reached. The convex hull of these states can then be used as the sampling region. In this way fewer unreachable states are sampled, so the algorithm becomes more efficient because it spends less calculation time on samples that are not useful. If there is some knowledge about what the trajectory should look like, this knowledge can also be used to adjust the sampling region so that better samples are selected.
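The recipe above can be sketched. The sketch below uses a double integrator as a stand-in for the RA dynamics and a 2D (position, velocity) state so that the convex hull is a simple polygon; for the real 4D state space a library hull routine (e.g. scipy's) would be needed.

```python
import random

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns the hull counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside(hull, p):
    """Point-in-convex-polygon test (hull counter-clockwise)."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= -1e-9
               for i in range(n))

def reachable_states(n_rollouts, n_steps, dt=0.05, u_max=1.0, seed=0):
    """Apply random controls to a double integrator (a stand-in for the
    RA dynamics) and record the visited (position, velocity) states."""
    rng = random.Random(seed)
    states = []
    for _ in range(n_rollouts):
        x = v = 0.0
        for _ in range(n_steps):
            u = rng.uniform(-u_max, u_max)
            x += v * dt
            v += u * dt
            states.append((x, v))
    return states

states = reachable_states(50, 40)
hull = convex_hull(states)

# Rejection-sample inside the hull: these samples form the sampling region.
rng = random.Random(1)
xs = [s[0] for s in states]
vs = [s[1] for s in states]
samples = []
while len(samples) < 20:
    p = (rng.uniform(min(xs), max(xs)), rng.uniform(min(vs), max(vs)))
    if inside(hull, p):
        samples.append(p)
```

Rejection sampling over the bounding box is a simple way to sample the hull uniformly; the acceptance rate directly measures how much wasted volume the hyperrectangular region contained.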
In this thesis it was shown that improving the distance metric has a great effect on the performance of
the algorithm. Techniques whereby the distance pseudometric is approximated [13] can be very useful
and improve the performance of the algorithm.
Next to that, techniques that were already used to improve the RRT algorithm can also be used to improve the performance of the RRT* algorithm. Examples of these techniques are lazy collision checking [38], inevitable collision states [10] and dynamic domains [43].
Appendix A
Parallel Spring Mechanism: Derivation of Relations
The relation between the torque and the angle and the relation between the potential energy and the
angle of the parallel spring mechanism are derived in this appendix. Before the relations can be derived
a model of the parallel spring mechanism is needed. This model is given in Figure A.1. This figure also
includes a table with the design variables. These design variables determine the relations.
Parameter                    Symbol   Value   Unit
Radius large pulley          r1       0.105   [m]
Radius small pulley          r2       0.02    [m]
Transfer ratio               R        5.4     [-]
Distance between pulleys     dl       0.1     [m]
Diameter of the axis         d        0.005   [m]
Spring stiffness             k        150     [N/m]
Initial spring length        l0       0.1     [m]
Pre-stress in the spring     F0       6       [N]

Figure A.1: Schematic drawing of the spring mechanism with the parameters of the model included in the table. This model is used to derive the potential energy and torque characteristic of Figure 2.3.

In order to calculate the potential energy stored in the spring mechanism, the elongation of the spring as a function of the rotation θ has to be calculated. This elongation is determined by calculating the distance between the two attachment points of the spring and then subtracting the free length of the spring. The attachment points of the spring are (x1, y1) and (x2, y2), which can be calculated by:

x1 = −r1 sin(θ)                          (A.1)
y1 = −r1 cos(θ)                          (A.2)
x2 = r2 sin(Rθ)                          (A.3)
y2 = −r1 − r2 − dl + r2 cos(Rθ)          (A.4)
The elongation of the spring is now given by:

Δl = √((x2 − x1)² + (y2 − y1)²) + d − l0          (A.5)

Substituting (A.1)–(A.4) gives:

Δl = √((dl + r1 + r2 − r2 cos(Rθ) − r1 cos(θ))² + (r2 sin(Rθ) + r1 sin(θ))²) + d − l0          (A.6)
The potential energy (Ep) of the system is now given by:

Ep = (k/2) Δl² + F0 Δl          (A.7)

Ep = (k/2) [d − l0 + √((dl + r1 + r2 − r2 cos(Rθ) − r1 cos(θ))² + (r2 sin(Rθ) + r1 sin(θ))²)]²
   + F0 [d − l0 + √((dl + r1 + r2 − r2 cos(Rθ) − r1 cos(θ))² + (r2 sin(Rθ) + r1 sin(θ))²)]          (A.8)
The torque that the spring mechanism exerts (Ts) can be calculated from:

Ep = ∫ −Ts dθ          (A.9)

Ts = −dEp/dθ          (A.10)

Ts = −(F0 + k(d + Δl − l0)) [(r1 + dl + r2)(r1 sin(θ) + R r2 sin(Rθ)) + (1 − R) r1 r2 sin((R − 1)θ)] / Δl          (A.11)
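As a numerical cross-check of these relations, the sketch below evaluates (A.6) and (A.7) with the parameter values from the table and obtains the torque by differentiating Ep numerically instead of using the closed form (A.11).

```python
import math

# Parameters from the table in Figure A.1
r1, r2, R = 0.105, 0.02, 5.4   # pulley radii [m], transfer ratio [-]
dl, d = 0.1, 0.005             # pulley distance, axis diameter [m]
k, l0, F0 = 150.0, 0.1, 6.0    # stiffness [N/m], free length [m], pre-stress [N]

def elongation(theta):
    """Spring elongation as in (A.6)."""
    a = dl + r1 + r2 - r2 * math.cos(R * theta) - r1 * math.cos(theta)
    b = r2 * math.sin(R * theta) + r1 * math.sin(theta)
    return math.sqrt(a * a + b * b) + d - l0

def potential_energy(theta):
    """Potential energy (A.7), including the pre-stress term."""
    dL = elongation(theta)
    return 0.5 * k * dL ** 2 + F0 * dL

def torque(theta, h=1e-6):
    """Spring torque Ts = -dEp/dtheta as in (A.10), via a central
    difference instead of the closed form (A.11), as a cross-check."""
    return -(potential_energy(theta + h)
             - potential_energy(theta - h)) / (2.0 * h)

# At theta = 0 the mechanism is in its symmetric rest pose:
dL0 = elongation(0.0)        # dl + d - l0 = 0.005 m
Ep0 = potential_energy(0.0)  # 0.5*150*0.005^2 + 6*0.005 = 0.031875 J
Ts0 = torque(0.0)            # ~0, since Ep is symmetric about theta = 0
```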
Bibliography
[1] Jérôme Barraquand and Jean-Claude Latombe. Robot Motion Planning: A Distributed Representation
Approach. The International Journal of Robotics Research, 10(6):628–649, 1991.
[2] R. Bellman. Dynamic Programming and Lagrange Multipliers. Proceedings of the National Academy
of Science, 42:767–769, 1956.
[3] V.G. Boltyanskii, R.V. Gamkrelidze, and L.S. Pontryagin. The theory of optimal processes. I. The maximum principle, 2005.
[4] V. Boor, M.H. Overmars, and A.F. van der Stappen. The gaussian sampling strategy for probabilistic roadmap planners. Robotics and Automation, 1999. Proceedings. 1999 IEEE International
Conference on, 2:1018 –1023 vol.2, 1999.
[5] M.S. Branicky, S.M. LaValle, K. Olson, and Libo Yang. Quasi-randomized path planning. Robotics
and Automation, 2001. Proceedings 2001 ICRA. IEEE International Conference on, 2:1481 – 1487
vol.2, 2001.
[6] A.E. Bryson and Y.C. Ho. Applied Optimal Control: Optimization, Estimation, and Control. Halsted
Press book’. Taylor & Francis, 1975.
[7] Ioan Şucan and Lydia Kavraki. Kinodynamic motion planning by interior-exterior cell exploration.
In Gregory Chirikjian, Howie Choset, Marco Morales, and Todd Murphey, editors, Algorithmic
Foundation of Robotics VIII, volume 57 of Springer Tracts in Advanced Robotics, pages 449–464.
Springer Berlin / Heidelberg, 2009.
[8] M. Diehl, H.G. Bock, H. Diedam, and P.-B. Wieber. Fast direct multiple shooting algorithms for
optimal robot control. In Moritz Diehl and Katja Mombaur, editors, Fast Motions in Biomechanics and Robotics, volume 340 of Lecture Notes in Control and Information Sciences, pages 65–93.
Springer Berlin / Heidelberg, 2006.
[9] L. E. Dubins. On curves of minimal length with a constraint on average curvature, and with
prescribed initial and terminal positions and tangents. American Journal of Mathematics, 79(3):pp.
497–516, 1957.
[10] T. Fraichard and H. Asama. Inevitable collision states. a step towards safer robots? Intelligent
Robots and Systems, 2003. (IROS 2003). Proceedings. 2003 IEEE/RSJ International Conference
on, 1:388 – 393 vol.1, oct. 2003.
[11] Emilio Frazzoli, Munther A. Dahleh, and Eric Feron. Real-time motion planning for agile autonomous
vehicles. AIAA JOURNAL OF GUIDANCE, CONTROL, AND DYNAMICS, 25:116–129, 2000.
[12] Hans P. Geering, Lino Guzzella, Stephan A. R. Hepner, and Christoper H. Onder. Time-optimal
motions of robots in assembly tasks. In Decision and Control, 1985 24th IEEE Conference on,
volume 24, pages 982 –989, dec. 1985.
[13] E. Glassman and R. Tedrake. A quadratic regulator-based heuristic for rapidly exploring state space.
In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages 5021 –5028,
may 2010.
[14] Boris Houska, Hans Joachim Ferreau, and Moritz Diehl. Acado toolkit - an open-source framework for
automatic control and dynamic optimization. Optimal Control Applications and Methods, 32(3):298–
312, 2011.
[15] J. Jeon, S. Karaman, and E. Frazzoli. Anytime computation of off-road vehicle maneuvers using the
RRT*. In IEEE Conference on Decision and Control (CDC), 2011.
[16] K Kant and S W Zucker. Toward efficient trajectory planning: the path-velocity decomposition. Int.
J. Rob. Res., 5(3):72–89, 1986.
[17] S. Karaman and E. Frazzoli. Optimal kinodynamic motion planning using incremental samplingbased methods. In IEEE Conference on Decision and Control (CDC), Atlanta, GA, December 2010.
[18] Sertac Karaman and Emilio Frazzoli. Incremental sampling-based algorithms for optimal motion
planning, 2010.
[19] Sertac Karaman and Emilio Frazzoli. Incremental sampling-based algorithms for optimal motion
planning. CoRR, abs/1005.0416, 2010.
[20] Sertac Karaman and Emilio Frazzoli. Sampling-based algorithms for optimal motion planning. Int.
J. Rob. Res., 30:846–894, June 2011.
[21] Lydia E. Kavraki, Petr Svestka, Jean-Claude Latombe, and Mark H. Overmars. Probabilistic roadmaps for path planning in high-dimensional configuration spaces. In IEEE International Conference
on Methods and Models in Automation and Robotics, 2005, pages 566–580, 1997.
[22] Y. Koren and J. Borenstein. Potential field methods and their inherent limitations for mobile robot
navigation. Robotics and Automation, 1991. Proceedings., 1991 IEEE International Conference on,
pages 1398 –1404 vol.2, apr. 1991.
[23] S. M. LaValle. Planning Algorithms. Cambridge University Press, Cambridge, U.K., 2006. Available
at http://planning.cs.uiuc.edu/.
[24] S.M. LaValle and J.J. Kuffner, Jr. Randomized kinodynamic planning. Robotics and Automation,
1999. Proceedings. 1999 IEEE International Conference on, 1:473 –479 vol.1, 1999.
[25] Steven M. LaValle. Rapidly-exploring random trees: A new tool for path planning. Technical report, Computer Science Dept., Iowa State Univ., 1998.
[26] Stephen R. Lindemann and Steven M. LaValle. Incrementally reducing dispersion by increasing
voronoi bias in rrts, 2004.
[27] Tomás Lozano-Pérez and Michael A. Wesley. An algorithm for planning collision-free paths among
polyhedral obstacles. Commun. ACM, 22(10):560–570, 1979.
[28] S.K. Lucas and C.Y. Kaya. Switching-time computation for bang-bang control laws. In American
Control Conference, 2001. Proceedings of the 2001, volume 1, pages 176 –181 vol.1, 2001.
[29] M. Čižniar, M. Fikar, and M.A. Latifi. A MATLAB package for dynamic optimisation of processes. 7th International Scientific Technical Conference PROCESS CONTROL 2006, 2006.
[30] MATLAB. version 7.14.0.739 (R2012a). The MathWorks Inc., Natick, Massachusetts, 2012.
[31] D.S. Naidu. Optimal control systems. Electrical engineering textbook series. CRC Press, 2003.
[32] A. Perez, S. Karaman, M. Walter, A. Shkolnik, E. Frazzoli, and S. Teller. Asymptotically-optimal
manipulation planning using incremental sampling-based algorithms. In IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS), 2011.
[33] M. C. Plooij and M. Wisse. Using a resonant mechanism to reduce energy consumption in robotic
arms, 2011.
[34] M.C. Plooij. Using a resonant mechanism to reduce energy consumption in robotic arms. Msc.
Thesis, TU Delft, 2011.
[35] Jens Clausen. Branch and bound algorithms: Principles and examples, 2003.
[36] Morgan Quigley, Brian Gerkey, Ken Conley, Josh Faust, Tully Foote, Jeremy Leibs, Eric Berger,
Rob Wheeler, and Andrew Ng. ROS: an open-source Robot Operating System. In Proc. of the IEEE
Intl. Conf. on Robotics and Automation (ICRA) Workshop on Open Source Robotics, Kobe, Japan,
May 2009.
[37] Anil V. Rao, David A. Benson, Christopher Darby, Michael A. Patterson, Camila Francolin, Ilyssa
Sanders, and Geoffrey T. Huntington. Algorithm 902: Gpops, a matlab software for solving multiplephase optimal control problems using the gauss pseudospectral method. ACM Trans. Math. Softw.,
37(2):22:1–22:39, April 2010.
[38] Gildardo Sanchez and Jean-Claude Latombe. A single-query bi-directional probabilistic roadmap
planner with lazy collision checking. In Robotics Research, volume 6, pages 403–417, 2003.
[39] H. R. Sirisena. A gradient method for computing optimal bang-bang controls. International Journal
of Control, 19(2):257–264, 1974.
[40] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction, 1998.
[41] L. G. van Willigenburg and R. P. H. Loop. Computation of time-optimal controls applied to rigid
manipulators with friction. International Journal of Control, 54(5):1097–1117, 1991.
[42] M. Yamamoto and A. Mohri. Planning of quasi-minimum time trajectories for robot manipulators
(generation of a bang-bang control). Robotica, 7(01):43–47, 1989.
[43] A. Yershova, L. Jaillet, T. Simeon, and S.M. LaValle. Dynamic-domain rrts: Efficient exploration
by controlling the sampling domain. Robotics and Automation, 2005. ICRA 2005. Proceedings of
the 2005 IEEE International Conference on, pages 3856 – 3861, apr. 2005.