Parallelism and Robotics: The Perfect Marriage

By R. Theron, F.J. Blanco, B. Curto, V. Moreno and F.J. Garcia
University of Salamanca, Spain
Rejitha Anand
CMPS 5433
INTRODUCTION
• Robotics is a fast-evolving area of scientific study
• Autonomous robots are programmed to achieve a specific objective
• Goal: formulate a sequence of intelligent movements
• Parallel processing techniques are needed to manage the heavy computational load
• Path planning: the search for an optimal route from an initial location to a desired one while avoiding collisions

TERMINOLOGIES
• Fourier transform – decomposes a waveform or function into sinusoids of different frequencies that sum to the original waveform
• Discrete Fourier transform (DFT) – numerical method to compute the Fourier transform of sampled data
• Fast Fourier transform (FFT) – an algorithm for computing the DFT that reduces the number of computations from N^2 to N log N
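The difference between a direct DFT and the FFT can be illustrated with a minimal sketch. It assumes a recursive radix-2 scheme and an input length that is a power of two; the function names `dft` and `fft` and the sample signal are illustrative and do not come from the paper.

```python
import cmath

def dft(x):
    """Direct DFT: roughly N^2 complex multiplications."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def fft(x):
    """Recursive radix-2 FFT (roughly N log N); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])   # transform of the even-indexed samples
    odd = fft(x[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * j / n) * odd[j]
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t
    return out

signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, 1.5]  # illustrative data
slow = dft(signal)
fast = fft(signal)
# Both routines compute the same transform; only the operation count differs.
max_error = max(abs(a - b) for a, b in zip(slow, fast))
```

The two halves of the recursion are independent of each other, which is one reason the FFT lends itself to parallel execution.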
TERMINOLOGIES (continued)
• Convolution – an integral that expresses the amount of overlap of one function g as it is shifted over another function f; it therefore blends one function with another
• Convolution theorem – states that the Fourier transform of a convolution is the product of the Fourier transforms of the individual functions
• Master-slave paradigm – the master task allocates the work. Each slave requests a piece of the problem from the master and executes it. Slaves can also send new tasks to the master for allocation to other slaves
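The convolution theorem can be checked numerically on a small discrete example. The sketch below assumes the discrete, circular (wrap-around) form of the convolution and uses a naive DFT for brevity; the sequences `f` and `g` are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Naive DFT (or inverse DFT) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolution(f, g):
    """Direct circular convolution: overlap of g shifted over f, with wrap-around."""
    n = len(f)
    return [sum(f[k] * g[(i - k) % n] for k in range(n)) for i in range(n)]

f = [1, 2, 3, 0, 0, 0, 0, 0]
g = [1, 1, 0, 0, 0, 0, 0, 0]

direct = circular_convolution(f, g)

# Convolution theorem: transform both, multiply point by point, transform back.
product = [a * b for a, b in zip(dft(f), dft(g))]
via_fourier = [v.real for v in dft(product, inverse=True)]
```

Both routes give the same sequence up to rounding error, but the Fourier route replaces the nested sums with a point-to-point product.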

KNOWING THE ENVIRONMENT
• Global planning – requires total knowledge of the environment
• Configuration space (C-space)
  - represents the workspace of the robot in terms of its configurations
  - the position and orientation of the robot are defined by a single point
• Evaluation of the C-space is computationally intensive

CALCULATION OF C-OBSTACLES
• Curto and Moreno proposed a mathematical formalism
  - the C-obstacles are evaluated through the convolution of two functions
  - one function describes the robot, A, and the other describes the obstacles, B
• Convolution theorem
  - states that Fourier transforms can be used to evaluate this convolution
• CB is the function that describes the C-obstacles
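The idea of obtaining the C-obstacles from the overlap of a robot function and an obstacle function can be sketched directly, without Fourier transforms. This is a minimal illustration that assumes a robot that only translates on a small 2D grid; the function `c_obstacles`, the grid and the robot shape are hypothetical and are not taken from Curto and Moreno's formulation.

```python
def c_obstacles(robot_cells, obstacles):
    """Overlap count CB[y][x] for every robot position (x, y).

    robot_cells: (dx, dy) offsets occupied by the robot relative to its
                 reference point (hypothetical robot description A).
    obstacles:   2D grid of 0/1 values (obstacle description B).
    A value greater than zero means the robot collides at that position.
    """
    rows, cols = len(obstacles), len(obstacles[0])
    cb = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            for dx, dy in robot_cells:
                px, py = x + dx, y + dy
                if 0 <= px < cols and 0 <= py < rows:
                    cb[y][x] += obstacles[py][px]
    return cb

workspace = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],   # single obstacle cell at x=2, y=2
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
robot = [(0, 0), (1, 0)]   # robot two cells wide along x

cb = c_obstacles(robot, workspace)
forbidden = sorted((x, y) for y in range(5) for x in range(5) if cb[y][x] > 0)
```

The nested loops are what makes the direct evaluation expensive; the convolution theorem allows the same overlap to be computed through Fourier transforms instead.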

TOWARDS A FASTER EVALUATION
Solving the equation
• Need to evaluate the integral of the product of Fourier transforms for N^(m-r) configurations and for the values of the space variables
  - N is the resolution chosen to describe the discrete workspace
  - m is the dimension of the C-space
  - r is the dimension of the Fourier transform
• Where a convolution does not exist, the integration that must be performed provides new opportunities for parallelization
THREE LEVELS OF PARALLELISM
• Fourier transform level (FFT): the Fourier transforms are computed with the FFT algorithm, which is inherently parallel
• Space variable level: for those variables where no convolution can be found (x_{r+1}, …, x_n), it is necessary to perform an integration, which may be processed in parallel
• Configuration variable level: for some configuration variables (q_1, …, q_r) the convolution must be performed; a trivial parallelization exists over the configuration variables where there is no convolution (q_{r+1}, …, q_m)
EXAMPLE
Consider a 2D mobile robot with three configuration variables:
- (x, y): describe the position and convolve with the workspace coordinates
- theta: defines the angle of rotation and does not convolve
• Two levels of parallelism are possible: the Fourier transform level and the configuration variable level (for the accumulation of convolution products over the values of theta)
MASTER SLAVE APPROACH
The FFT level is exploited. The master asks the slaves to perform the computation of a Fourier transform, and they proceed in parallel with the well-known FFT algorithm.
MASTER SLAVE APPROACH
At the configuration variable level there is again a master and n slave processes. The master asks each slave to evaluate a slice of the C-space, that is, the convolution product for one value of the configuration variable. The partial results are then gathered and the final C-space is built.
MASTER SLAVE APPROACH
A more complex model is one where a three-level solution is exploited. Sub-master processes organize the communication of partial results, which are then communicated to the master. This method can be seen as a single-level nested solution.
A CASE STUDY: IMPLEMENTING THE PARALLEL ALGORITHM
• Implemented to evaluate the configuration space of a mobile robot moving in a 3D workspace partially occupied by obstacles
• The robot has a planar movement: its position is defined by three configuration variables (x_r, y_r, theta_r)
• Expression for C-space evaluation
POSSIBILITIES
• First, the algorithm consists of iteratively performing a series of basic operations, one for each possible robot orientation theta_r (i.e., the configuration variable level)
• Second, each orientation needs to accumulate the convolution products for each plane z (i.e., the space variable level)
• These calculations are independent, so they can be done in parallel. The intermediate results are then integrated to form the final result (an instance of the data parallelism model).
MASTER
• The master communicates the tasks that need to be calculated to each of its slaves
• After receiving a result (the slice of the corresponding C-space) from a slave, the master assigns that slave a new task
• Finally, the master indicates to the slaves that no more calculations are required. During this process the master builds a three-dimensional array representing the C-space.
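The master's task-assignment loop can be sketched as follows. This is a minimal illustration that uses Python threads and queues in place of the MPI processes used in the paper; `compute_slice` is a hypothetical placeholder for the real per-orientation work, and `None` is an assumed end-of-execution signal.

```python
import queue
import threading

def compute_slice(task_id):
    """Hypothetical stand-in for evaluating one C-space slice."""
    return [task_id * 10 + i for i in range(3)]

def worker(tasks, results):
    """Slave loop: take a task, compute it, send the result back."""
    while True:
        task_id = tasks.get()
        if task_id is None:          # assumed end-of-execution signal
            break
        results.put((task_id, compute_slice(task_id)))

def master(num_tasks, num_workers):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()

    next_task = 0
    # Give every slave one initial task.
    for _ in range(min(num_workers, num_tasks)):
        tasks.put(next_task)
        next_task += 1

    c_space = [None] * num_tasks     # one slice per orientation
    for _ in range(num_tasks):
        task_id, slice_ = results.get()
        c_space[task_id] = slice_
        if next_task < num_tasks:    # a result came back: hand out a new task
            tasks.put(next_task)
            next_task += 1

    for _ in threads:                # tell every slave to stop
        tasks.put(None)
    for t in threads:
        t.join()
    return c_space

c_space = master(num_tasks=8, num_workers=4)
```

Handing out a new task only when a result arrives keeps faster slaves busy without the master having to know in advance how long each slice takes.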
SLAVES
• Each slave needs to know which orientation to calculate (determined by the task number sent by the master)
• The slave builds the robot bitmap needed for that orientation and a z plane, and obtains its FFT
• It carries out the point-to-point product of this FFT with the transformed workspace that the master has provided
• The procedure is repeated a number of times determined by the highest point of the mobile robot, and the results are added (i.e., the accumulation at the space variable level)
• The inverse transform is applied to the accumulated result to obtain the corresponding portion of the C-space
• This intermediate result is, in turn, sent to the master, and the slave then waits for another task or an end-of-execution signal
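The sequence of steps a slave performs for one orientation can be sketched on a tiny grid. The sketch rests on several assumptions that are not in the paper: a naive 2D DFT stands in for a real FFT routine, the robot model in `reflected_robot_bitmap` is hypothetical, the bitmap is stored reflected about the reference point so that a plain convolution counts overlapping cells, and the convolution wraps around at the grid borders.

```python
import cmath

N = 4  # tiny illustrative resolution

def dft2(a, inverse=False):
    """Naive 2D DFT of an N x N grid (stand-in for an FFT library call)."""
    n = len(a)
    sign = 1 if inverse else -1
    w = [[cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n)]
         for j in range(n)]
    out = [[sum(a[y][x] * w[v][y] * w[u][x] for y in range(n) for x in range(n))
            for u in range(n)] for v in range(n)]
    if inverse:
        out = [[val / (n * n) for val in row] for row in out]
    return out

def reflected_robot_bitmap(theta_index, z):
    """Hypothetical robot: a two-cell bar at every height z, along x for
    even theta_index and along y otherwise, reflected about the origin."""
    offsets = [(0, 0), (1, 0)] if theta_index % 2 == 0 else [(0, 0), (0, 1)]
    grid = [[0] * N for _ in range(N)]
    for dx, dy in offsets:
        grid[(-dy) % N][(-dx) % N] = 1
    return grid

def compute_slice(theta_index, robot_height, transformed_workspace):
    """C-space slice for one orientation, accumulated over the z planes."""
    acc = [[0j] * N for _ in range(N)]
    for z in range(robot_height):
        robot_hat = dft2(reflected_robot_bitmap(theta_index, z))
        plane_hat = transformed_workspace[z]
        for v in range(N):
            for u in range(N):
                acc[v][u] += robot_hat[v][u] * plane_hat[v][u]  # point-to-point product
    result = dft2(acc, inverse=True)  # one inverse transform for the whole sum
    return [[round(val.real) for val in row] for row in result]

# Workspace with two z planes; one obstacle cell at x=2, y=1 in plane 0.
planes = [[[0] * N for _ in range(N)] for _ in range(2)]
planes[0][1][2] = 1
transformed_workspace = [dft2(p) for p in planes]  # done once by the master

slice_0 = compute_slice(0, robot_height=2, transformed_workspace=transformed_workspace)
slice_1 = compute_slice(1, robot_height=2, transformed_workspace=transformed_workspace)
```

Because the inverse transform is linear, the products can be summed in the frequency domain and transformed back once per orientation rather than once per plane.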
RESULTS
• Two different platforms were used to validate the implementation
- A Silicon Graphics Origin 200 computer with four MIPS R10000 processors and 768 Mbytes of memory
- Four workstations (PII 266 MHz with 64 Mbytes of memory) connected by a Fast Ethernet network
• MPI was used to develop both implementations
• The results are similar; however, more problems were encountered with the networked workstations because of communication difficulties.
DISCRETIZATIONS
In order to work with the FFT algorithm employed for the evaluation of the 2D Fourier transforms, three discretizations of the 3D workspace have been used:
- 64 x 64 x 64
- 128 x 128 x 128
- 256 x 256 x 256
Execution times (seconds) for 64 x 64 resolution

Robot height   Sequential   Parallel   Speedup
5              0.99         0.73       1.36
20             3.34         1.23       2.72
45             7.30         2.40       3.63
64             10.51        3.21       3.27
Execution times (seconds) for 128 x 128 resolution

Robot height   Sequential   Parallel   Speedup
10             14.30        5.22       2.74
40             52.81        15.21      3.47
64             85.05        23.45      3.63
128            164.06       44.34      3.70
Execution times (seconds) for 256 x 256 resolution

Robot height   Sequential   Parallel   Speedup
15             252.45       76.67      3.29
30             376.54       107.51     3.50
64             792.70       226.77     3.50
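The speedup column is the ratio of sequential to parallel execution time. The short sketch below recomputes it for the 128 x 128 measurements and also derives a parallel efficiency, which is not reported in the slides and assumes that all four processors of the test platform were used in the parallel runs.

```python
# (robot height, sequential seconds, parallel seconds) for 128 x 128 resolution
rows_128 = [
    (10, 14.30, 5.22),
    (40, 52.81, 15.21),
    (64, 85.05, 23.45),
    (128, 164.06, 44.34),
]
PROCESSORS = 4  # assumption: every parallel run used all four processors

def speedup(sequential, parallel):
    """Speedup = sequential time / parallel time."""
    return sequential / parallel

speedups = [round(speedup(seq, par), 2) for _, seq, par in rows_128]
efficiencies = [speedup(seq, par) / PROCESSORS for _, seq, par in rows_128]

for (height, _, _), s, e in zip(rows_128, speedups, efficiencies):
    print(f"height {height:3d}: speedup {s:.2f}, efficiency {e:.0%}")
```

The speedup grows with robot height because taller robots mean more z planes per task, so each slave does more computation per message exchanged with the master.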
CONCLUSION
• The limitations of the test environment prompted the development and implementation of a solution that optimizes at the configuration variable level
• Different experiments have been carried out in which the resulting speedups are very acceptable (up to 3.70 in the reported runs)
• The presented case study provides reasonable results that could be extrapolated to more complex robotic structures
• More complex structures can hopefully be handled along the lines of the parallel algorithm studied in this paper

Questions?