Image Based Collision Detection

Rafael Kuffner dos Anjos

Dissertation for obtaining the Master's Degree in Computer Science and Engineering (Engenharia Informática e de Computadores)

Jury
President: Prof. João Marques da Silva
Advisor: Prof. João Madeiras Pereira
Co-Advisor: Prof. João Fradinho Oliveira
Member: Prof. Fernando Birra

October 2011
Acknowledgements
I would like to thank my adviser, Professor João Madeiras Pereira, and my co-adviser, João Fradinho Oliveira, for all the advice and encouragement given to me during all stages of this thesis, and for helping me figure out the right path to follow while researching and developing this solution. I am also grateful to Artescan for providing real-world point cloud data for the evaluation of this work.
I also thank my university, Instituto Superior Técnico, which during these five years has put me in incredibly challenging situations and taught me how to overcome them, lighting up my passion for computer science and programming. I thank all the professors who put their hopes on us students and gave their best to help us learn everything we need to become competent professionals.
I also thank very deeply my family for their incredible support; they have always believed in me and in what I can do. I thank them for bearing with my late night shifts while I developed this work, and mainly for their emotional and spiritual support. Despite having no background in computer science, they are my development team.
I thank my girlfriend for her support and patience, hearing all my ramblings about computer science and collision detection throughout this year, and always keeping my spirits up and a smile on my face.
Above all, I must thank God for guiding my life up to this point, for bringing me over from Brazil to this amazing country, and for everything I have been able to accomplish so far. As written in the book of Philippians, chapter 4:13: "I can do all this through Him who gives me strength."
Rafael Kuffner dos Anjos
Resumo

Collision detection is clearly an important task in a virtual reality and simulation context, bearing great responsibility for the realism and efficiency of the produced application. This work explores an alternative approach to the collision detection problem by using images to represent complex 3D environments or massive 3D point clouds derived from laser scanning devices. Several images, each representing a section of the model in question, together produce a new image-based 3D representation with user-chosen precision. The collision detection task is performed efficiently by checking the contents of a group of pixels, without affecting the simulation's frame rate or occupying an excessive amount of memory. Our approach was tested with seven different scenarios, showing precise and scalable results that establish point clouds as a viable alternative to the classical polygon-based representation in certain applications.

Palavras-chave

Collision detection
Image-based
Point clouds
Interactive applications
Polygonal oversampling
Abstract
Collision detection is clearly an important task in virtual reality and simulation, bearing a great responsibility for the realism and efficiency of the produced application. This work explores an alternative approach to the problem, using images to represent complex environments and buildings created virtually, or massive point clouds derived from laser scan data. Several layers of images, each representing a section of the model, together create a new 3D image-based representation with user-chosen precision. Collision detection is performed efficiently by verifying the content of a set of pixels, without disturbing the frame rate of the simulation or occupying excessive amounts of memory. Our approach was tested with seven different scenarios, showing precise and highly scalable results that push point clouds as a viable alternative to classical polygonal representations in specific application domains.
Keywords
Collision Detection
Image-based
Point-Cloud
Interactive Applications
Polygonal Oversampling
Lisboa, October 2011
Rafael Kuffner dos Anjos
List of Figures

1.1 Highly detailed 3D model with 100 million polygons
1.2 Scene with 15,000 watermelons from Oblivion, a game by Obsidian, and point cloud data from a real-world laser scan
1.3 Illustration of how our representation represents a 3D object
2.1 Representation from Loscos et al. [27] where an agent checks the grid representation for moving possibilities
2.2 Late 80's low polygon count 3D model from the Lin-Canny paper
2.3 Example of a bounding volume hierarchy applied in a simple 2D scenario
2.4 BVH applied to the Stanford bunny in Zhang et al. [42]
2.5 Illustration of stochastic collision detection, where pairs of randomly selected features are tracked as the objects approach each other
2.6 LDI calculation
2.7 Example of a medium-size point cloud rendered with QSplat (259,975 points)
2.8 Point cloud with high point density
3.1 Height-map. Real terrain at left, output height-map at right
3.2 Pipeline of operations executed in the preprocessing stage in order to create the slices representation
3.3 Slices creation process, and camera variables
3.4 Polygonal model, vertices only, and generated point cloud
3.5 Oversampled triangle
3.6 Result of subsampling
3.7 Technique for surface orientation detection. Red points belong to a vertical structure, grey points to the floor
3.8 Three slices of an office environment, where walls and floor are clearly distinguished, as well as a section of a roof at the entrance
4.1 Models used as input for testing
4.2 Graph picturing average time values in seconds for the pre-processing stage of each model and configuration
4.3 Graph picturing memory cost in megabytes during the pre-processing stage of each model and configuration
4.4 Memory used by the complete application at a given moment during runtime
4.5 Sequence of the first four maps of the input model Columns
4.6 Low resolution scenario: small round object interaction and floor collision
List of Tables

2.1 Coarse granularity in the compromise between precision and efficiency
4.1 Features of the models used for evaluation
4.2 Average frame-rate during evaluation
4.3 Memory used for obstacle detection
4.4 Polygonal oversampling results
4.5 Comparison between point cloud techniques
List of Algorithms

3.1 Slices creation
3.2 Polygonal model oversampling
3.3 Points coloring and obstacle detection
3.4 Broad-phase and collision detection
Contents

1 Introduction
  1.1 Problem and motivation
  1.2 Contribution
  1.3 Outline

2 Related Work
  2.1 Collision avoidance
  2.2 Collision detection
    2.2.1 Feature based
    2.2.2 Bounding volumes hierarchies (BVH)
    2.2.3 Stochastic techniques
    2.2.4 Image based
  2.3 Point Clouds
  2.4 Comparison
  2.5 Summary

3 Concept and Implementation
  3.1 Representation
    3.1.1 Slices creation
    3.1.2 Polygonal model oversampling
    3.1.3 Information processing and encoding
  3.2 Collision detection
    3.2.1 Broad phase and collision detection
    3.2.2 Narrow phase and collision response
  3.3 Summary

4 Experimental Results
  4.1 Environment and settings
  4.2 Pre-processing time and memory usage
  4.3 Runtime
  4.4 Polygonal oversampling and obstacle detection
  4.5 Collision detection and precision
  4.6 Evaluation

5 Conclusion and Future work
  5.1 Future work

Bibliography
Chapter 1
Introduction
1.1 Problem and motivation
In rigid and deformable body simulation, virtual reality, games, and other types of applications where several 3D objects exist and interact, the proper execution of the collision detection task contributes hugely to realism. As objects move around and animate, their parts end up intersecting, and if these intersections are not detected and stopped, they create unnatural behaviours that are unacceptable in this domain of applications.
Collision detection is normally a bottleneck in the visualization and interaction process, as collisions need to be checked at each frame. Traditionally, the more complicated and crowded our scene is, the more calculations need to be done, bringing our frame rate down. Therefore the optimization of this process, gaining speed without losing quality in the simulation, is something that has been researched for years using different techniques and approaches.
A naive approach that geometrically tests every pair of polygons for intersection is not a scalable solution by any means, and will quickly overload the CPU with work. Various techniques using bounding volumes, distance fields, and other auxiliary data structures that group these polygons together or exclude useless tests have therefore been developed, so that less testing is needed at each frame.
Invariably, in more complex (Figure 1.1) or crowded (Figure 1.2) scenes, where we wish to keep some degree of realism, and hence use more or tighter bounding volumes, one again quickly reaches a performance bottleneck. Also, in scenarios that use a different object representation, such as a point cloud, we cannot rely on object topology information. The classical approaches either will not work, or will have to be heavily adapted to this specific scenario, tailoring their optimizations to a point cloud representation.
Even with simpler scenes, sometimes one might not have enough processing power available. Somehow we need to avoid the problem of overloading the CPU with collision detection tests, and reduce the number of times we need to look up data structures, so that the simulation is runnable even on a less capable machine, for example in a kiosk setting.
Figure 1.1: Highly detailed 3D model with 100 million polygons

Figure 1.2: Scene with 15,000 watermelons from Oblivion, a game by Obsidian, and point cloud data from a real-world laser scan.

Using images as an alternative way of executing the task of collision detection might just be the answer. With the recent advances in graphics cards, a lot of the processing can be done on the GPU, carefully enough not to compromise the rendering process. Also, when using images, we have a very easy way to choose how much precision we want in the testing, by choosing the resolution we will be using. With a high enough resolution, we can reach a completely realistic precision.
Another advantage of images, and probably the biggest one, is that most of the algorithms are independent of the object's topology. It does not matter whether we have a pointy object, a round one, or even a point cloud, as all we are dealing with is the object's image representation.
Being a scalable and promising technique, image-based collision detection seems to be a plausible alternative to the classical approaches.
Given that the area of collision detection is a broad field of research, and knowing that several techniques have been presented that excel in certain scenarios, it is important to establish the scenario we are focused on, and what we have aimed for with this solution.
In a virtual reality navigation scenario, we want a rich and detailed environment to explore. And when we are representing data coming from the real world, the high complexity of these environments is a given fact. Laser scans can present us with enormous point clouds that we might not be able to handle with the usual rendering techniques. Added to this is the fact that the available hardware might not meet the graphics performance requirements for the model or algorithm at hand, a situation that will commonly happen in tourism hotspots, museums, or other places where we want ordinary people to interact with the system. In this specific context, we need an application to handle one-to-one interactions between a pawn avatar (typical in these settings) and a complex environment represented by a single point cloud or 3D model. This is the main objective of our work.
1.2 Contribution
The main contribution of our research is a new, completely image-based 3D world representation for environments and buildings, containing the same information for collision queries as an object-based representation would. Slicing the structure along the Z axis (Figure 1.3), we create a discrete set of images containing height information about the surface and possible collidable frontiers. It is a flexible format, able to represent either point clouds or polygonal models.
We have also developed a simple algorithm that uses this output information to create a completely interactive navigation experience with the input model, where the user can explore the environment with a custom precision level in the collision detection performed. We demonstrate this without using any special graphics card features, although one might argue that these would improve the effectiveness of the process. The following outline lists the main contributions resulting from our research:
Figure 1.3: Illustration of how our representation represents a 3D object

• New 3D world representation: A set of 2.5D images representing slices of a physical structure, which together construct a 3D representation of the input model. Each 2D map contains precise information about surfaces and collidable frontiers; this precision is defined by the density of the points in the model and the desired resolution of the output images.
• Multi-tier solution for collision detection: In a first phase, a possible collision can be identified in constant time, leaving more precise testing to a second tier. This division allows us to freely adapt the application to the requirements at hand.
• Simple but precise collision response: A simple algorithm using our new 3D world representation has been developed to allow basic navigation in an input environment. Using only the simple broad-phase collision detection algorithm and 2D collision response, we have achieved a precise interaction that works in a 3D environment.
• An input-independent algorithm: Point clouds and polygonal models (triangle meshes, CAD, etc.) are processed in a similar way, making our representation effective in both cases. These are the main inputs we will receive in the described scenario.
• High scalability and efficiency: A collision detection technique whose cost is affected by input complexity only in a pre-processing stage, and that has close to no impact on the rendering cycle of the application, thus not representing a bottleneck in the visualization cycle.
1.3 Outline
This document thoroughly describes the research carried out and the developed work, with the following structure:
• Chapter 1: Introduction, problem and motivation, and resulting contributions.
• Chapter 2: Description of the related work on collision detection and avoidance, highlighting the most influential research in each field and comparing and analyzing their performance in our chosen scenario.
• Chapter 3: Concept and implementation details. We describe each step of the pre-processing stage where the slices representation is created, followed by both the broad-phase collision detection process and the narrow-phase and collision response operation.
• Chapter 4: Test results and performance evaluation of our work in several aspects. A critical analysis and comparison with other work in the community is also made.
• Chapter 5: Conclusion and overview of possible improvements to the developed work.
Chapter 2
Related Work
Given that the problem of computing objects' collisions is present not only in the areas of virtual reality, games and simulations, but also in different fields such as artificial intelligence and robotics, there is quite a vast amount of work, often optimized for the respective contexts, that can mislead our research in the wrong direction. Even when talking about collision detection in a single domain, such as simulations, two applications may still be hugely different regarding requirements and concerns. Attributes of the objects such as deformability, breakability or surface material bring up new algorithms and techniques to solve these problems in an efficient way. Therefore, we start by establishing the difference between collision avoidance and collision detection; then a short overview of relevant work on processing point clouds is given, followed by a classification and an overview of the most relevant work in the virtual reality community; and finally, a short conclusion fits our research into the given state of the art. For further information on collision detection and avoidance techniques we suggest the following surveys: [34] [9] [40].
2.1 Collision avoidance
The goal of collision avoidance is to predict an upcoming collision and use this information to prevent it from happening. It is normally used in crowd simulations, or in any other environment where the object avoiding collisions is an intelligent agent. This differs from collision detection in that we just need to identify the objects in the world and avoid intersecting them, never letting a collision happen rather than correcting it afterwards, as one does when walking in the real world. This can be achieved through intelligent world perception and representation.
A collision can be avoided in many different ways. In the BOIDS framework [36], simple bounding spheres are used to detect obstacles in the vicinity. In the work of Ondrej et al. [35], a more complex technique is presented that simulates human vision to detect obstacles in the agent's path. ClearPath [10] uses the concept of velocity obstacles, which takes into account the velocities of the other objects in the scene, and how the current object's velocity should adapt to avoid a collision. Finally, Loscos et al. [27] represent the environment as patches to which the agent can or cannot go according to their occupation.

Figure 2.1: Representation from Loscos et al. [27], where an agent checks the grid representation for moving possibilities. In the second subimage from the left it is capable of climbing a step; in the rightmost subimage it is forced to turn back.
After one of these techniques is applied, different methods are used to calculate the agent's detour around the obstacle. Some of these methods involve complex mathematical systems, while others are basic, such as stopping the agent for a while. This already belongs to a domain of robotics called path planning, unrelated to what is covered in our work.
2.2 Collision detection
Different lines of work have been developed in the area of collision detection. Each technique has developed a unique representation of the world and its objects, being applied differently and with certain advantages in each specific domain of applications. Simple games and environment navigation can work well with simple bounding volumes or distance fields, while cloth simulation and deformable bodies require tighter volumes, hierarchies, or combined efforts with image- or feature-based techniques. It is important to keep in mind that all these techniques have a field where they excel; therefore it is important to consider all of their capabilities when working in collision detection.
2.2.1 Feature based
These algorithms use the vertices, edges and faces (so-called features) of an object's boundary to perform geometric computations that determine whether a pair of polyhedra intersect, and possibly calculate the distance between them. The 1988 work of Moore and Wilhelms [31] describes the mathematical process of detecting the intersection between two polygons. Other famous examples are the Lin-Canny algorithm [24] and its more recent relative, V-Clip [29]. V-Clip keeps track of the closest features between two objects; in this way it derives both the separation distance and the vertices that may have already penetrated the other object, something that was not possible with the Lin-Canny algorithm. This technique, called Voronoi marching [29], has been applied in SWIFT [7], where a hierarchical representation of convex polyhedra that encloses the features is created, and the algorithm is applied to each level of the hierarchy. These algorithms have been used mostly in robotics and path planning, since they do not behave very well when objects penetrate each other. On the other hand, the mathematics behind simple polygon intersection is still sometimes used when detailed collision detection is needed. Given the high triangle counts of today's 3D models, this technique is never used alone; it may easily be associated with BVHs at the lower levels of the hierarchy.
Figure 2.2: Late 80's low polygon count 3D model from the Lin-Canny paper
2.2.2 Bounding volumes hierarchies (BVH)
By far the most popular class of collision detection algorithms, applied successfully in every research area, BVHs work by dividing an object into smaller primitives contained within it until a certain leaf criterion is reached (top-down approach), or by starting with the smallest possible primitives and grouping them up until a single volume is achieved (bottom-up), as illustrated in Figure 2.3. Each primitive is enveloped by a particular bounding volume shape, for example the sphere [17], which is the most popular approach, the oriented bounding box (OBB) [14], and the discrete oriented polytope (DOP), a convex object made of k planes that envelop the primitive. The axis-aligned bounding box (AABB) [22] [13] [42] is the most used version of the latter, being a k-DOP with k = 6. Each of these volumes has a particular advantage: spheres are easier to fit, OBBs have faster pruning capabilities, and AABBs are quicker to update, the latter therefore being very popular in deformable body simulation.
Figure 2.3: Example of a bounding volume hierarchy applied in a simple 2D scenario.
Figure 2.4: BVH applied to the Stanford bunny in Zhang et al. [42]
When checking for collisions between two objects, the hierarchy is traversed top-down, testing only elements whose parents have collided with another BV, avoiding useless calculations. The efficiency of this relies on the tightness of the bounding volumes and the topology of the hierarchy. Rigid body simulation requires BVHs that are fast to search and to calculate subsequent collisions with. In deformable body simulation, the hierarchy must be refitted or reconstructed within one simulation step after a deformation. Different tree traversal and creation techniques [22] [14] have been developed to optimize these expensive operations, taking into account each specific kind of application.
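To make the traversal concrete, the following Python sketch shows the classic recursive descent over two hierarchies of axis-aligned boxes. It is a minimal illustration of the scheme described above, not code from this thesis; the node layout and the leaf-level exact test are assumptions.

# Minimal BVH-vs-BVH traversal sketch; node layout and exact_test are assumed.
class Node:
    def __init__(self, lo, hi, children=(), primitives=()):
        self.lo, self.hi = lo, hi        # AABB corners, (x, y, z) tuples
        self.children = children         # non-empty for internal nodes
        self.primitives = primitives     # leaf payload, e.g. triangles

def aabb_overlap(a, b):
    # Boxes overlap iff their intervals overlap on every axis.
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))

def collide(a, b, exact_test, out):
    # Prune whole subtrees as soon as two bounding volumes separate.
    if not aabb_overlap(a, b):
        return
    if not a.children and not b.children:    # two leaves: run the exact test
        out.extend((p, q) for p in a.primitives
                   for q in b.primitives if exact_test(p, q))
    elif a.children:                          # otherwise descend one tree
        for c in a.children:
            collide(c, b, exact_test, out)
    else:
        for c in b.children:
            collide(a, c, exact_test, out)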
The simplicity and efficiency of these methods are the features that have made them so popular and widespread in the community. However, scalability problems still exist with BVHs. We do not always have a balanced and prunable tree, and when higher fidelity on more complex models is needed, the number of bounding volumes may end up increasing very fast, making all the previous operations very costly. Even with a model of simple genus, if we seek more fidelity in the collision detection, we need to heavily increase the number of bounding volumes to approach a realistic solution, as seen in Figure 2.4.
2.2.3 Stochastic techniques
Stochastic algorithms that try to give a faster but less exact answer have been developed, giving the developer the option to "buy" exactness in collisions with computing power. Uno and Slater [41] present a study regarding people's sensitivity to exact collision response, stating that humans cannot distinguish between physically-correct and physically-plausible behaviour. Relaxing the collision criteria may thus lead to faster and lighter algorithms.
The "average-case" approach is applied in BVHs by not actually checking for intersections between volumes in the hierarchy, but calculating the probability that they contain intersecting polygons. In BVHs based on spatial subdivision, this means trying to find a grid cell that contains several polygons from object A and object B, which then intersect with high probability. The technique based on randomly selected primitives selects random pairs of features that are likely to collide and calculates the distance between them. The local minima are kept for the next step, and the calculations are made once again. The exact collision pairs are derived with the Lin-Canny [24] feature-based algorithm.
With a similar idea, Kimmerle et al. [37] have applied BVHs with lazy hierarchy updates and stochastic techniques to deformable objects and cloth. After detecting a collision between the bounding volumes, pairs of features contained in these volumes are randomly selected, and temporal coherence between them is kept, updating each pair to the closest neighbour (Figure 2.5). When a proximity threshold is reached, a collision is declared and corrected.
Although a plausible result is achieved, it is impossible to reach a physically-correct or exact simulation with these techniques. In certain domains, such as VR medical surgery, such uncertainty is unacceptable, and when dealing with point clouds, penetrations are especially unnatural to see. Also, as these techniques rely largely on bounding volume hierarchies, they still share some of the disadvantages mentioned earlier, especially when dealing with unstructured data such as point clouds or polygon soups.
2.2.4 Image based
Several image-space techniques have been developed recently, as the advance of graphics cards has made their implementation viable in terms of processing power and memory allocation, and the programmable graphics units have made these techniques simpler to implement.
Figure 2.5: Illustration of stochastic collision detection, where pairs of randomly selected
features are tracked as the objects approach each other
These algorithms commonly work with projections of the objects, as opposed to the previous techniques, which work in object space. RECODE [5] transforms the problem of three-dimensional collision detection into a one-dimensional interval test along the Z axis. By using a mask in the stencil buffer based on one of the two objects' coordinates (Zmax or Zmin), a stencil test is made, and the result tells us whether the two objects overlap or not. This stencil test is also the basis of several other works that try to take advantage of the new graphics card capabilities [21] [32], and it has also been extended to achieve self-collision detection in deformable objects [4].
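The reduction from a 3D query to a one-dimensional interval test can be sketched in a few lines of Python. This only illustrates the principle; it is not RECODE's stencil-buffer implementation, and the per-pixel depth ranges are assumed to come from front and back renderings of each object.

# Per-pixel Z-interval overlap test (illustration of the principle only).
# z_a and z_b map a pixel to the (z_min, z_max) range an object covers there.
def projections_overlap(z_a, z_b):
    for pixel, (a_min, a_max) in z_a.items():
        if pixel in z_b:
            b_min, b_max = z_b[pixel]
            if a_min <= b_max and b_min <= a_max:   # 1D intervals intersect
                return True
    return False

# Hypothetical usage: two boxes sharing pixels and overlapping in depth.
z_a = {(x, y): (0.0, 1.0) for x in range(4) for y in range(4)}
z_b = {(x, y): (0.8, 1.8) for x in range(4) for y in range(4)}
print(projections_overlap(z_a, z_b))   # True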
Hoff et al. [8] treat 2D collisions by using bounding boxes to detect possible collisions and then creating a grid of that region, enabling proximity queries to be made using the graphics hardware to detect overlapping polygons, and hence allowing several queries to be done regarding the intersection.
CULLIDE [15] uses occlusion queries only to detect potentially colliding objects, and triangle intersection is then performed on the CPU. Collisions are detected based on a simple lemma that states: "An object O does not collide with a set of objects S if O is fully visible with respect to S." With it, they can quickly prune objects from the potentially colliding set (PCS). Rendering bounding volumes of the objects in normal and reverse list storage order, they remove the objects that are fully visible in both passes. This is applied iteratively until there is no change in the PCS. Boldt and Meyer [6] have extended this algorithm to contemplate self-collisions, exchanging the "less than" test in the depth buffer for a "less than or equal", thus guaranteeing that self-collisions are detected; unfortunately, however, this creates performance problems with closed objects, which clearly will never be fully visible in relation to themselves, and thus never leave the PCS.
The capability of image-space methods to cull and quickly avoid useless collision tests has also been recognized and applied in [19], where a closest-point query is implemented in the Z-buffer, aided by a convex hull that is created to envelop a target number of polygons.
Heidelberger et al. [16] use simple AABBs as bounding volumes for the objects in the scene. Potentially colliding objects are detected, and an LDI (Layered Depth Image [38]) of the intersection volume between the two objects is created, that is, a volumetric representation of an object along a chosen axis. At each rendering step, as a polygon is projected into the LDI, the size of the intersection volume is computed (Figure 2.6).
Figure 2.6: LDI calculation
Faure et al. [12] address not only collision detection but also its response. Using the same principle as [16], they also calculate the volume derivative related to each viewing axis (the gradient). Penalty forces are easy to derive from the calculated gradient, as one assumes these forces are trying to minimize the intersection volume. Applied to each of the object's features, this makes separation between intersecting objects possible, and, taking object topology into account, simulation of deformable objects can be achieved. In a more recent work [3], they expanded the functionality of the extended LDI, now also used to calculate friction forces in collision response.
As seen, the concept of Layered Depth Images [38] was important to the area of collision detection [12], even though its first purpose was image-based rendering. The Layered Depth Cube [26] and the LDI-tree [11] are newer concepts that have been derived from LDIs, and serve as an elegant and efficient world representation in certain domains. In a scene where the polygon count, or point count, is too high to be stored, an image is a much more compact representation.
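As a rough illustration of this compactness, an LDI can be thought of as a short sorted list of depths per pixel instead of the full geometry that produced them. The Python sketch below is a simplified data-structure view under that assumption, not the structure used in [38] or [12].

# Simplified Layered Depth Image: per pixel, the sorted depths at which
# surfaces cross the viewing ray (illustrative only).
import bisect
from collections import defaultdict

class SimpleLDI:
    def __init__(self):
        self.layers = defaultdict(list)      # (x, y) -> sorted depth list

    def insert(self, x, y, depth):
        bisect.insort(self.layers[(x, y)], depth)

    def solid_span(self, x, y):
        # Pairing consecutive entry/exit depths gives the solid thickness
        # along the ray, the quantity intersection-volume methods sum up.
        d = self.layers[(x, y)]
        return sum(d[i + 1] - d[i] for i in range(0, len(d) - 1, 2))

ldi = SimpleLDI()
for depth in (0.2, 0.5, 0.7, 0.9):           # two entry/exit pairs
    ldi.insert(3, 4, depth)
print(ldi.solid_span(3, 4))                   # 0.3 + 0.2 = 0.5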
2.3 Point Clouds
Point clouds are the result of 3D laser scans, normally used to create 3D CAD models of manufactured parts, for building inspection or visualization, and sometimes for animation. The information is not commonly used exactly as captured, but is normally preprocessed first; noise removal and surface reconstruction are some of the operations performed. Nowadays, 3D laser scans make it possible to create a model with hundreds of millions of polygons. As ordinary simplification and display algorithms are impractical for such massive data sets, rendering techniques for point clouds have been developed so we can discard polygon information and still have a realistic visualization.
QSplat [39] uses a multiresolution hierarchy based on bounding spheres to render high-complexity meshes that are transformed into point clouds, and has been applied in the museum kiosk scenario described in Chapter 1. Lars Linsen [25] has also worked on point cloud visualization without needing triangle information, in contrast with QSplat [39], which uses this information if available, otherwise estimating the normals by fitting a plane to the vertices in a small neighbourhood around each point. These techniques have made the point cloud representation more viable in a real interactive application, as they are already clearly lighter than the traditional ones.
Figure 2.7: Example of a medium-size point cloud rendered with QSplat (259,975 points)
Several surface reconstruction techniques have been developed. Some use the idea of the Marching Cubes algorithm [23] [33] and convert the point cloud into a volumetric distance field, thus reconstructing the implicit surface. Poisson Surface Reconstruction [18] can reconstruct surfaces with high precision and has been applied in 3D mesh processing applications such as MeshLab [28]. These reconstructions are made so that the models can be used as polygonal meshes. The result still differs from a hand-made CAD model, where objects are divided and coherently structured, making collision detection easier.
Regarding collision detection, algorithms using feature-based techniques, bounding volumes, and spatial subdivision have been developed. Similarly to the idea presented in SWIFT [7], Klein and Zachmann [20] create bounding volumes on groups of points so that collision detection can be applied normally. Figueiredo et al. [13] use spatial subdivision to group points in the same voxel, and BVHs to perform collision detection.
With point cloud data resulting from 3D laser scans, the tight fitting of the bounding boxes becomes less and less reliable due to the non-constant density of points, as stated by Figueiredo et al. [13]. When performing image-based collision detection, the precision of the detection is directly proportional to the characteristics and quality of the input, since we work with the visual representation of the point cloud and do not need to estimate the frontiers of the object.
Although image-based techniques present apparent advantages in this scenario, they have not yet been applied or studied in this domain.
2.4 Comparison
Having addressed these techniques and stated their advantages and disadvantages, we can see what a tight competition it is. There is no perfect technique that casts a huge shadow over the others; each has a different and innovative approach to the problem, and an edge over the others in certain situations. Table 2.1 gives a quick comparison of the referred techniques.
Table 2.1: Coarse granularity in the compromise between precision and efficiency.

Technique     | Advantages                                                                        | Disadvantages
Feature based | Simple implementation, precise                                                    | Not scalable
BVH           | Popular implementation, relatively scalable, reliable                             | Compromise between precision and efficiency hard to obtain; requires input data to be properly segmented
Stochastic    | Fast and adaptable                                                                | Not completely reliable; shares all of the BVHs' issues
Image-based   | Scalable and precise, topology independent, works with unstructured polygon soups | Requires specific graphics card capabilities
Feature-based algorithms are the pioneers of this area and used to work with low-polygon 3D models; there are situations where they perform very well, but they suffer with the polygon counts commonly found today. That is why some more recent approaches such as SWIFT [7] started to rely on hierarchical representations such as the ones used in BVHs. These, despite being the most used technique nowadays, are far from perfect. Scalability is achieved with better hierarchies and by loosening the bounding volumes. Such a compromise between efficiency and quality/realism is hard, yet achievable, even with high-complexity models. When dealing with point clouds, there are techniques [20] [13] that create bounding volumes around groups of points, similarly to what was done in feature-based algorithms. Although this is a viable answer, we still face the trade-off between efficiency and precision.
Stochastic techniques give us the capability of choosing the precision of our tests very easily, and have also been used with point clouds, since the basis of some of these works is point-wise comparison. But we cannot expect a completely correct simulation when running these algorithms, and the compromise between efficiency and quality/realism is harder to achieve in these situations.
With image-based techniques, we can perform collision detection with complex objects [12], since we are working in image space, and the number of objects can be quickly pruned as shown in CULLIDE [15], thus guaranteeing scalability. The precision of the testing can also be chosen by choosing the resolution of the images and, if needed, we can theoretically have as much precision as feature-based algorithms. By treating the objects in image space, we are not dependent on the type of object we are working with. Point clouds can be dealt with efficiently, since an increase in the number of points takes us closer to a closed surface, as shown in Figure 2.8 (even more evident when viewed at a distance), thus working to our advantage.
One drawback that could be stated is the graphics card capabilities that these techniques may require. The work described in the next chapter did not require any special feature to be implemented: all the calculations were done on the CPU, although some steps could be sped up if implemented on the graphics card. Practical results, and the concept itself of image-based collision detection, show that it is viable without those capabilities.
2.5 Summary
Even though research on collision detection has been advancing successfully throughout the last years, a technique that works perfectly well in every scenario has not yet been developed, pushing us towards developing different approaches that excel in each specific scenario. Being the most traditional approach, BVHs have been tried in several different scenarios, being effective in most of them, but are outshone by image-based techniques or stochastic approaches in specific situations, such as cloth simulation and deformable objects. Image-based collision detection has not yet been applied to certain scenarios, mainly because it is a recent field.
As shown in Section 2.3, the point cloud representation is a viable and light alternative regarding data structures and visualization, and most probably the optimal one for laser scan data output, thus being efficient in the museum kiosk scenario, or on computers with low computing power. Research on collision detection with point clouds is in its early stages, as both techniques mentioned [20] [13] apply the classic approach of BVHs. Image-based approaches can handle this case without requiring special adaptation for a given object representation, thus being our technique of choice.
While our approach may not achieve the levels of realism that some of the outstanding work presented here has, we have tailored the developed solution to excel in the described scenario: a light and highly scalable image-based solution for environments and structures represented as point clouds or classical polygonal models, which may also be applied in different possible scenarios, to be described in Chapter 5.
Figure 2.8: Point cloud with high point density
Chapter 3
Concept and Implementation
This chapter presents the developed work in full detail. We present an innovative image-based object representation in Section 3.1 and describe the pre-processing step where it is created. Section 3.2 describes how we apply this representation to the collision detection task, in a broad and in a narrow phase.
3.1 Representation
Image-based algorithms that have been presented in the community ([15] [5] [21] [6] [3] [12]) perform very well in various kinds of scenarios, but some features of our chosen scenario (described in Section 1.1) make them hard or impossible to apply (e.g., our data is unstructured, and not all objects are closed or convex).
Besides working with traditional models, we also receive point clouds as inputs. These normally cover a large set of objects and do not provide us with any object structure. This makes it hard or even impossible to fit traditional bounding boxes around each individual object, as done in the work of Faure et al. [3] [12]. Also, when dealing with navigation, one may go completely inside buildings and structures. This situation, associated with the lack of segmentation of the scene, makes it impossible for some of the approaches to work ([5] [16] [15] [21] [6]), since they rely on the polygons of closed objects that define their boundaries, telling them whether there is a collision or not.
In less complicated scenarios, images have been used for collision avoidance. The work of Loscos et al. [27], for example, uses a 2.5D map of a city in which painted pixels show which positions are occupied by walls or other solid objects. There is also height information associated with the objects, to distinguish whether the agent can climb that obstacle, such as a small step, or not. This simple representation, a 2D tone map, can be used for collision detection. Similar techniques were already applied in the 2D era of gaming, where collisions were calculated pixel-wise.
This idea of a 2.5D map has been extended to a height-map in the terrain visualization context, as shown in Figure 3.1, where a color code is used to represent the height of a certain point on the map. This representation holds more information than the maps of Loscos et al. [27], but cannot be used for collision avoidance or detection, as it does not contain information about obstacles; just the ground level is represented. Neither height-maps nor 2.5D maps support a two-story scenario, as they only hold one z coordinate for each (x, y) pair.

Figure 3.1: Height-map. Real terrain at left, output height-map at right
Our representation combines these two important features and overcomes the limitation of only supporting single-floor environments. Instead of having just one height map, we create a series of maps along intervals of size σ on the z axis, thus enabling the storage of more than a single z value for each (x, y) pair. Using the color of each pixel as the representation of a voxel, we write height information in the red channel and identify obstacles in the blue channel. With these variables, we can determine not only the height at which the avatar should be standing, but also whether it is colliding with any obstacle at several different heights.
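To illustrate how a position is looked up in this representation, the Python sketch below maps a world-space point to a slice index and pixel and decodes the two channels. It is a minimal illustration under assumed names (slice thickness sigma, pixel size t, model minima), not the thesis implementation.

# Query the slices representation at a world position (names assumed).
def query(slices, pos, x_min, y_min, z_min, t, sigma):
    x, y, z = pos
    s = int((z - z_min) // sigma)             # which slice covers this height
    px = int((x - x_min) / t)                 # pixel column in the map
    py = int((y - y_min) / t)                 # pixel row in the map
    red, green, blue = slices[s][px][py]      # channel values in [0, 1]
    height = z_min + s * sigma + red * sigma  # red channel decodes to height
    is_obstacle = (blue == 1.0)               # blue channel flags vertical surfaces
    return height, is_obstacle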
3.1.1 Slices creation
The creation of this representation is executed in a pre-processing stage, divided into several steps (Figure 3.2) that must be performed, from the input of the model to the actual rendering, to create the snapshots that will be used as collision maps. As stated in Section 1.2, we always try to follow the implementation choice that does not require graphics cards or specific API features, so as to test the robustness of the approach.
Algorithm 3.1 describes the process of creating these slices. It sets up the camera according to the previously calculated bounding box of the input model, using an orthogonal projection. After each rendering of that projection, a snapshot covering a depth of σ is created and saved to disk for further use. The camera is then moved along the z axis, and the process is repeated until the whole extension of the model has been rendered into images. A visual representation of the variables mentioned and the slices made on an example model can be seen in Figure 3.3.
Creating these slices is the heaviest operation of the whole process, since writing full images to disk is hard to optimize and is dependent on the bus speed. Compressing the images is an alternative considered in Section 5.1.

Figure 3.2: Pipeline of operations executed in the preprocessing stage in order to create the slices representation

Algorithm 3.1 Slices creation
for z ← zmin to zmax step σ do
  zcam ← z + σ
  nearcam ← 0
  farcam ← σ
  leftcam ← max(ymax, xmax)
  rightcam ← min(ymin, xmin)
  bottomcam ← max(ymax, xmax)
  topcam ← min(ymin, xmin)
  render and snapshot
end for
However, just rendering the input model as it is does not produce the representation we require for our collision tests. We still need to calculate the height of each point that is going to be rendered, and identify every possible obstacle for a given pixel. This information is coded in the color of each point.
3.1.2 Polygonal model oversampling
Although we aim for a solution that accepts both polygonal models and point clouds, it is not possible to treat them as if they were one and the same. We focused on an approach that is easy to apply to point clouds, as they can be considered a simpler representation of a polygonal model. The process of reconstructing a point cloud into a polygonal model, as described in some works ([18] [23] [33]), is much heavier and more imprecise than the reverse, which is as simple as discarding the triangles and rendering only the vertices of the input model.
Figure 3.3: Slices creation process, and camera variables
There are two reasons why we cannot apply our solution directly to polygonal models. The first is the fact that any triangle that is exactly perpendicular to the camera would have no representation in the rendered output, thus not allowing us to print collision information about it on our map. The second is that we must choose the color each pixel will have according to the points that are projected onto it. In a polygonal model we can have a surface that covers several pixels while having just three points. In this situation, our processing would have to be shifted to a pixel shader, as we would have only three vertices being processed and several pixels being affected. This would go against our premise of not using special hardware features, and even with pixel shaders, the perpendicular triangles would still be a problem, as they would not trigger our shader, given that they do not occupy any pixel in the resulting rendering.

Figure 3.4: Polygonal model, vertices only, and generated point cloud
If our model has a low polygon count, though, we cannot simply discard the triangles, as our point cloud would be very sparse and not a truthful representation of the structure itself. A simple oversampling operation at the triangle level can create a suitable point cloud with a user-chosen level of precision. Figure 3.4 shows an average-polygon-count model that describes this situation exactly: after discarding the polygons, we do not get a faithful representation of the original shapes, but after producing a point cloud through oversampling, the shape is exactly that of the polygonal representation.
Algorithm 3.2 Polygonal model oversampling
for all triangles △abc in m do
  if dist(a, b) or dist(a, c) or dist(b, c) > t then
    n1 ← dist(a, b)/t, n2 ← dist(a, c)/t, n3 ← dist(b, c)/t
    i ← 1
    while i < n1 or i < n2 do
      ab_i ← a + i/n1 * (b − a)
      ac_i ← a + i/n2 * (c − a)
      n_i ← dist(ab_i, ac_i)/t
      j ← 1
      while j < n_i do
        abac_j ← ab_i + j/n_i * (ac_i − ab_i)
        j ← j + 1
      end while
      i ← i + 1
    end while
    if n1 > n2 then
      nx ← n1, x ← b, y ← c
    else
      nx ← n2, x ← c, y ← b
    end if
    k ← 1
    while i < nx or k < n3 do
      ax_i ← a + i/nx * (x − a)
      xy_k ← x + k/n3 * (y − x)
      n_k ← dist(ax_i, xy_k)/t
      j ← 1
      while j < n_k do
        axxy_j ← ax_i + j/n_k * (xy_k − ax_i)
        j ← j + 1
      end while
      i ← i + 1
      k ← k + 1
    end while
  end if
end for
Algorithm 3.2 describes this operation in detail, each iteration producing a triangle like the one in Figure 3.5. The key variable that controls the precision of the output point cloud is the threshold t, the minimum distance desired between two points. A perfect output would have one point fitting each pixel of our map. As we know the user-chosen resolution (w, h) of the output image, we can calculate the perfect threshold by a simple coordinate change from object to image space.
max(sizex, sizez) ⇐⇒ max(w, h)    (3.1)

t ⇐⇒ 1    (3.2)

t = max(sizex, sizez) / max(w, h)    (3.3)
Variables sizex and sizez represent the calculated bounding box size of the whole input on the x and z axes, respectively. We find through a simple proportion (rule of three) the size of a pixel in object space, and that is our desired threshold t.
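As a worked example with hypothetical numbers: for an input whose bounding box spans 50 units along its largest horizontal axis, rendered to 1000 x 1000 pixel maps, Equation 3.3 gives t = 50/1000 = 0.05 units, so each pixel corresponds to 0.05 units in object space, and that is the spacing the oversampling aims for.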
For each triangle △abc we check whether the distance between any pair of its vertices is larger than t. If none is, we are sure that any point inside the triangle is already closer than t to an edge; otherwise, we apply the oversampling operation to it. Variables n1, n2 and n3 represent the number of points to be created between two given points (x, y) in order to fit the threshold, and are given by dist(x, y)/t.
Starting at vertex a, we create points along the edges ab and ac; these are represented in yellow in Figure 3.5. Each pair of points (ab_i, ac_i) created is then considered a new edge, and n_i points are created along it, represented in blue in Figure 3.5.

Figure 3.5: Oversampled triangle

When one of the edges is completed, we start creating points along bc. The oversampled points created along this edge are interpolated with the points still being created on the uncompleted edge, ab or ac, thus filling the surface of the triangle.
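An equivalent way to obtain the same effect is to sample each triangle at a spacing of t by interpolating between its edges. The Python sketch below is a compact reformulation of the idea, not the thesis implementation; the vector helpers are ours.

import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def lerp(p, q, f):
    # Linear interpolation between points p and q, f in [0, 1].
    return tuple(a + f * (b - a) for a, b in zip(p, q))

def oversample(a, b, c, t):
    # Emit points covering triangle abc so neighbours are roughly t apart.
    points = []
    n1 = max(1, math.ceil(dist(a, b) / t))
    n2 = max(1, math.ceil(dist(a, c) / t))
    for i in range(max(n1, n2) + 1):
        ab = lerp(a, b, min(i, n1) / n1)      # walk edge ab
        ac = lerp(a, c, min(i, n2) / n2)      # walk edge ac
        ni = max(1, math.ceil(dist(ab, ac) / t))
        for j in range(ni + 1):
            points.append(lerp(ab, ac, j / ni))  # fill the interior row
    return points

# Hypothetical usage: a unit right triangle sampled at t = 0.25.
pts = oversample((0, 0, 0), (1, 0, 0), (0, 1, 0), 0.25)
print(len(pts))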
The whole process, applied to the input model, creates a point cloud with precision defined by t. Figure 3.6 shows the result on a cathedral environment, where walls, windows and columns keep the exact shapes of the original model while having a very high sampling rate.

Figure 3.6: Result of subsampling
3.1.3 Information processing and encoding
At this stage we can assume that all our models are, or have been converted into, a point cloud representation, so we can now define how the information from the input model is encoded into the slices. Each pixel in the output image represents a voxel of size (t, t, σ) in object space, where we paint a color representing the points contained in that area. Algorithm 3.3 performs both the height map information encoding and the obstacle detection.
The color of each point is assigned in this function. These colors do not replace the actual colors of the vertices, but are calculated temporarily while constructing the collision maps. The first operation is executed as follows: we calculate the difference between the current point's z coordinate and the model's lower bounding box zmin, and apply the modulo operator with σ. The remainder r represents the point's z coordinate within an interval [0, σ].
To be used as a color value it must lie in the interval [0, 1], so we calculate r/σ, finally deriving the red channel value. The simplified formula is given by:
red ← (abs(z − zmin) mod σ) / σ    (3.4)
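A direct transcription of Equation 3.4 in Python, with a hypothetical example (the values of zmin and σ are assumed):

# Encode a point's height within its slice as a [0, 1] red value (Eq. 3.4).
def height_to_red(z, z_min, sigma):
    return (abs(z - z_min) % sigma) / sigma

# Example: z_min = 0 and sigma = 3.0; a point at z = 7.5 lies 1.5 units
# into its slice and encodes as red = 1.5 / 3.0 = 0.5.
print(height_to_red(7.5, 0.0, 3.0))   # 0.5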
As navigation in a real-world scenario is normally performed horizontally on the xy plane, we classify an obstacle as a surface that is close to perpendicular to xy, i.e., parallel to zy or zx. This is a simple assumption clearly present in the real world, and it reduces our obstacle detection algorithm to detecting possible vertical surfaces in a point cloud.

Algorithm 3.3 Points coloring and obstacle detection
for all points p in m do
  s ← floor(abs(p.z − zmin)/σ)
  red ← (abs(p.z − zmin) mod σ)/σ
  redold ← cube[xscreen][yscreen][s]
  if abs(redold − red) * σ > ε * σ then
    cube[xscreen][yscreen][s] ← 1
    p.color(red, 1, 1)
    if redold > red then
      p.z ← p.z + (redold − red) * σ + ε2
    end if
  else
    cube[xscreen][yscreen][s] ← red
    p.color(red, 1, 0.1)
  end if
end for
Although techniques for surface normal estimation have been developed [30], we choose a simpler approach, since we do not require a precise value for the normal, but just an estimate of how parallel to the z axis the surface is. Figure 3.7 briefly describes this estimation as performed in Algorithm 3.3: points lined up vertically on the same pixel are most likely to belong to a vertical surface.
To guarantee that on a given slice we have at least two points belonging to an obstacle lined up to be projected onto it, we define the slice size σ as 3t. Since in oversampled point clouds we have a distance of t between every point, a slice of size 3t will certainly have at least two points lined up to be projected onto it. For natural point clouds, we define t as the minimum distance between two points. In these scenarios, the distance between points is normally constant, except in places where we have scanning flaws, so the minimum distance covers the well-scanned surfaces of the point cloud.
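Continuing the hypothetical numbers from the example in Section 3.1.2: with t = 0.05 units, the slice size becomes σ = 3t = 0.15 units, so a vertical surface sampled at spacing t contributes at least two vertically aligned points to every slice it crosses.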
Each position in the 3D array cube[w][h][σ] represents a voxel in object space, or a pixel in the output image. This data structure is needed to store, for each voxel, which points will be projected onto a certain pixel, thus enabling us to perform obstacle detection. The values written to this 3D array are not used for map creation, since they only store a single float value representing the maximum height of a point projected at this position. Collision information is coded in each point's red and blue channels.
At each step, after calculating the red value of the given point, we store this value at a position of the array. If there is already a stored value in this voxel, the difference between both red values is calculated and transformed into an object-space distance: abs(redold − red) * σ.

Figure 3.7: Technique for surface orientation detection. Red points belong to a vertical structure, grey points to the floor.

If this difference is bigger than a certain small percentage ε of the slice size σ, we assume that the points are vertically aligned, belonging to a vertical surface. These points are marked in their blue channel with the value 1, as we should not disturb the red channel, which is used for height information. If an obstacle is not detected, we use a default value of 0.1 for the blue channel.
If redold is higher than the new value, we must increase the z coordinate of this new point by adding the difference in height from the last point, (redold − red) * σ, plus a small correction value ε2, so that it is not occluded by the prior point at rendering time.
This concludes the pipeline of operations needed before executing Algorithm 3.1, which creates the slices on disk; this is the whole preprocessing stage of our work. Some of the output slices from the pre-processing stage can be observed in Figure 3.8, an office environment, where the floor has been correctly painted green, and all the walls white or light blue.

Figure 3.8: Three slices of an office environment, where walls and floor are clearly distinguished, as well as a section of a roof at the entrance.
3.2 Collision detection
During the last decade, the field of collision detection has evolved beyond the task of preventing objects from overlapping. Deformable objects have been addressed in several works ([16] [4] [22] [42]), as well as self-intersections ([4] [6]), and even the effect of friction [12] has already been implemented in a virtual scenario. But considering the context given in Section 1.1, where we consider point clouds as a viable input, the challenge of preventing two objects from overlapping is renewed.
The developed representation provides us with enough information to perform quick collision detection in this environment. While aspects concerning the realism of the collision response have not been explored, precision in simple rigid-body collision detection has been achieved.
We divide the task of collision detection into two steps: a first step, which we call the broad phase, where we verify the occurrence of collisions between any objects in the scene, and a second step, called the narrow phase, where we perform collision response.
3.2.1 Broad phase and collision detection
This task consists of identifying possible collisions between all objects in the scene. By representing the avatar that navigates the environment by an Axis-Aligned Bounding Box (AABB), we first calculate its size in pixels as pixx ← sizex / t and pixy ← sizey / t, where the threshold t was calculated as the pixel size. This is the number of pixels checked for collision in each slice, around the center of the pawn. The range of slices actually needed for collision detection with the pawn is given by:

slice0 ← (zpawnmin + zpawn − zmin) / σ (3.5)
slicen ← (zpawnmax + zpawn − zmin) / σ (3.6)
These are the only images we need to load into memory at the same time in order to perform collision detection. Since the transition between different heights while exploring a model is normally smooth, we do not have sudden jumps or changes in height. Each time a new map is needed and is not in RAM, we load it from the hard disk.
We only load new slices into memory until a user-defined constant nslices is reached. Beyond this point, a new slice replaces the already loaded slice whose z value is furthest from the avatar's own z value, meaning it is not needed at this point of the execution. While we want to minimize memory usage, we also want to load the slices from disk as few times as possible.
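A C sketch of this replacement policy follows; the Slice type and the two I/O helpers are hypothetical stand-ins for the actual image loading code.

    #include <math.h>

    typedef struct {
        float z_base;   /* base z coordinate of the slice */
        int   loaded;   /* 1 when its pixels are resident in RAM */
        /* pixel data ... */
    } Slice;

    extern void slice_read_from_disk(Slice *s);  /* assumed I/O routines */
    extern void slice_free_pixels(Slice *s);

    /* Make slice 'wanted' resident, evicting (when the budget n_slices
     * is full) the loaded slice whose z is furthest from the avatar's z. */
    void ensure_loaded(Slice *slices, int total, int n_slices,
                       int wanted, float avatar_z, int *n_loaded)
    {
        if (slices[wanted].loaded)
            return;
        if (*n_loaded >= n_slices) {
            int victim = -1;
            float worst = -1.0f;
            for (int i = 0; i < total; i++) {
                float d = fabsf(slices[i].z_base - avatar_z);
                if (slices[i].loaded && d > worst) {
                    worst = d;
                    victim = i;
                }
            }
            if (victim >= 0) {
                slice_free_pixels(&slices[victim]);
                slices[victim].loaded = 0;
                (*n_loaded)--;
            }
        }
        slice_read_from_disk(&slices[wanted]);
        slices[wanted].loaded = 1;
        (*n_loaded)++;
    }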
To detect possible collisions, we check, from slice0 to slicen, the pixels representing the bounding box of the avatar at its current position in image space (xview, yview). If any checked pixel is not black, we mark the object as colliding, and it will be processed in the narrow phase. The process is explained in detail in Algorithm 3.4.
Algorithm 3.4 Broad-phase and collision detection
pixx ← sizex / t, pixy ← sizey / t
slice0 ← (zpawnmin + zpawn − zmin) / σ, slicen ← (zpawnmax + zpawn − zmin) / σ
for all slice s in (slice0, ..., slicen) do
    if s not loaded then
        load(s)
    end if
    for i ← xview − pixx/2 to xview + pixx/2 do
        for j ← yview − pixy/2 to yview + pixy/2 do
            if slice[s][i][j].color not black then
                return TRUE
            end if
        end for
    end for
end for
if nloaded > nslices then
    unload()
end if
return FALSE
This implementation does not require any graphics card features, as all tests are done on the CPU. The worst-case complexity of Algorithm 3.4 is O(pixx ∗ pixy ∗ s), depending on the resolution of the image, the size of the avatar, and the density of the point cloud. Alternative and arguably faster techniques that can be implemented in hardware are discussed in Section 5.1.
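For reference, Algorithm 3.4 translates almost directly into C. The sketch below assumes hypothetical helpers for slice residency and pixel access, since the image storage layout is not fixed here.

    #include <stdbool.h>

    extern bool slice_is_loaded(int s);
    extern void load_slice(int s);
    extern bool pixel_not_black(int s, int i, int j);

    bool broad_phase(float size_x, float size_y, float t,
                     float z_pawn_min, float z_pawn_max, float z_pawn,
                     float z_min, float sigma, int x_view, int y_view)
    {
        int pix_x = (int)(size_x / t);  /* avatar extent in pixels */
        int pix_y = (int)(size_y / t);
        int s0 = (int)((z_pawn_min + z_pawn - z_min) / sigma);  /* Eq. 3.5 */
        int sn = (int)((z_pawn_max + z_pawn - z_min) / sigma);  /* Eq. 3.6 */

        for (int s = s0; s <= sn; s++) {
            if (!slice_is_loaded(s))
                load_slice(s);
            for (int i = x_view - pix_x / 2; i <= x_view + pix_x / 2; i++)
                for (int j = y_view - pix_y / 2; j <= y_view + pix_y / 2; j++)
                    if (pixel_not_black(s, i, j))
                        return true;  /* possible collision found */
        }
        return false;
    }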
3.2.2 Narrow phase and collision response
As addressed earlier, our objective in this context is simply to prevent objects from overlapping and to provide a basic navigation experience in the given environment. This can be achieved with a simple extension of our broad-phase algorithm, applying the concepts of collision response from height maps and of collision avoidance [27]. Instead of returning true when we find pixels that are not black, we gather information for collision response each time we find colored pixels. The precision and quality of the process are directly determined by the resolution of the created maps.
Obstacle collision
Since the avatar might be moved in the (x, y) plane in response to colliding with an object, we solve collisions with obstacles first, and only then choose the height value for the avatar. Similarly to the work of Loscos et al. [27], the avatar moves in fixed-length steps; each time it collides, we move it back to the position it occupied at the previous check, which we always assume to be valid.
The size of this fixed-length step is divided by two so the pawn can move a little closer to the obstacle, and it is reset once no collision is detected. Pixels with the blue channel set to 1 always represent an obstacle, except in applications where we want to enable the avatar to climb small obstacles, like the agents of Loscos et al. [27]. In these situations, we may ignore these pixels up to the height we want to consider climbable.
We apply this (x, y) correction each time an obstacle pixel is found in the process described in Algorithm 3.4, until all the pixels representing the avatar's bounding box have been verified.
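A small C sketch of this step-halving scheme follows; the Pawn type and the broad_phase_at() query are illustrative stand-ins for the structures of the actual application.

    typedef struct { float x, y, step; } Pawn;

    extern int broad_phase_at(float x, float y);  /* 1 if that position collides */

    /* Try to advance the pawn one step along (dir_x, dir_y); on collision,
     * stay at the last valid position and halve the step so the pawn can
     * creep closer to the obstacle. Reset the step once movement succeeds. */
    void move_pawn(Pawn *p, float dir_x, float dir_y, float full_step)
    {
        float new_x = p->x + dir_x * p->step;
        float new_y = p->y + dir_y * p->step;

        if (broad_phase_at(new_x, new_y)) {
            p->step *= 0.5f;          /* approach the obstacle more finely */
        } else {
            p->x = new_x;             /* current position becomes the */
            p->y = new_y;             /* last known-valid one */
            p->step = full_step;
        }
    }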
Floor interaction
Height is defined exactly as in height maps. Multiplying the height information coded in the red channel by σ and adding the base z coordinate of the given slice yields precise information about the given point's height. Collision response can be performed by setting the final height to the average height of the points at the base of the bounding box, or to their maximum value. Here too, we check surface height values from the first slice up to the height we want to consider climbable.
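As an illustration, the following C sketch decodes the floor height under the avatar by averaging the decoded values over the base pixels of the bounding box; red_at() is a hypothetical accessor returning a negative value for black pixels, a convention of our own.

    extern float red_at(int s, int i, int j);  /* red channel, < 0 if black */

    /* Average object-space height over the pix_x * pix_y base pixels of
     * the avatar's bounding box in slice s: height = z_base + red * sigma. */
    float floor_height(int s, float z_base, float sigma,
                       int x_view, int y_view, int pix_x, int pix_y)
    {
        float sum = 0.0f;
        int count = 0;
        for (int i = x_view - pix_x / 2; i <= x_view + pix_x / 2; i++)
            for (int j = y_view - pix_y / 2; j <= y_view + pix_y / 2; j++) {
                float red = red_at(s, i, j);
                if (red >= 0.0f) {              /* pixel holds a surface */
                    sum += z_base + red * sigma;
                    count++;
                }
            }
        return count > 0 ? sum / count : z_base; /* the maximum could be used instead */
    }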
The complexity of this operation is exactly O(pixx ∗ pixy ∗ s), without adding any extra operations to the broad-phase check. Faster implementations and considerations about multi-avatar scenarios are discussed in Section 5.1.
3.3 Summary
This chapter has fully covered the process of image-based collision detection on point clouds, and on polygonal models oversampled into point clouds, using our new multi-layered approach. By assigning colors to each point at the pre-processing stage, and identifying obstacles by estimating how close their normals are to the z axis, we write information about the whole input structure into several 2D projections of 3D volumes of size σ. The number of images created, the precision of the collision detection process, and the density of the point cloud are all values determined by the user when the resolution of the images is chosen. This provides a powerful ability to adapt to different scenarios and machines.
Chapter 4
Experimental Results
4.1 Environment and settings
We implemented the whole algorithm using OpenGL 2.1.2, C, and the OpenGL Utility Toolkit (GLUT) to handle user input and to manage the base application loop. OpenGL is a standard across different computing architectures and operating systems, and our solution aims to be efficient regardless of the environment, making OpenGL a natural choice.
The platform used for testing is a computer with an Intel Core 2 Duo CPU at 2 GHz, 2 GB of RAM, and an NVIDIA GeForce 9400 adapter, running Microsoft Windows 7 x86. Seven models were used for testing, each representing a different scenario and style of modeling. None of them was tailored in any specific way to our application, as most were downloaded from free 3D model repositories on the web [2] [1]. A more detailed overview of them can be seen in Table 4.1.
Although some models, such as Office (Figure 4.1a), have low complexity, the polygonal oversampling process creates point clouds even denser than our real-world inputs, Room (Figure 4.1f) and Entrance (Figure 4.1g). Results of this oversampling are discussed in Section 4.4, together with obstacle detection, and presented in Table 4.4.
Section 4.2 discusses the pre-processing stage speed and memory usage for different models, resolutions, and point coloring options. These give us an overview of the scalability of the representation, and of how these different settings affect performance. Section 4.5 evaluates the precision of collision detection in different scenarios, using a simple rectangular pawn, representing the bounding box of a given avatar, that is controlled by the user and navigates the environment along an established path.
Section 4.6 brings all the results together and makes a critical evaluation of the developed work, regarding its applicability in the described scenario and other considered possibilities, and how it performs compared to the other techniques described in Chapter 2 regarding the tested aspects.
Figure 4.1: Models used as input for testing: (a) Office, (b) Church, (c) Sibenik, (d) Columns, (e) Streets, (f) Room, (g) Entrance.
Table 4.1: Features of the models used for evaluation

Model    | Type          | Complexity  | Details
Office   | Polygonal     | 17.353 pts  | Office environment with cubicles and hallways
Church   | Polygonal     | 26.721 pts  | Simple church with stairs and columns
Sibenik  | Polygonal     | 47.658 pts  | Cathedral of St. James in Šibenik, Croatia
Columns  | Polygonal     | 143.591 pts | Big environment with localized complexity
Room     | 3D laser scan | 271.731 pts | 3D scan of a room with chairs, a table, and other objects
Street   | Polygonal     | 281.169 pts | Outdoor street environment with an irregular floor, houses, and several objects
Entrance | 3D laser scan | 580.062 pts | Entrance of the Batalha monastery in Portugal
4.2 Pre-processing time and Memory usage
Creating the 3D representation during the pre-processing stage is the most CPU-intensive task in the pipeline: after the representations are created, the application tends to remain stable, with a regular frame rate and memory usage, only occasionally disturbed by the loading of new slices. Four rounds of tests were run for each model, using two different image resolutions (700x700 and 350x350), once performing obstacle detection (signaled by a + in Figures 4.2 and 4.3) and once without it (signaled by a −). This provides enough information about the key variables that determine the efficiency of this stage: resolution, obstacle detection, and polygonal oversampling.
Figure 4.2 shows the time taken by the whole preprocessing stage for each model and configuration. The clock starts as OpenGL is being set up, and stops when every image has been written to disk. Analysing the graph, we conclude that the most CPU-demanding task in the pipeline is polygonal oversampling, since the two other tested inputs, the point clouds, were the fastest to process, despite having a higher point complexity than the original polygonal models. More detail on this task is given in Section 4.4.
The increase in processing time with point clouds is linear in point complexity: the time doubled between Room and Entrance, which have close to a 2:1 ratio in point count. This linear growth is expected, since each point must be checked for coloring once, and also undergoes common input processing such as input file reading and display list creation.
Figure 4.2: Graph picturing average time values in seconds for the pre-processing stage
of each model and configuration.
There is also an expected extra time cost for obstacle calculation, mainly due to the memory allocation and deallocation this task requires, since the number of extra operations it performs compared to simple point coloring is constant. Memory allocation is a heavy system call to which we cannot assign an exact cost, since it depends on the state of the memory at the given time. As Table 4.3 shows, the amount of extra memory needed at a higher resolution is larger, explaining the greater impact this operation has in those situations.
Regarding overall memory cost, comparing Figure 4.3 with Table 4.4, we find that memory scales with the size of the point cloud. The input model Office, despite being the simplest, produces the most complex point cloud, with over 9 million points after oversampling, and consumes the highest amount of resources at this stage. This is a factor generated not by our algorithm but by the complexity of the produced output itself, which needs to be stored in memory for further calculations. The memory occupied specifically by our algorithm is mostly for obstacle calculation; in situations where this task is not required, the resources are mostly consumed by simple object storage.
4.3 Runtime
During the application runtime, memory consumption varies with the number of slices loaded into RAM, but by controlling nslices we can keep this value below the amount of memory we wish the application to consume. At a 700x700 resolution, the minimum value found was 81,92 MB and the maximum 143,36 MB, while at 350x350 the values were between 61,44 MB and 122,88 MB. The complexity of the model has a much
lighter impact here, being only noticeable in tasks unrelated to collision detection, such as rendering and shading.
Table 4.2 shows the average frame rate during the execution of our application for each model. Two tests were made for each model: one where we performed collision detection, and one where we did not. The results show that our algorithm did not affect the rendering speed of the interactive application at all; environments where the frame rate was below the values considered minimum for interaction would be in this situation with any other collision detection algorithm applied to them. The low frame rate in these situations was due only to other processes in the visualization cycle, such as shading, present in polygonal models such as Street but not in the point clouds. This shows that our technique is clearly not the bottleneck of the visualization cycle, one of the main concerns presented in Section 1.1.
Table 4.2: Average frame-rate during evaluation

Model    | Collision | Simple
Office   | 60 fps    | 60 fps
Church   | 60 fps    | 60 fps
Sibenik  | 60 fps    | 60 fps
Columns  | 30 fps    | 30 fps
Street   | 19 fps    | 19 fps
Room     | 60 fps    | 60 fps
Entrance | 30 fps    | 30 fps
The amount of memory needed to perform collision detection on a very dense point cloud is close to that needed on a simple one. Figure 4.4 shows the memory consumption with the minimum number of loaded slices needed, and confirms the excellent scalability of our technique in this scenario.
As stated before, the preprocessing stage needs to be executed only once for a given configuration, as every generated image is written to disk and can be loaded in further interactions. After the first interaction with any input model, our technique does not need any setup time to work, so the linear growth in time and memory seen in Figures 4.2 and 4.3 applies only once, leaving further interactions with the scalable behaviour shown in Figure 4.4.
Figure 4.3: Graph picturing memory cost in megabytes during the pre-processing stage
of each model and configuration.
Figure 4.4: Memory used by the complete application at a given moment during runtime.
Table 4.3: Memory used for obstacle detection

Model    | 700x700   | 350x350
Office   | 65,54 MB  | 20,48 MB
Church   | 151,55 MB | 16,38 MB
Sibenik  | 249,86 MB | 38,86 MB
Columns  | 45,06 MB  | 8,19 MB
Street   | 122,88 MB | 16,38 MB
Room     | 51,2 MB   | 20,48 MB
Entrance | 69,63 MB  | 16,38 MB
4.4 Polygonal oversampling and Obstacle detection
Part of the complexity of polygonal oversampling, as described in Algorithm 3.2, lies in the resolution needed for the output images, since our approach tries to obtain a perfect point cloud that fills all the pixels of the destination image of size (w, h). Resolution greatly affects the complexity of the produced point clouds, as shown in Table 4.4, where we can see that in some cases, such as Church and Office, the size of the output tripled while we only doubled the resolution. For Street this ratio is different, as the size of the output only doubled.
The number of produced points does not scale linearly with the size of the output, because the algorithm does not create a fixed number of new points at each step. If we created x new points for each input triangle, we would obtain a highly irregular point cloud, since the triangles of a 3D model do not share the same size. Our process creates as many points as necessary to meet the calculated threshold, and this is harder to achieve in some models than in others.
The input model Columns (Figure 4.1d), despite having the second highest triangle complexity at the start, produces the simplest point cloud after oversampling. Most of its triangles are concentrated in a certain area in the middle of the model, with the rest covered by a flat and simple surface, wasting precious resolution. Analysing the produced maps in Figure 4.5, we see that the first map from the left fully utilizes the determined 750x750 resolution, while the second from the left uses 250x250, and that value goes down to 170x170 in the other two. The reduced effective resolution diminishes the need to subsample the triangles, producing a less dense point cloud.
Obstacle detection showed good results in all scenarios, correctly identifying every surface aligned with the z axis, with isolated errors on single pixels that do not affect the collision detection process. However, due to the nature of the algorithm, we are unable to state precisely the limit angle of a surface beyond which it starts to be identified as an obstacle: the closer a surface is to being aligned with the z axis, the more of its pixels are marked as obstacles. These results proved precise enough for the simulation on Church to be performed without errors while the pawn was going through the ramp.
Although loss of precision happens in certain situations where we have localized complexity, in rich and homogeneous environments the oversampling operation provides us with detailed point clouds that produce maps able to represent every obstacle and surface with fidelity. In Figure 3.8 we can see perfectly detailed walls, and Figure 4.5 shows columns and steps with high precision as well, fulfilling the purpose of the oversampling, which was to create visually closed surfaces just like the original polygons, so that the image representation keeps as much detail as possible.
Figure 4.5: Sequence of the first four maps of the input model Columns.
Table 4.4: Polygonal oversampling results

Model   | Original    | 350x350       | 700x700
Office  | 17.353 pts  | 3.088.193 pts | 9.349.585 pts
Church  | 26.721 pts  | 2.246.549 pts | 6.475.125 pts
Sibenik | 47.658 pts  | 1.838.167 pts | 5.199.093 pts
Columns | 143.591 pts | 1.448.383 pts | 2.612.303 pts
Street  | 281.169 pts | 3.606.225 pts | 7.142.361 pts
4.5 Collision detection and precision
Results on collision detection were verified by establishing a fixed route along which the pawn navigates through different situations and scenarios. The first path is in Office, where straight hallways and doors are tested; the second tests climbing stairs and going through a ramp in Cathedral; and the third interaction verifies irregular floors and small, round obstacles in a section of Street.
These scenarios were tested with the usual setup of two different resolutions, and for the steps-and-ramp experiment we used two different values of nslices, so we could study the effect of reading a map at runtime. A high nslices keeps more images in RAM at the same time, while a low value has to read maps from disk more frequently, but consumes less memory.
Tests on Cathedral showed that reading from disk at runtime has a bigger impact on efficiency than storing a higher number of images. When reading a high-resolution image from disk, we notice a sudden drop in the frame rate, particularly when the pawn falls from a higher structure. Increasing nslices to store enough slices to represent the ground floor and the platform on top of the steps, little to no difference was noticed in memory load, and the interaction was a lot smoother. At a low resolution, though, reading from disk at runtime showed no impact on performance.
The technique described in Section 3.2.2 for the narrow phase and collision response does not try to achieve maximum precision, since our objective is a simple interactive application with a real-world model. Nevertheless, our technique presented precise results in all scenarios and resolutions.
Floor collision proved highly reliable, owing to the fact that we can represent a great range of values for each pixel even at low resolutions. Both tested scenarios showed perfect results, with the avatar always walking exactly on top of each structure during the whole walkthrough, as shown in Figure 4.6.
Collisions with obstacles are more affected by resolution, since we rely on pixel fineness to know precisely the position of a given surface. Tests on Office and Street showed the same errors of small-object interference or fake collisions, due to diffuse information about object boundaries. These are more noticeable in the interactions with round objects in Street, also shown in Figure 4.6, where we can see the aliasing creating invisible square barriers around a perfectly round object.
Figure 4.6: Low resolution scenario: small round object interaction and floor collision.
4.6 Evaluation
The developed technique was successfully applied in the presented scenarios with low pre-processing times and memory usage, while not affecting the frame rate or requiring too much memory during the interaction stage. These results also proved scalable, applicable not only to simple polygonal models but to dense point clouds as well. Precision-wise, our technique fits our stated objective of simple and light navigation, completing the simple task of avoiding object overlap while still offering the navigation precision of a traditional height map.
There are, however, several outstanding works in the scientific community that perform collision response with much more detail and precision, and that handle more complex tasks such as deformation and self-collisions [21] [3] [40]. But as stated in Section 3.1, none of these is applied to our specific scenario of point clouds and interactive navigation and exploration, making a comparison on precision between them not only unfair to both sides, but unfeasible as well.
Although we showed that we provide fast and precise information to perform broad-phase collision detection with close to no extra load, applications with higher standards for collision response can still apply other image-based techniques to the narrow phase, with higher precision, such as the work presented by Faure et al. [3], which is completely image-based in its narrow phase and performs collision response with excellence.
The work on point cloud collision detection using bounding volume hierarchies by Figueiredo et al. [13] was tested on one of our experimental scenarios, the Entrance of the Batalha Monastery, on a machine with slightly faster processing speed and RAM than the one used for our walkthroughs. Table 4.5 shows a comparison between both results; we picked Oct 4096 as the reference, since it was shown in the paper to be the configuration with the best performance.
Table 4.5: Comparison between point cloud techniques

                    | Image-based | BVH's
Frame-rate          | 30 fps      | 16 to 30 fps
Total Memory        | 143,36 MB   | 225,44 MB
Pre-processing time | 13,9 s      | 1500 s
The frame rate was disturbed during the collision detection process in the R-tree approach, while it remained steady at 30 fps during the whole execution of our application. The image-based technique also required much less memory, even with a high number of slices loaded into memory. The biggest difference is in the pre-processing times: our approach executed 107 times faster than the BVH approach, and, most importantly, this pre-processing stage must be performed only once for each configuration, since the representation is written to the hard disk and can be reused in further interactions.
As stated in Section 2.3, research on point cloud collision detection is recent, and nonexistent regarding image-based techniques. Our pioneering solution has presented excellent results, not only performing better than other works on point clouds published in the scientific community, but also being flexible enough to be applied to models from CAD, or combined with precise collision response techniques. Without adding any load to the visualization pipeline, our technique is scalable not only with input complexity, but also with hardware capabilities. Image-based collision detection can be performed with our representation on any computer that can render the input model at an acceptable frame rate, without requiring anything meaningful from the CPU or GPU.
Chapter 5
Conclusion and Future work
A new image-based environment representation for collision detection has been presented, using 2.5+D slices of an environment or building along the z axis. These images contain, at each pixel, information about a certain voxel, representing its contents with colors: height map information is stored in the red channel, and obstacles are indicated in the blue channel. This allows a broad-phase collision detection stage that is performed with high efficiency and scalability, where the user can choose the precision according to the computing power at hand simply by adjusting the resolution of the produced images. Point clouds and polygonal models are processed uniformly, making our approach a compelling alternative for fast interaction with massive laser scan data.
Combining well with several existing approaches for narrow-phase collision response, while also presenting a new and precise technique for this task, this work fills a gap in the area of collision detection, exploring a scenario that has recently been receiving more attention. By solving one of the many issues we face while working with virtual reality and simulation, we have presented one more argument in favor of point clouds as a viable alternative to the classical polygon representation in certain specific application scenarios.
5.1 Future work
Our work has fulfilled all of its objectives; however, several aspects can still be improved or fine-tuned, not only to make the technique faster or more precise, but also to explore different applications where this new approach could be used.
As stated in Section 1.1, the advances in today's graphics cards are turning yesterday's special features into the basic kit of an average personal computer. Since GPUs are designed to process images, we can use them to move some of the calculations away from the CPU and improve the algorithm's performance. Our broad-phase technique performs several checks on single pixels of the images loaded into memory. This task could be executed faster with a stencil test that determines whether there are any occupied pixels in the area covered by the pawn; if any pixel passed the test, a collision could be immediately detected.
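As a speculative illustration of this idea, the sketch below uses only fixed-function OpenGL 2.1 calls: every non-black slice fragment writes a 1 into the stencil buffer, and the pawn's rectangle is then read back and scanned. The drawing helper is a stand-in, and reading the stencil buffer back to the CPU is just one possible way of retrieving the result.

    #include <GL/glut.h>

    extern void draw_slice_nonblack(int s); /* assumed pass that discards black pixels */

    int stencil_broad_phase(int s, int x0, int y0, int w, int h)
    {
        static GLubyte buf[64 * 64];  /* pawn area, assumed at most 64x64 pixels */

        glClearStencil(0);
        glClear(GL_STENCIL_BUFFER_BIT);
        glEnable(GL_STENCIL_TEST);
        glStencilFunc(GL_ALWAYS, 1, 0xFF);          /* every drawn fragment... */
        glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);  /* ...writes a 1 */
        draw_slice_nonblack(s);
        glDisable(GL_STENCIL_TEST);

        glReadPixels(x0, y0, w, h, GL_STENCIL_INDEX, GL_UNSIGNED_BYTE, buf);
        for (int i = 0; i < w * h; i++)
            if (buf[i])
                return 1;  /* occupied pixel inside the pawn's area */
        return 0;
    }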
Also, the point coloring algorithm still requires a large amount of memory to simulate rendering buffers in memory, where we project the points onto virtual pixels and check whether they are occupied in order to perform obstacle detection, then assigning the corresponding color. By implementing this as a vertex shader, we would perform exactly the same operation without allocating extra memory, using the buffers themselves: as each vertex was processed, we would read and write colors in the frame buffer, performing the same operation without any extra memory allocation.
Although this representation was tested specifically in a scenario with one rich environment and a single avatar, it may be applied to several different situations. A multi-user scenario is possible, performing inter-object collision detection by cross-checking both image representations for occupied pixels, and by applying the sliced representation not only to the structures but also to the pawns and any other objects in the scene. Applying this layered representation to rich and detailed objects, such as the famous statue of David or the Stanford bunny, and creating an interactive application with it, is viable given the great scalability of our technique.
Another improvement would be to address the loss of effective resolution detected in the test model Columns, with a solution that uses non-uniform resolution images: images that spend more pixels on complex sections of the model and group pixels containing the same information in regions where there are no objects. Ideas similar to image compression techniques such as run-length encoding could be applied to achieve this objective.
Image comparison techniques could also be used to discard slices that do not add much information to the previously loaded slice, not only saving memory but possibly reducing drastically the number of times our application needs to perform heavy disk reads. In scenarios like Church and Sibenik, which have a large number of slices representing similar information about walls and columns, this could even bring the number of extra disk reads down to zero.
Bibliography
[1] http://artist-3d.com - free 3d models.
[2] http://www.3dmodelfree.com - free 3d model warehouse.
[3] Jérémie Allard, François Faure, Hadrien Courtecuisse, Florent Falipou, Christian Duriez, and Paul G. Kry. Volume contact constraints at arbitrary resolution. ACM Transactions on Graphics, 29(4), July 2010.
[4] George Baciu and Wingo Sai-Keung Wong. Hardware-assisted self collision for
deformable surfaces. Proceedings of the ACM symposium on Virtual reality software
and technology, 2002.
[5] George Baciu, Wingo Sai-Keung Wong, and Hanqiu Sun. RECODE: An image-based collision detection algorithm. Computer Graphics and Applications, 1998.
[6] Niels Boldt and Jonas Meyer. Self-intersections with CULLIDE. Eurographics, 23(3), 2005.
[7] Stephen A. Ehmann. Swift: Accelerated proximity queries using multi-level voronoi
marching. Technical report, 2000.
[8] Hoff et al. Fast and simple 2D geometric proximity queries using graphics hardware. Symposium on Interactive 3D Graphics, 2001.
[9] Kockara et al. Collision detection: A survey. IEEE International Conference on Systems, Man and Cybernetics, 2007.
[10] Stephen J. Guy et al. ClearPath: Highly parallel collision avoidance for multi-agent simulation. Eurographics / ACM SIGGRAPH Symposium on Computer Animation, 2009.
[11] Chun-Fa Chang, Gary Bishop, and Anselmo Lastra. LDI tree: A hierarchical representation for image-based rendering. SIGGRAPH, 1999.
[12] François Faure, Sébastien Barbier, Jérémie Allard, and Florent Falipou. Image-based collision detection and response between arbitrary volume objects. Eurographics / ACM SIGGRAPH Symposium on Computer Animation, 2008.
[13] Mauro Figueiredo, João Oliveira, Bruno Araújo, and João Pereira. An efficient
collision detection algorithm for point cloud models. 20th International conference
on Computer Graphics and Vision, 2010.
[14] S. Gottschalk, M. C. Lin, and D. Manocha. OBBTree: A hierarchical structure for rapid interference detection. SIGGRAPH, 1996.
[15] Naga K. Govindaraju, Stephane Redon, Ming C. Lin, and Dinesh Manocha. CULLIDE: Interactive collision detection between complex models in large environments using graphics hardware. Graphics Hardware, 2003.
[16] Bruno Heidelberger, Matthias Teschner, and Markus Gross. Real-time volumetric
intersections of deforming objects. VMV, 2003.
[17] Philip M. Hubbard. Approximating polyhedra with spheres for time-critical collision
detection. ACM Transactions on Graphics, 15(3):179–210, July 1996.
[18] Michael Kazhdan, Matthew Bolitho, and Hugues Hoppe. Poisson surface reconstruction. Eurographics Symposium on Geometry Processing, 2006.
[19] Young J. Kim, Miguel A. Otaduy, Ming C. Lin, and Dinesh Manocha. Fast penetration depth computation for physically-based animation. Proceedings of the SIGGRAPH Symposium on Computer Animation, 2002.
[20] Jan Klein and Gabriel Zachmann. Point cloud collision detection. Eurographics, 23,
2004.
[21] Dave Knott and Dinesh K. Pai. Cinder - collision and interference detection in
real-time using graphics hardware. Proceedings of Graphics Interface, 2003.
[22] Thomas Larsson and Tomas Akenine-Möller. Collision detection for continuously
deforming bodies. Eurographics, 2001.
[23] William E. Lorensen and Harvey E. Cline. Marching cubes: A high resolution 3D surface construction algorithm. Computer Graphics, 21(4), 1987.
[24] Ming C. Lin and John F. Canny. A fast algorithm for incremental distance calculation. IEEE International Conference on Robotics and Automation, 1991.
[25] Lars Linsen. Point cloud representation. 2001.
[26] Dani Lischinski and Ari Rappoport. Image-based rendering for non-diffuse synthetic
scenes. Rendering Techniques, 1998.
[27] Céline Loscos, Franco Tecchia, and Yiorgos Chrysanthou. Real-time shadows for
animated crowds in virtual cities. Proceedings of the ACM symposium on Virtual
reality software and technology, pages 85 – 92, 2001.
[28] MeshLab. http://meshlab.sourceforge.net/, 2011.
[29] Brian Mirtich. V-Clip: Fast and robust polyhedral collision detection. ACM Transactions on Graphics, 17(3), 1998.
[30] Niloy J. Mitra and An Nguyen. Estimating surface normals in noisy point cloud
data. Proceedings of the 19th annual symposium on Computational geometry, 2003.
[31] Matthew Moore and Jane Wilhelms. Collision detection and response for computer
animation. Computer Graphics, 22(4), August 1988.
[32] Karol Myszkowski, Oleg G. Okunev, and Tosiyasu L. Kunii. Fast collision detection between complex solids using rasterizing graphics hardware. The Visual Computer, 11(9):497 – 512, 1995.
[33] Timothy S. Newman and Hong Yi. A survey of the marching cubes algorithm. Computers & Graphics, 2006.
[34] Noralizatul Azma Bt Mustapha, Abdullah Bin Bade, and Sarudin Kari. A review of collision avoidance techniques for crowd simulation. International Conference on Information and Multimedia Technology, 2009.
[35] Jan Ondrej, Julien Pettré, Anne-Hélène Olivier, and Stéphane Donikian. A
synthetic-vision based steering approach for crowd simulation. ACM Transactions
on Graphics, 29(4), July 2010.
[36] Craig W. Reynolds. Flocks, herds, and schools: A distributed behavioral model.
SIGGRAPH, 1987.
[37] S. Kimmerle, M. Nesme, and F. Faure. Hierarchy accelerated stochastic collision detection. VMV, 2004.
[38] Jonathan Shade, Steven Gortler, Li wei He, and Richard Szeliski. Layered depth
images. Proceedings of the 25th annual conference on Computer graphics and interactive techniques, 1998.
[39] Szymon Rusinkiewicz and Marc Levoy. QSplat: A multiresolution point rendering system for large meshes. SIGGRAPH, 2000.
[40] M. Teschner, S. Kimmerle, B. Heidelberger, G. Zachmann, L. Raghupathi, A. Fuhrmann, M.-P. Cani, F. Faure, N. Magnenat-Thalmann, W. Strasser, and P. Volino. Collision detection for deformable objects. Eurographics State-of-the-Art Report, 2004.
[41] S. Uno and M. Slater. The sensitivity of presence to collision response. Virtual Reality Annual International Symposium, 1997.
[42] Xinyu Zhang and Young J. Kim. Interactive collision detection for deformable models using streaming AABBs. IEEE Transactions on Visualization and Computer Graphics, 13(2), March 2007.