Collision Overload: Reducing the Impact in Real-time
Physics
Final Report
Ian Robert Ballantyne
Supervisor: Tony Field
Second Marker: Paul Kelly
Copyright © Ian Ballantyne, 2007. All Rights Reserved
To my Parents, Jackie and Colin,
and my brother Jamie
for their continued support throughout my studies.
Acknowledgements
I’d like to thank the following people for their contributions:
• My supervisor, Tony Field - For helping me focus my ideas and getting excited about the
results.
• My friends, Will, John, Joel and Richard - For their comments and discussions on colliding
objects and the challenges of implementing a solution.
• My fellow computing students, Dave and Islay - For their interest in my work and ideas.
• The Computing Support Group - For providing me with a dedicated machine to “crash
boxes together”.
Trademarks
The following trademarks are mentioned at various points during this paper:
• DirectX and Direct3D are trademarks of Microsoft Corporation.
• PhysX is a trademark of AGEIA Technologies Inc.
• Havok FX and Havok Physics are trademarks of Havok.com Inc.
• GeForce, SLI and CUDA are trademarks of NVIDIA Corporation.
• Radeon is a trademark of ATI Technologies Inc
Abstract
In the field of real-time physics simulations we have a dilemma. Users want to experience
realistic object interactions at a high “level of detail”. Achieving this involves a trade-off between
performance and accuracy for the designers. The solution is to find a “level of detail” that
provides suitable performance for a nominal execution cost. Designers are aware of the sporadic
nature of physics and so “err on the side of caution” by using simpler physical representations.
This project attempts to reduce the gap in performance between normal execution and the
more complex cases where many objects collide together at a single point. We have called this
the collision overload problem. We identify “collision detection” as the main bottleneck in physics
simulations. We propose a method for dynamic switching among different “levels of detail”, based
on testing safety conditions (encapsulation levels), that ensure the stability of the simulation.
The success of the technique relies on identifying when to switch models and investigating how
the different changes affect performance. The report proposes the concept of global, group and
local scope for decision making and evaluates the alternatives in an implementation based on the
Bullet physics engine. We show that our technique can improve the situation and we also provide
a framework called Scatter that can be used to further analyse dynamic model reduction.
Contents
1 Introduction
1.1 The Collision Overload Problem
1.2 Approach
1.3 Contributions
2 Background
2.1 Physics System Terminology
2.1.1 Basics of Physics
2.1.2 Newtonian vs. Lagrangian Dynamics
2.1.3 Rigid-Bodies vs. Deformable-Bodies
2.2 Physics Concepts
2.2.1 Linear Complementarity Problem (LCP)
2.2.2 The Gilbert-Johnson-Keerthi Distance Algorithm (GJK)
2.3 Collision Detection
2.3.1 Broad-phase
2.3.1.1 Bounding Spheres
2.3.1.2 Oriented Bounding Boxes (OBB)
2.3.1.3 Axis-Aligned Bounding Boxes (AABB)
2.3.2 Narrow-phase
2.3.3 Continuous/Discrete Collision Detection (CCD & DCD)
2.4 Collision Response
2.4.1 Resolving Contact
2.5 Simulation Loops
2.6 Game Loops
2.7 Real-time Physics Simulator Design
2.7.1 Physics Time-step In Detail
2.7.1.1 Unconstrained Motion
2.7.1.2 Collision Detection
2.7.1.3 Non-Contact Constrained Motion
2.7.1.4 Collision Response: Contact Constraints
2.7.1.5 Integrators
2.7.2 Modular Design
2.8 Existing Physics Technologies
2.8.1 Commercial
2.8.1.1 Havok Physics
2.8.1.2 AGEIA PhysX (Formerly Novodex)
2.8.2 Open Source
2.8.2.1 Open Dynamics Engine
2.8.2.2 Bullet Physics Library
2.9 Level of Detail for Physics
2.10 Hardware for Physics
2.10.1 GPU
2.10.1.1 Pipeline Architecture
2.10.1.2 Vertex Processor
2.10.1.3 Fragment Processor
2.10.1.4 Geometry Shader
2.10.1.5 CUDA
2.10.1.6 Appropriate Physics Utilisations
2.10.1.7 Limitations
2.10.2 PPU
2.10.2.1 Architecture
2.10.2.2 Limitations
2.10.2.3 Appropriate Physics Utilisations
2.11 Report Terminology
3 Investigation
3.1 Physics Engine Analysis
3.1.1 Case Study: Bullet Physics Library
3.1.1.1 Modular Design
3.1.1.2 Algorithms
3.1.1.3 Improved Performance
3.1.2 Bottlenecks of Physics Simulators
3.1.2.1 Full Narrowphase Intersection Testing
3.1.2.2 Unused Calculation Results
3.1.2.3 Maintaining Structures
3.1.2.4 Excessive Contact Points
3.1.2.5 Complex Intersection Algorithms
3.1.2.6 Generally Avoiding Bottlenecks
3.2 The Collision Detection Bottleneck
3.3 Solution Methods
3.3.1 Parallelising Calculations
3.4 Level of Detail
3.4.1 User Perception
3.4.2 Encapsulation Levels
3.4.3 Requesting a Level of Detail
3.4.4 Global, Group and Local Policies
3.4.5 Investigation by Implementation
4 Implementation
4.1 The “World” Model
4.2 Timing and Game Loops
4.3 Built-in Profiling
4.4 Scatter API
4.5 Integrating Encapsulation Levels
4.5.1 Hybrid World
4.5.2 btHybridDynamicsWorld
4.5.2.1 Additions To the Bullet Loop
4.5.2.2 Collecting Local Heuristics
4.5.2.3 Problems in Implementation
4.5.3 Successful Switching
5 Evaluation
5.1 Performance Evaluation
5.1.1 Test Conditions
5.1.2 Test Scenarios
5.1.2.1 Scene 1
5.1.2.2 Scene 2
5.1.3 Performance Measures and Expectations
5.1.4 Expectations
5.1.5 Results
5.1.5.1 Test 1 and Test 2
5.1.5.2 Test 3 and Test 4
5.1.6 Observations
6 Summary and Conclusion
6.1 Performance
6.2 Scatter
6.3 Level of Detail
6.3.1 Encapsulation Levels and Model Switching
6.4 Implementation
6.5 Future Work
6.6 Discussion
Bibliography
List of Figures
1.1 Left: A stable tower of simulated blocks. Right: Disrupting the tower.
1.2 The Collision Overload Problem exhibited in a game
2.1 A diagram of the Minkowski sum of a circle and a square
2.2 Three examples of bounding volumes: Spheres, OBBs and AABBs
2.3 The interactions of modern game loop
2.4 Pseudo Code: Simplified Physics Time-step
3.1 The modular design concept by Erleben (a), applied to Bullet (b)
3.2 Breakable objects diving into constituent elements.
3.3 Possible data parallelisation in Bullet
3.4 A diagram showing “Encapsulation Levels” of a lamp and a mug
3.5 The problems of increasing level of detail without safety conditions
3.6 The problems of decreasing level of detail without safety conditions
3.7 Global, group and local decision making for model switching
4.1 The world model used in a simulation application
4.2 The main Scatter loop and timing system.
4.3 The flow of requests and switches across the Scatter/Bullet boundary.
5.1 Scene 1: Left: the Hunter mesh. Right: the Fighter mesh
5.2 Scene 2: Spaceships (Hybrid Models), Asteroids (Basic Spheres) and the Sun (Large Sphere).
5.3 Scene 1: Key frames of the Hybrid Build
5.4 Scene 1: Key frames of the Hybrid Build
5.5 Scene 1: Manifolds and Narrowphase CPs. Top: Default Build. Bottom: Hybrid Build.
5.6 Scene 1: Detail: Manifolds and Narrowphase CPs. Top: Default Build. Bottom: Hybrid Build.
5.7 Scene 2: Manifolds and Narrowphase CPs. Top: Default Build. Bottom: Hybrid Build.
5.8 Scene 2: Total Objects and Contact Points (Both builds)
5.9 Scene 2: Peak Detail: Total Objects (Both builds)
5.10 Scene 2: Physics profiling. Top: Default Build. Bottom: Hybrid Build.
5.11 Scene 2: Peak Detail: Physics profiling. Top: Default Build. Bottom: Hybrid Build.
Chapter 1
Introduction
“In physics, you don’t have to go around making trouble for yourself - nature does it
for you.”
Frank Wilczek
Physics has long been used to bring structure to our seemingly chaotic view of nature. Mechanics,
more specifically dynamics, has allowed us to visualise the real-world in terms of forces and
direction, turning moving objects into variables of mass, momentum and velocity. By combining
geometry, classical mechanics and numerical methods we can create models that represent a
subset of the world, allowing us to predict the movements and impacts of objects contained
within. The improving performance of computers has led to more realistic simulations of physics
being obtainable on home computers.
Physics-Based Simulations are used for everything from predicting the motion of particles in
fluids to adding realistic motion to computer animation. Simulators are concerned with predicting
the kinematics of objects for every step of the simulation. To do this, they must calculate all
the forces obtained from constraints (restrictions of motion) and resulting forces from collision
detection (see section 2.3) with other objects. In this model every object has the potential to
interact with every other object and in a crude implementation this can result in $\frac{n(n-1)}{2}$ pairs
of collisions for n objects, potentially an $O(n^2)$ problem. Simulators have the complicated job
of balancing accuracy and efficiency. They make sure the models they use are as believable as
possible and that they are able to calculate the results quickly enough. Problems occur when the
models used require more time to compute than is available. The extent of this depends on the
context of the simulator. Recorded animations using physics are far more interested in accuracy
because realism and resolution are the motivation. Simulators can generally be categorised by the
following properties:
• Off-line or Real-time
• Scripted or Interactive
Off-line simulators are simpler to deal with. The lack of deadlines for calculations means that
greater detail can be used to model the real-world, at the cost of taking longer to calculate the predicted motion.
Movie production animation is an example of an off-line simulator. Off-line simulators are usually
scripted too. The animators describe the motion and it is the job of the simulator to make
the animation believable. Real-time simulators are the main area of focus for performance
improvements. Techniques such as using Axis-Aligned Bounding Boxes (AABBs) with a sweep
and prune method (described in section 2.3.1.3) are used to improve performance by reducing the
order of collision detection pairs to a worst case lower bound of Ω(n log n). To make matters worse,
real-time simulators are often interactive, which means they are affected by non-deterministic
input. Interactions can affect the complexity of the simulation at any point in time. Consider
the following situation:
“A tower of objects is stacked one object on top of another. The only contact resolution for each object is the equal forces applied for each neighbouring object above and
below. A user decides to remove an object from the middle of the tower. Depending
on the force and direction applied to move the object, all other objects have the
potential to collide with each other whilst falling. The physics of interactions must
be solved before the next step in the simulation. To make the action responsive, it
must be completed in between $\frac{1}{60}$ and $\frac{1}{30}$ seconds¹”
Figure 1.1: Left: A stable tower of simulated blocks. Right: Disrupting the tower.
This example outlines a typical scenario for a real-time physics simulator. Movements and
interactions can be sudden and unpredictable, making it difficult to identify situations when
complicated collisions occur. In general, efficiency and workload management are well
established areas, but there is less focus on sporadic cases. By this we mean cases in physics
simulators that are rare, but have a large impact on the system when encountered. A large
group of objects converging on a single point and colliding is one example of a sporadic case.
For the benefit of the report, we will call this case “The Collision Overload Problem”.
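The pair-count argument made earlier in this chapter can be illustrated with a minimal sketch. It compares a crude all-pairs test against a single-axis sweep and prune over AABBs. The function names and the box representation (a tuple of minimum and maximum corners) are assumptions made for illustration only; they are not taken from Scatter or Bullet.

```python
from itertools import combinations

def overlaps(a, b):
    """True when two AABBs, each given as (min_corner, max_corner), intersect on every axis."""
    return all(a[0][k] <= b[1][k] and b[0][k] <= a[1][k] for k in range(len(a[0])))

def brute_force_pairs(boxes):
    """Crude approach: test every one of the n(n-1)/2 candidate pairs."""
    return {(i, j) for i, j in combinations(range(len(boxes)), 2)
            if overlaps(boxes[i], boxes[j])}

def sweep_and_prune_pairs(boxes, axis=0):
    """Sort by the lower bound on one axis, then test only boxes whose intervals overlap on it."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][axis])
    active = []   # boxes whose interval on the sweep axis is still open
    pairs = set()
    for i in order:
        lo = boxes[i][0][axis]
        # Discard boxes whose interval ended before the current box begins.
        active = [j for j in active if boxes[j][1][axis] >= lo]
        for j in active:
            if overlaps(boxes[i], boxes[j]):
                pairs.add((min(i, j), max(i, j)))
        active.append(i)
    return pairs

# Hypothetical scene: two overlapping clusters and one isolated box.
boxes = [
    ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    ((0.5, 0.5, 0.5), (1.5, 1.5, 1.5)),
    ((3.0, 0.0, 0.0), (4.0, 1.0, 1.0)),
    ((3.5, 0.2, 0.2), (3.8, 0.8, 0.8)),
    ((10.0, 10.0, 10.0), (11.0, 11.0, 11.0)),
]
print(brute_force_pairs(boxes) == sweep_and_prune_pairs(boxes))
```

Both functions report the same overlapping pairs, but the sweep only compares boxes that share an interval on the chosen axis. When many objects converge on a single point, almost every interval overlaps and the sweep degrades towards the all-pairs count, which is one way of viewing the sporadic case described above.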
1.1 The Collision Overload Problem
The motivation for this project arose from an observation of a game running on a home computer.
The scenario is as follows:
1 Figures quoted reflect a visual frame rate of between 30 and 60 frames per second, a suitable rate for realistic
interaction.
“In the game, the objective is to move through the levels collecting items and fending
off enemies. The game takes either a first-person or third-person perspective and
exhibits a world using the latest graphical techniques. The unique concept of the
game is that, unlike most shooting games, you have no weapons or explosive devices;
instead your character has the telekinetic ability to pick up objects and, using
the power of the mind, launch them at a selected target.”
Gamers are renowned for pushing the boundaries of what is feasible in game play, and this gamer attempted
to launch all objects in the local area at a single target. The result was a significant drop in frame-rate as the objects converged on the target, followed by crunching sounds as they collided and
eventually dispersed. The frame-rate then returned to a regular interval and the game ran at the
same performance as before the collision. To clarify, this is the situation that occurred:
“By directing a collection of n objects at a single point and launching them with a velocity, the frame performance dropped as the objects began to converge and was at its
lowest at the point where they began colliding. As they left the collisions in different
directions, the frame performance returned to the same level as before the interaction.”
Developers make it their goal that a user’s experience is as smooth and fluid as possible and they
ensure that the detail they provide in games is suitable for the recommended machines used to run
them. Generally gamers are very quick to spot situations where performance drops and are often
critical of low frame-rates and disruptions to play. Developers are therefore tied in a trade-off between
performance and accuracy.
The goal of the project:
“We want to be able to reduce the impact of sporadic cases, like sudden convergence, on the performance of the system, thus allowing more
complex physical representations to be used.”
1.2 Approach
The approach taken in this report was to research the current techniques used in physics simulations and locate an area where improvements could be made. After finding a suitable area from
the background and investigation, the aim was to implement the technique in a framework, referred to as “Scatter” in this report. Using quantitative and qualitative data from running the
implementation in a scenario similar to the collision overload problem, the aim was to evaluate
the improvements in comparison to the same scenario without the improvement. The approach
can be broken into the following stages:
Physics simulator research - Finding existing physics simulators, understanding the structure and techniques used. Reading documentation and research papers describing how the
simulators are used and observing examples utilising the engines.
Implementing a working demonstration - Experiment with a physics engine and combine
it with a renderer to give insight into how a framework needs to be designed. Aim to identify
the difficulties of doing so and take requirements for the framework design.
Figure 1.2: The Collision Overload Problem exhibited in a game
Physics Framework - Design and build a framework based on the requirements and improvements from the demonstration tool. Ensure the framework is able to record and output
quantitative data. Select the appropriate tools needed to achieve the requirements.
Recreate the problem in the framework - Create a scenario in the framework that reflects
the original problem. Aim to simulate a game environment, but with additional controls
to identify where problems exist. Examples include the ability to pause the simulation or
visual output to identify instances of collision.
Identify a solution to solve the problem - Find a topic that is related to the motivation
scenario and aim to expand on it.
Implement solution in framework - Modify a version of the framework so that it is able to perform the solution. It must be clear which components belong exclusively to the solution, in order to
evaluate the framework with and without it.
Compare quantitative output of framework - Run the scenario in both frameworks and
compare the output data. Record the values in graphs and charts and identify where the
solution is working and whether it improves the performance.
Discuss visual output of framework - In relation to the techniques used, discuss whether
there are noticeable changes when running the system. Analyse the aspects a user may
observe when viewing the simulation.
1.3 Contributions
From the research performed in this project I have made the following contributions:
An Investigation into the aspects that affect simulator performance - I have outlined
two major areas of research to address the problem seen in collision overload: parallelisation of calculations and level of detail. I have focused on the area of level of detail, explicitly
“dynamic model reduction”, and drawn comparisons with model reduction techniques from
graphics. I have outlined how they could be applied to physics simulators.
Designed framework for investigating performance - From the initial research I have designed and implemented a framework, “Scatter”, for the purposes of prototyping and testing
physics techniques. The framework mimics game engine design and records performance
data for analysis. It allows scenarios to be played under different conditions to further
investigate level of detail.
Proposed and implemented a technique for dynamic model reduction called
“Model Switching using Encapsulation Levels” - Stemming from the investigation
into level of detail, I have proposed a technique to address the motivation scenario. I
discuss the success of the concept and the areas of potential application.
Investigation into a system for requesting performance improvements - I have discussed
using policies to preemptively trigger performance improvements for physics systems. I
have investigated local policies such as proximity and used them in an implementation of
dynamic model reduction.
Expanded on physics simulator analysis - By looking at physics simulators from a
modular perspective, I have used an existing physics simulator called “Bullet” as a case
study for applying research techniques. By implementing my technique in
Scatter (which runs using Bullet), I have discussed the hurdles and commented on the
suitability of using an existing simulator as a learning tool.
Chapter 2
Background
“Move, collide, resolve, repeat...”
Ancient Physics Proverb
The aim of the background is to touch on many of the techniques required to understand the
workings of a physics simulator. As this area of real-time physics is very specialised, the content
provided is intended as a “point of reference” for further reading. The following materials provide
details of algorithms covered by this chapter. “Game Physics” by David Eberly ranges from
fundamental physics to building a physics engine [17]. Although written with games in mind,
the comprehensive appendix of linear algebra ensures that the information is suitable for anyone
with an interest in building a physics simulator. Other good resources include a book from the
same series called Collision Detection in Interactive 3D Environments [45]. The book by Gino
van den Bergen describes algorithms used in collision detection, an area commonly recognised as
being the bottleneck of physics simulations. The concluding section of the background, Report Terminology (2.11),
is a reference for the terms commonly used throughout the report.
2.1 Physics System Terminology
2.1.1 Basics of Physics
Starting from basics, motion in physics is described by the following ordinary differential equation
(ODE):
\[
\frac{d}{dt}
\begin{pmatrix} x(t) \\ R(t) \\ P(t) \\ L(t) \end{pmatrix}
=
\begin{pmatrix} v(t) \\ \omega(t) R(t) \\ F(t) \\ \tau(t) \end{pmatrix}
\]
Location - x(t) - Where objects and particles are located in a world coordinate system.
Velocity - v(t) - The rate of change of displacement of particles with components in three dimensions.
Mass - M - The mass of the particle/body.
Orientation - R(t) - The direction objects and particles face in a world coordinate system.
Linear and Angular Momentum - P(t) and L(t) - The momentum of motion relative to the world and around the centre of mass of an object.
Torque - τ(t) - Force dependent on the centre of mass.
Force - F(t) - Internal and external forces from fields, gravity and contact with other objects.
Inertia Tensor - I(t) - The distribution of mass in a body relative to the centre of mass (CofM).
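As a rough illustration of how the quantities above fit together, the following sketch evaluates the right-hand side of the ODE for a single rigid body and advances the state with one explicit Euler step. It assumes the common formulation in which v = P/M, ω = I⁻¹L with the world-space inverse inertia built from a body-space one, and the ω(t)R(t) term is read as the cross-product matrix of ω applied to R. All function names are hypothetical, and explicit Euler is used only because it is the simplest integrator to show.

```python
def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def star(w):
    """Cross-product matrix: star(w) applied to r gives w x r."""
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

def derivative(state, mass, inv_inertia_body, force, torque):
    """Right-hand side of the ODE: d/dt (x, R, P, L) = (v, star(omega) R, F, tau)."""
    x, R, P, L = state
    v = [p / mass for p in P]
    # Assumed relation: world-space inverse inertia is R * Ibody^-1 * R^T.
    inv_inertia_world = mat_mul(mat_mul(R, inv_inertia_body), transpose(R))
    omega = mat_vec(inv_inertia_world, L)
    return v, mat_mul(star(omega), R), list(force), list(torque)

def euler_step(state, deriv, dt):
    """One explicit Euler step. R slowly drifts from a pure rotation and would need re-orthonormalising."""
    x, R, P, L = state
    dx, dR, dP, dL = deriv
    return (
        [x[i] + dt * dx[i] for i in range(3)],
        [[R[i][j] + dt * dR[i][j] for j in range(3)] for i in range(3)],
        [P[i] + dt * dP[i] for i in range(3)],
        [L[i] + dt * dL[i] for i in range(3)],
    )

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
state = ([0.0, 0.0, 0.0], identity, [2.0, 0.0, 0.0], [0.0, 0.0, 0.0])
deriv = derivative(state, mass=2.0, inv_inertia_body=identity,
                   force=[0.0, -4.0, 0.0], torque=[0.0, 0.0, 0.0])
new_state = euler_step(state, deriv, dt=0.5)
print(new_state[0], new_state[2])
```

In this example the body has no angular momentum, so only the position and linear momentum change over the step.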
Physics simulators use either forward or inverse kinematics and dynamics to solve motion. We
can think of forward as “I am at time t and need to go forward to get to time t + dt” and inverse
as “I am going to finish at time t + dt and need to go backwards to get to t”. Inverse dynamics is
noted as being an easier problem to solve [18], which is why the simulators in this report refer
to “Integration Transformation” as a last step in calculating final motion. David Baraff, Andrew
Witkin and Michael Kass ran a course on “Physically Based Modelling” at SIGGRAPH, most
recently in 2003 [23]¹. The course had been run as far back as 1995, based on content from papers
they had each published (many of which are the basis for most modern physics engines).
Reading the course notes or publications with similar content (Eberly’s “Game Physics” [17]) will
give a practical understanding of how the mechanics can work together. The course covers the
relevant basis in differential equations and particle dynamics to understand most of the papers
referred to in this report.
2.1.2 Newtonian vs. Lagrangian Dynamics
Dynamics is the area of physics that describes how particles move when external forces act upon
them. This covers Newton’s second law F = ma where F is the applied force, m is the mass
of the particle and a is the acceleration. Newtonian dynamics describe combined external and
constraining forces working on objects. When using Newtonian laws, F includes the constraining
forces like friction and contact forces. Eberly [17] suggests that although this makes Newtonian
dynamics appropriate for general-purpose physics engines, the difficulties arise in modelling friction effectively. He notes that Lagrangian dynamics are more suited to frictional forces because
the equations can be formed in a way that removes the constraining forces. Lagrangian dynamics
use energy in their formulation, transferring between potential and kinetic. The choice of dynamics system to use is based on the purpose of the physics engine. Certain dynamics are better suited
to modelling certain characteristics of physics, for example, Euler’s equations of motion better
represent axis rotation than kinematics. Eberly shows a preference for using Lagrangian dynamics.
He mentions that although it takes additional programming time to construct a complete system,
using Lagrangian dynamics is more stable and efficient.
2.1.3 Rigid-Bodies vs. Deformable-Bodies
A rigid-body is a region that has mass and dimension. The three basic examples of rigid-bodies
are a single particle, a particle system and a continuum mass. The assumption made about
rigid-bodies is that the particles that compose the body do not move relative to each other.
This concept lends itself very nicely to modelling objects in simple physics systems. The visual
representation of objects in the graphics world is performed using vertices, which connect to
make polygons. A simple box constructed with eight vertices and six faces can be represented
by a similar system of eight particles in the physics world. The human perception that a box
will not melt, collapse or inflate allows us to assume that the particles that make up the corners
of the box will not move relative to each other and therefore form a rigid-body.
1 The reference material available is from the 2001 course.
A deformable-body on the other hand, is a body that can change shape or volume when an
external force is applied to it. The particles that compose the body are able to move relative to
each other. This adds additional complexity to the physics calculations, which have to take into
account the distances between the composing particles. Deformable-bodies can be modelled
using Finite Element Methods (FEM) to approximate the deformation function. This method
has been used in real-time to simulate fractures of stiff materials [39]. It has been suggested that
this method is difficult due to the following reasons:
1. The time-step for dynamic integration has to be reduced to simulate collisions.
2. The size of the problem is an order of magnitude higher than the two-dimensional problem.
Using FEM produces accurate results and techniques have improved to perform real-time deformation [10]. Other methods of modelling deformable bodies include using mass-spring models.
They are easier to compute and the calculations are less expensive [19].
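To give a feel for why mass-spring models are comparatively cheap, here is a minimal sketch of a single damped spring joining two particles in one dimension, advanced with semi-implicit Euler. The function name, parameters and constants are illustrative assumptions and are not taken from the cited work.

```python
def spring_step(x, v, masses, rest_length, stiffness, damping, dt):
    """Advance two 1D particles joined by a damped spring using semi-implicit Euler."""
    stretch = (x[1] - x[0]) - rest_length
    rel_vel = v[1] - v[0]
    # Force on particle 0; particle 1 receives the equal and opposite force.
    f = stiffness * stretch + damping * rel_vel
    forces = (f, -f)
    # Update velocities first, then positions from the new velocities.
    v = [v[i] + dt * forces[i] / masses[i] for i in range(2)]
    x = [x[i] + dt * v[i] for i in range(2)]
    return x, v

# A spring stretched to twice its rest length relaxes back towards rest.
x, v = [0.0, 2.0], [0.0, 0.0]
for _ in range(2000):
    x, v = spring_step(x, v, masses=(1.0, 1.0), rest_length=1.0,
                       stiffness=10.0, damping=1.0, dt=0.01)
print(x[1] - x[0])
```

Each step costs a handful of arithmetic operations per spring, in contrast to the rigid-body assumption, where the distance between the particles is simply held fixed.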
The choice of body is dependent on the application for which it is required. Deformable bodies
are required to simulate cloth, skin, plastics and breaking objects. They are very much suited
to the animation of characters wearing clothes, rendering realistic looking flags and simulating
damage on vehicles due to collision. A combination of rigid bodies and deformable bodies can
occur in the same system, for example a solid table covered in cloth. Bridson, Fedkiw and
Anderson demonstrated using both deformable mass-springs and rigid bodies to model cloth at
certain stages of collision known as “impact zones” [2]. The method was proposed by Provot who
observed that in bunched areas of cloth, friction restricts relative motion [35].
Rigid-bodies are a much simpler area to focus on. Deformable bodies could experience the
same collision overload problem, but this project discusses techniques relevant to rigid-bodies.
For more examples of real-time deformable bodies see the “Real Matter” demonstration [38].
2.2
Physics Concepts
The exact mathematical details of how to model velocities, forces and momentum of interactions
are outside the scope of this report and can be referenced independently. Instead the aim is to
highlight the mathematical techniques regularly referred to by physics engine designers, with the
purpose of understanding what types of calculations they solve.
2.2.1
Linear Complementarity Problem (LCP)
The term Linear Complementarity Problem (LCP) crops up regularly in discussions about Constraint
Solvers (see 2.7.1.4). LCP solvers are software implementations that solve LCPs, usually for
collision response or more specifically contact forces (see 2.4). An LCP is a general problem in
linear algebra that attempts to find the column vectors w and z satisfying the following conditions²:
\[ Q = w - Mz, \qquad w \ge 0, \qquad z \ge 0, \qquad w \cdot z = 0 \]
where Q is a known n-dimensional column vector and M is a known n × n matrix.
2 Further information on LCPs can be found in [13].
Knowledge of LCPs is relevant to this project since collision response methods described in
this report show varying speed and accuracy when implemented using LCP solvers. Rewriting or
designing an LCP solver may be beyond the scope of the project; however, selecting an appropriate
implementation can itself be an improvement.
One possible algorithm for solving an LCP is the “Lemke-Howson Algorithm”, which can find
non-trivial solutions to LCPs. It is often referred to as a pivoting method. Pivoting methods use
a finite number of steps and require a recursive solution [26]. Another pivoting method is
Dantzig’s algorithm, described in Baraff’s 1994 paper [5]. These methods are considered accurate,
but are less desirable for real-time physics because they can sometimes fail to produce results
or suffer from rounding errors. Iterative methods such as Projected Gauss-Seidel instead improve
the result with every iteration, and are usually preferred because of their ability to produce
close results when interrupted early [26].
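To make the iterative approach concrete, the following is a minimal sketch of a Projected Gauss-Seidel iteration for the LCP defined above, rewritten as w = Mz + Q. The function name and the choice of a fixed iteration count are assumptions made for illustration; the sketch also assumes M has non-zero diagonal entries, and convergence is only expected for well-behaved matrices such as the symmetric positive definite one used in the usage example.

```python
def projected_gauss_seidel(M, q, iterations=50):
    """Approximately solve w = M z + q with w >= 0, z >= 0 and w . z = 0.

    M is a list of rows (non-zero diagonal assumed) and q is a list.
    The iteration count is a hypothetical tuning parameter.
    """
    n = len(q)
    z = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Residual of row i, using the most recently updated values of z.
            r = q[i] + sum(M[i][j] * z[j] for j in range(n))
            # Gauss-Seidel update, then projection onto the constraint z_i >= 0.
            z[i] = max(0.0, z[i] - r / M[i][i])
    w = [q[i] + sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
    return z, w


# Usage: a small 2 x 2 problem whose exact solution is z = (1/3, 1/3), w = (0, 0).
z, w = projected_gauss_seidel([[2.0, 1.0], [1.0, 2.0]], [-1.0, -1.0])
```

Stopping the loop after only a few iterations still returns a usable approximation of z, which is the property that makes this family of methods attractive for real-time use.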
2.2.2
The Gilbert-Johnson-Keerthi Distance Algorithm (GJK)
The Gilbert-Johnson-Keerthi Algorithm (GJK) is a method of calculating the distance between
convex objects [24]. It is used in systems that have relaxed rules on non-penetrating objects:
penetrations are allowed, with the intention of finding the point of contact (described
in 2.3.2). The new object positions are solved for time t2, then GJK is used to calculate the
amount of inter-penetration. The movements of the objects are then reversed to the time of
contact, t1, where t2 > t1. Van den Bergen describes an improved implementation of GJK
for use in collision detection techniques [7]. It is utilised because it is simple to implement and
applies to convex objects such as boxes, spheres, cylinders, convex hulls and Minkowski
sums³ of convex objects. The GJK method can calculate distances in the following form.
Given a distance d(A, B) between two objects:
\[ d(A, B) = \min\{\, |x - y| : x \in A,\ y \in B \,\} \]
where x is a point on A and y is a point on B. Considering the points a and b with the shortest
distance between them:
\[ d(A, B) = |a - b| \]
Using the Minkowski sum of A and −B, written A − B = {x − y : x ∈ A, y ∈ B}, this can be
written as:
\[ d(A, B) = |v(A - B)| \]
where v(C) denotes the point of C closest to the origin. GJK iteratively calculates approximations
v_k of v(A − B) in k iterations, where k is a finite number, to produce the distance between the
objects⁴.
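The only operation GJK needs from each shape is a support mapping, so a small illustration may help. The sketch below is not a GJK implementation; it only shows, for 2D convex polygons given as vertex lists, how a support point of A − B is obtained from the supports of A and B, and how a single direction can prove that two objects are disjoint. All function names are hypothetical.

```python
def support(points, d):
    """Return the vertex of a convex polygon that lies furthest along direction d."""
    return max(points, key=lambda p: p[0] * d[0] + p[1] * d[1])


def support_difference(a, b, d):
    """Support point of A - B along d: support of A along d minus support of B along -d."""
    pa = support(a, d)
    pb = support(b, (-d[0], -d[1]))
    return (pa[0] - pb[0], pa[1] - pb[1])


def separated_along(a, b, d):
    """True if direction d proves that A and B are disjoint.

    If even the furthest point of A - B along d has a negative dot product
    with d, the origin lies outside A - B, so no point of A equals a point of B.
    """
    s = support_difference(a, b, d)
    return s[0] * d[0] + s[1] * d[1] < 0


# Usage: two unit squares, one unit apart along the x axis.
square_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_b = [(2, 0), (3, 0), (3, 1), (2, 1)]
disjoint = separated_along(square_a, square_b, (1, 0))  # True
```

A full GJK implementation repeatedly chooses new directions from such support points until it has found v(A − B); that search is omitted here.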
3 A Minkowski sum is the addition of two polygons by considering all possible sums of a point of
one polygon and a point of the other.
4 See [24] for a full proof.
Figure 2.1: A diagram of the Minkowski sum of a circle and a square
2.3
Collision Detection
Collision Detection is considered such an important aspect of physics simulations that software
libraries have been constructed to perform collision detection alone. In the context of this paper
collision detection refers to the process of comparing two rigid-bodies, detecting whether they
penetrate or will penetrate and then calculating the exact points of contact. For contact between
n objects, it would seem the calculations required are of the order O(n²), since there are
n(n − 1)/2 “collision pairs” (pairs of objects that require collision testing). The number of tests
is clearly too high for large real-time systems, so collision detection software in general breaks
down the calculation into two basic phases: a broad-phase and a narrow-phase. It is the job of the
broad-phase to reduce the number of intersection tests required by using various culling
techniques [18]. The narrow-phase then identifies which pairs of objects are intersecting and in
another step calculates the point of intersection.
2.3.1
Broad-phase
The broad-phase usually uses bounding primitives to test for intersections more quickly. If a
pair of bounding primitives is found to intersect then the pair of objects is added to a set that
will be tested by the narrow-phase. If the bounding primitives don’t intersect then the objects
cannot intersect. Calculating whether two complex objects are intersecting is far more
computationally expensive than testing their bounding primitives, which is why this phase is so
effective. The three
most common types of bounding primitives are bounding spheres, axis-aligned bounding boxes
(AABB) and oriented bounding boxes (OBB).
Figure 2.2: Three examples of bounding volumes: Spheres, OBBs and AABBs
2.3.1.1
Bounding Spheres
Culling with bounding spheres is based on the concept that two spheres don’t overlap if the
distance between their centres is larger than the sum of their radii, shown in the following
inequality:
\[ |C_2 - C_1| > r_2 + r_1 \]
or, to avoid having to calculate the square root:
\[ |C_2 - C_1|^2 > (r_2 + r_1)^2 \]
Although the calculation is simple, it still requires a comparison of every pair of objects. The
culling doesn’t take into account knowledge of location (spatial coherence) or the predictable
small changes of distance over short periods of time (temporal coherence) [17].
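The squared-distance form of the test translates directly into code. The sketch below is illustrative only; the function names and the representation of a sphere as a (centre, radius) tuple are assumptions. It also makes the all-pairs cost visible, since every combination of two spheres is visited.

```python
from itertools import combinations


def spheres_overlap(c1, r1, c2, r2):
    """Overlap test using squared distances, so no square root is needed."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return dist_sq <= (r1 + r2) ** 2


def sphere_broad_phase(spheres):
    """Return index pairs whose bounding spheres overlap (tests all n(n-1)/2 pairs)."""
    pairs = []
    for (i, (ci, ri)), (j, (cj, rj)) in combinations(enumerate(spheres), 2):
        if spheres_overlap(ci, ri, cj, rj):
            pairs.append((i, j))
    return pairs


# Usage: only the first two spheres are close enough to need narrow-phase testing.
candidates = sphere_broad_phase([((0, 0, 0), 1), ((1.5, 0, 0), 1), ((5, 0, 0), 1)])
```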
2.3.1.2
Oriented Bounding Boxes (OBB)
OBBs represent the orientation of the enclosed object as well as giving a more accurate
approximation of its volume. Oriented bounding boxes benefit from the fact that they are
symmetric, which reduces the number of tests needed when comparing two boxes (15 instead
of a worst case of 156⁵).
2.3.1.3
Axis-Aligned Bounding Boxes (AABB)
AABBs use an efficient “sweep and prune” algorithm along the aligned axes to compute the set
of intersecting pairs. They also take into account temporal and spatial coherence when updating
the bounding boxes. More details of the algorithm can be found in Eberly’s book [17]. AABBs
were previously shown to be slower than OBBs [28], but following work on the improvement of
overlap testing by Gino van den Bergen [6, 45], they have been shown to be a very appropriate
method for deformable bodies and for rigid bodies as well. Most current physics systems implement
AABBs because of their simplicity and effective culling.
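As a rough illustration of the sweep and prune idea, the sketch below sorts boxes by their minimum x coordinate and only compares boxes whose x intervals overlap. It is a simplified, single-axis version written for this section, not the algorithm as given by Eberly: it re-sorts from scratch on every call, whereas implementations that exploit temporal coherence keep the sorted lists between frames. The box representation as (min corner, max corner) is an assumption.

```python
def sweep_and_prune(boxes):
    """Return sorted index pairs of overlapping AABBs.

    boxes is a list of (min_corner, max_corner) tuples of equal dimension.
    """
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][0])
    active = []  # boxes whose x interval may still overlap later boxes
    pairs = []
    for i in order:
        lo_i, hi_i = boxes[i]
        # Prune boxes whose x interval ended before this box begins.
        active = [j for j in active if boxes[j][1][0] >= lo_i[0]]
        for j in active:
            lo_j, hi_j = boxes[j]
            # The x intervals overlap by construction; confirm the other axes.
            if all(lo_i[k] <= hi_j[k] and lo_j[k] <= hi_i[k]
                   for k in range(1, len(lo_i))):
                pairs.append((min(i, j), max(i, j)))
        active.append(i)
    return sorted(pairs)


# Usage: four boxes, of which only boxes 0 and 1 overlap on every axis.
boxes = [((0, 0, 0), (1, 1, 1)), ((0.5, 0.5, 0.5), (2, 2, 2)),
         ((3, 0, 0), (4, 1, 1)), ((0.5, 5, 0), (1.5, 6, 1))]
overlapping = sweep_and_prune(boxes)
```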
2.3.2
Narrow-phase
To find an intersection in the narrow-phase, a method using a separating plane as a “witness”
can be utilised. Baraff describes the technique, in which a plane that separates the two objects
(the witness) is calculated [23]. If no such plane exists then the objects are intersecting. The
cost of calculating this plane is usually negligible, since the small changes in each step of the
physics system mean the plane from the previous step usually still separates the objects. The
contact points can only occur on the separating plane, and hence only the features of each
polyhedron that are coincident with the plane need to be compared. If the objects are penetrating
then the simulator reverses to find the time t at which the contact occurred and then calculates
the contact points at that time (as described for GJK in 2.2.2).
2.3.3
Continuous/Discrete Collision Detection (CCD & DCD)
The underlying principle of discrete collision detection is the concept of a time-step. A time-step
is the period of time the physics world waits before the new state of the world is calculated. This
can be compared to checking a clock. When you first observe the clock you know where the hands
are pointing; you then look away for a period of time (the time-step), and only know the new
position of the hands when you observe the clock again. In physics simulations this period is
usually fixed and is related to the detection of collisions.
5 Figures from Eberly [17].
The method of stepping through the physics simulation and reversing back if a penetration
has occurred is known as discrete collision detection. Discrete collision detection can “miss”
collisions if the time-step is too large (known as the “tunnel effect” because objects can
“tunnel” through other objects). Continuous collision detection works on the basis of calculating
the time of impact (TOI). This is done during the collision detection and can apply to both the
broad and narrow phases. Redon et al. suggest a method for fast continuous collision detection
using OBBs for large polyhedra (tens of thousands of triangles).
CCD is becoming a more popular method because it doesn’t suffer from the tunnel effect. It
is used in games for physics that needs to be reliable, for example, story-dependent interactions.
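The tunnel effect can be reproduced with a one-dimensional toy example. The sketch below is an illustration under simplifying assumptions (a point particle moving at constant velocity towards a thin static wall) and the function names are hypothetical; it is not the method of Redon et al. The discrete test samples the position once per time-step, while the continuous test solves for the time of impact within the step.

```python
def discrete_hits(x0, v, dt, steps, wall_min, wall_max):
    """Sample the position once per time-step and test whether it lies in the wall."""
    x = x0
    for _ in range(steps):
        x += v * dt
        if wall_min <= x <= wall_max:
            return True
    return False


def time_of_impact(x0, v, dt, wall_min):
    """Time within the step at which a point moving at velocity v > 0 reaches wall_min.

    Returns None if the wall is not reached during this time-step.
    """
    if v <= 0 or x0 > wall_min:
        return None
    toi = (wall_min - x0) / v
    return toi if toi <= dt else None


# Usage: a fast particle steps from 0.5 to 10.5 and skips a wall spanning [10.0, 10.1].
missed = not discrete_hits(0.5, 600.0, 1.0 / 60.0, 3, 10.0, 10.1)
toi = time_of_impact(0.5, 600.0, 1.0 / 60.0, 10.0)  # impact part-way through the step
```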
2.4
Collision Response
Collision response is the second part of a full collision system. Once the contact
points have been computed, the constraints between them need to be resolved. This section
deals only with the objects that are in contact; objects not involved in contact do
not require this step and can be ignored. In general, contacts can be separated into four
categories:
• vertex/face
• edge/edge
• vertex/vertex
• vertex/edge
Vertex/vertex and vertex/edge contacts are degenerate and are not usually considered because
they are unlikely to occur. Another assumption is that edge/edge contacts are not collinear.
2.4.1
Resolving Contact
To find out whether bodies are colliding, resting or separating, Baraff suggests we need to
consider the velocities of the contact points [23]. Consider a vertex/face contact at time t0.
The velocity of vertex point a is
\[ \dot{p}_a(t_0) = v_a(t_0) + \omega_a(t_0) \times (p_a(t_0) - x_a(t_0)) \]
and the velocity of vertex point b is
\[ \dot{p}_b(t_0) = v_b(t_0) + \omega_b(t_0) \times (p_b(t_0) - x_b(t_0)) \]
where, in Baraff’s notation, v, ω and x are the linear velocity, angular velocity and
centre-of-mass position of each body. The relative velocity along the contact normal is then:
\[ v_{rel} = \hat{n}(t_0) \cdot (\dot{p}_a(t_0) - \dot{p}_b(t_0)) \]
1. If $\dot{p}_a(t_0) - \dot{p}_b(t_0)$ is in a positive $\hat{n}(t_0)$ direction ($v_{rel} > 0$), the contact is separating.
2. If $\dot{p}_a(t_0) - \dot{p}_b(t_0)$ is in a negative $\hat{n}(t_0)$ direction ($v_{rel} < 0$), the contact is colliding.
3. If $\dot{p}_a(t_0) - \dot{p}_b(t_0)$ is perpendicular to $\hat{n}(t_0)$ ($v_{rel} = 0$), the bodies are resting.
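The three cases above can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, vectors are plain 3-tuples, and a small tolerance (an assumption, since exact zero is rarely seen in floating point) decides when a contact counts as resting.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def point_velocity(v, omega, p, x):
    """Velocity of point p on a body: v + omega x (p - x)."""
    r = tuple(pi - xi for pi, xi in zip(p, x))
    w = cross(omega, r)
    return tuple(vi + wi for vi, wi in zip(v, w))


def classify_contact(pa_dot, pb_dot, n, tol=1e-9):
    """Classify a contact from the relative velocity along the normal n."""
    v_rel = sum(ni * (a - b) for ni, a, b in zip(n, pa_dot, pb_dot))
    if v_rel > tol:
        return "separating"
    if v_rel < -tol:
        return "colliding"
    return "resting"


# Usage: point a moves down onto a static point b, with the normal pointing up.
state = classify_contact((0, 0, -1), (0, 0, 0), (0, 0, 1))  # "colliding"
```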
The forces involved are time dependent, so in general impulses are used. The results are different
for colliding and resting contact because resting contacts need to be balanced by all forces acting
upon them. It is at this stage of collision response that restitution is used. A restitution value
of ε = 1 would result in a completely elastic collision. Once resolved, the objects have new
velocities (or are resting if ε = 0).
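As a minimal sketch of how restitution enters the calculation, the code below resolves a colliding contact between two bodies while ignoring rotation and friction, which is a deliberate simplification of the formulation in Baraff’s notes. The function name, the convention that the normal n points from body b towards body a, and the treatment of both bodies as point masses are assumptions made for illustration.

```python
def resolve_collision(va, vb, ma, mb, n, restitution):
    """Apply a frictionless impulse along the unit normal n (pointing from b to a).

    Rotation is ignored, so each body is treated as a point mass.
    """
    v_rel = sum(ni * (a - b) for ni, a, b in zip(n, va, vb))
    if v_rel >= 0:
        return va, vb  # separating or resting: no collision impulse needed
    # Impulse magnitude chosen so that the new v_rel equals -restitution * old v_rel.
    j = -(1 + restitution) * v_rel / (1 / ma + 1 / mb)
    va_new = tuple(a + j * ni / ma for a, ni in zip(va, n))
    vb_new = tuple(b - j * ni / mb for b, ni in zip(vb, n))
    return va_new, vb_new


# Usage: equal masses meeting head-on swap velocities when the collision is fully elastic.
va, vb = resolve_collision((-1, 0, 0), (1, 0, 0), 1, 1, (1, 0, 0), 1)
```

With a restitution of 0 the same call leaves both bodies with zero relative velocity along the normal.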
Resting contacts are the most difficult problem in Baraff’s dynamics notes [23]. Numerical
methods are required to calculate the impulses of all the objects acting on one another. The
section is too long to expand upon in this paper, but Baraff’s notes explain the process. There
are a number of approaches to resolving the contact impulses, some of which are mentioned in
the physics simulator section of this background (see 2.7.1.4).
2.5
Simulation Loops
The simulation loop is the part of a physics-based program that controls the interactions between
input, updating and output of the simulation. It is arranged as a sequence of tasks that must
be performed for each frame. Input consists of collecting information from hardware such as
mice, keyboards and other controllable devices and converting it to actions that are meaningful
to the simulator: holding down keys could correspond to applying forces to objects in the world.
This information is collected by either polling the device or responding to events. The resulting
actions of the input are carried out in the update stage of the loop. Updating involves gathering
information from the previous frames and combining it with new input to calculate the next
state of the simulator. The first stage of updating is usually to decide what the new state of the
simulator will be, then the second stage is to perform the simulation for the frame resulting in
new positions for objects, camera orientations and interaction resolution, ready for the output
stage. The output stage is the section where results are rendered, including but not limited to,
visual data, audible data, file data, and communication data.
Simulation loops can be grouped by deterministic behaviour. Simulations that focus on
internal interactions are usually less concerned with input from outside the simulation world.
Such an example is “Go Fish!”, a physics-based simulation of a virtual marine world that models
the natural movement of fish along with behavioural and perceptual simulation [42]. The purpose
of the simulator is realistic animation with a minimal amount of scripting. The scripted motions
of a fishing line and camera are used as the basis to show the simulator animating a fish being
attracted by the bait and subsequently caught. This simulation and other similar animation
simulations are focused on filling the gaps between animation frames, a utilisation of physics
simulators. As a result, the computational time of each frame is less crucial because the simulator
doesn’t have to appear responsive to user interaction. If the aim, however, is to provide real-time
animation, then frame generation time is relevant. Other simulators that rely less on
non-deterministic input include predictive simulations of weather systems and particle interactions.
The non-determinism of video games produces some of the most complicated simulation loops.
They must deal with user input, networking, physics, AI, game logic, user interfaces and graphical
outputs at a rate that feels responsive to the user. Architecting “game loops” is a complex task
mainly because of the number of interactions between the different sub components of the game
(see figure 2.3). Each component is optimised to provide the best performance for each of its
tasks every frame.
Figure 2.3: The interactions of a modern game loop. Components shown: Communication, Timer, AI,
Game Logic, Physics, Scene Manager, Scripting, Collision Detection, Frame Manager, Sound Manager,
Animation, Input, Sound Renderer, Renderer.
2.6
Game Loops
Physics is just one, relatively small, subsystem of the game loop. With all the modules working
concurrently and interacting every frame, it is hardly surprising that the time available for
calculations is limited. Despite the many tricks used in physics, graphics and AI to improve frame
performance, there still exist situations where the time taken to perform the desired work is
longer than is acceptable for a consistent frame-rate. Generally it is up to the architect of the
game engine to decide how to deal with this problem. In most cases it manifests as a drop in
frame-rate or less frequent physics updates. There is a point at which this boundary can no longer
be pushed and the simulation has a lower bound on the time needed to produce a single frame.
Looking at game loops helps to understand how physics modules are required to function. Although
game loops are abundant in the games industry, there are few academic works addressing them.
Valente et al. [44] attempted to apply classification to loop design. They categorised loops into:
• Simple Coupled Model
• Synchronised Coupled Model
• Single-Thread Uncoupled Model
• Multi-Thread Uncoupled Model
• Fixed Frequency Uncoupled Model
Coupled models in the classification scheme are those in which update (input, physics) and
presentation (rendering) are dependent on each other. In uncoupled models the two stages run
independently. We will see later that Scatter uses a single-thread uncoupled model. This choice
is mainly due to its simplicity for the purpose of demonstration.
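To illustrate what decoupling update from presentation can look like in a single thread, the sketch below uses a time accumulator so that the update stage always advances in fixed steps regardless of how long each frame took. This is one common pattern, shown under stated assumptions; it is not claimed to be the exact model defined by Valente et al. or the loop used in Scatter. The function name and the use of a pre-recorded list of frame durations (instead of a real clock) are choices made so the example is deterministic.

```python
def run_uncoupled_loop(frame_times, dt, update, render):
    """Single-threaded loop in which updates run in fixed steps of dt.

    frame_times stands in for measured frame durations; a real loop would
    read a clock once per frame instead.
    """
    accumulator = 0.0
    for elapsed in frame_times:
        accumulator += elapsed
        # Run as many fixed updates as the elapsed time allows.
        while accumulator >= dt:
            update(dt)
            accumulator -= dt
        render()


# Usage: three frames of varying length drive four fixed updates of 1/64 s.
updates, renders = [], []
run_uncoupled_loop([3 / 128, 1 / 128, 4 / 128], 1 / 64,
                   lambda step: updates.append(step),
                   lambda: renders.append(1))
```

The number of updates per rendered frame varies with the frame duration, which is the sense in which the two stages are uncoupled here.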
Game loops have traditionally been single-threaded, but more multi-threaded game engines
are being produced to take advantage of the multiple cores of next generation consoles. Producing
multi-threaded loops is a non-trivial problem for the games industry. It is important in this report
to understand simulation and game loops and how they update and render data. I therefore
provide the following references for further information:
• An article on Multi-threaded Game Engine Design by Tulip et al. [43].
• An Intel article on game loops and optimisation [32].
2.7
Real-time Physics Simulator Design
This section of the report focuses on describing a real-time physics simulator. A real-time
physics engine (or physics simulator) is an application or set of libraries that can be called by a
simulation loop to calculate physical interactions and collision detection by simulating a period
of time. The time periods are usually around 1/60th of a second. The methods used in real-time
systems are optimised and, as a result, are a trade-off between speed and accuracy. Understanding
the bottlenecks and where further optimisations can occur requires analysis to determine where
the numerical calculations are performed relative to the physics loop. We will discuss this in
section 3.1.
Erleben’s thesis on Stable Robust and Versatile Dynamics Animations has a detailed description of the modules that make up rigid-body simulators [18]. His thesis looks at the internal
structure of the open dynamics engine (ODE) (see 2.8.2.1) and how it compares to a modular
design. The most relevant section of Erleben’s thesis is the mention of the spatial-temporal
coherence analysis module (STC). This area is of most interest to this project because it is the
section of the engine loop that provides the appropriate data for optimising. The thesis is a
must-read paper because it describes in great detail many of the topics mentioned throughout
this report [18].
2.7.1
Physics Time-step In Detail
The pseudo code in figure (2.4) describes how an internal time-step is calculated in a physics
engine. The pseudo code is based on the Bullet Physics SDK [14]. This particular example is
abstracted to make it a more general description of the sequence. For the benefit of simplicity we
assume the physics engine uses fixed time-stepping (written as “FixedTimestep” in pseudo code).
2.7.1.1
Unconstrained Motion
The time-step starts by applying the equations of unconstrained motion. This refers to
rigid-bodies that are not constrained by connections with other bodies and that are assumed not
to be in contact with other bodies. This step acts as a prediction of the motion of the objects:
if a collision occurs, the velocities after contact will differ from those calculated here, and the
unconstrained movement is disregarded.
The term “Global fields” refers to forces in the world such as wind, water flow and attraction,
for example magnetic. These are forces that act on the natural motion of an object and
can be considered as influencing the object’s path of unconstrained motion. The damping in
the update of linear and angular velocity takes into account the resistive forces of travelling
through volumes.
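A minimal sketch of the velocity prediction is given below. It applies gravity as the only global field and then scales the result by a damping factor. The exponential form of the damping, (1 − damping) raised to the power of the time-step, is an assumption chosen because it behaves consistently for different step sizes; the function name and parameters are hypothetical and angular velocity is omitted.

```python
def predict_velocity(v, gravity, linear_damping, dt):
    """Predict the unconstrained linear velocity after a time-step of dt.

    linear_damping is a fraction in [0, 1); 0 means no damping.
    """
    # Accelerate under gravity (the only global field in this sketch).
    v = tuple(vi + gi * dt for vi, gi in zip(v, gravity))
    # Damp the velocity; the factor approaches 1 as dt shrinks.
    factor = (1.0 - linear_damping) ** dt
    return tuple(vi * factor for vi in v)


# Usage: a body at rest falls for half a second with no damping.
v = predict_velocity((0.0, 0.0, 0.0), (0.0, -10.0, 0.0), 0.0, 0.5)
```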
2.7.1.2
Collision Detection
After calculating unconstrained motion the engine loop needs to determine the objects that
are involved in collisions. This section is an important aspect in reducing the processing time
required by the physics engine. The costly calculations of the narrow-phase can be reduced by
an efficient broad-phase that eliminates non-colliding bounding boxes.
The Bullet User Manual discusses the concept of a mid-phase for further culling and complex
collision shapes like concave triangle meshes. It uses a hierarchical bounding volume structure
with optimised traversing to find the narrow-phase components [14]. In the case of continuous
collision detection, the time of impact is calculated in this step.
2.7.1.3
Non-Contact Constrained Motion
Once the contacting pairs have been identified the non-contact constrained motion can be calculated. This is motion of objects that are constrained by other objects, yet do not collide with
them. An example of this is a rag-doll model. Each limb is constrained by the others, but
the limbs aren’t in contact. Eliminating this group of objects removes them from the collision
response set that is calculated in the next step. The physics engine looks at every constraint
in the world and solves the linear and angular velocities for each respective constraint. The
pseudo code in figure (2.4) shows a few examples, such as a hinge constraint and a point
constraint between objects. The types of constraint and their complexity depend on the
implementation of the engine.
2.7.1.4
Collision Response: Contact Constraints
The response step calculates the impulses acting on the rigid bodies. This can be done
using the sequential impulse-based method described by Mirtich in his PhD thesis [30], which
models interactions between bodies through collisions instead of by computing constraint forces
at contact points.
It is also possible to solve the collision responses using an equivalent LCP. For real-time
physics, an iterative LCP solver such as the “Projected Gauss-Seidel” (PGS) method would be
appropriate [11]. Coumans notes that direct LCP methods such as Dantzig LCP can provide better
quality solutions [15].
2.7.1.5
Integrators
The integrator is the step of the physics engine that actually produces the movement. Once the
velocities are calculated, then the integrator estimates the new position of the object over the
time-step. It does this by solving a series of differential equations of motion.
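The simplest possible integrator advances each position by its resolved velocity multiplied by the time-step (an Euler step). The sketch below shows only this linear part; orientation updates and higher-order schemes are omitted, and the function name is hypothetical rather than taken from any particular engine.

```python
def integrate_position(x, v, dt):
    """Advance position x by velocity v over a time-step dt (one Euler step)."""
    return tuple(xi + vi * dt for xi, vi in zip(x, v))


# Usage: a body moving at 2 units/s along x for half a second.
x_new = integrate_position((0.0, 1.0, 0.0), (2.0, 0.0, 0.0), 0.5)
```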
2.7.2
Modular Design
The previous section described the stages of solving physics as functions. Erleben’s modular
structure breaks physics simulators down into the following four components:
• Time Control Module - Controls when and how all the other modules are invoked in the
simulator. It is also in control of the time-stepping algorithms that decide the frequency of
moving through the simulation.
• Motion Solver Module - Responsible for moving the objects in the simulation. Motion
solvers often predict motion before collision detection, then integrate to calculate the new
positions.
• Constraint Solver Module - This module deals with the forces on objects and resolves them,
passing the output to the motion solver module.
• Collision Solver Module - Computes the impulses and applies them to the colliding pair.
The module is usually concerned with “contact resolution” where the impulses applied to
an object from a collision are resolved. It deals with discontinuous motion and informs the
motion solver module.
Figure 2.4: Pseudo Code: Simplified Physics Time-step
InternalTimestep(FixedTimestep) {
    //FixedTimestep is the amount of time the simulator will step through

    //Unconstrained Motion
    Apply Global Fields
    Apply Gravity
    UpdateLinearVelocity(VolumeDamping, LinearDamping)
    UpdateAngularVelocity(VolumeDamping, AngularDamping)

    //Collision Detection
    //Broad-phase
    UpdateBoundingVolumes()
    CalculateOverlapping()
    //Narrow-phase
    ProcessCollisionDetection(algorithm)

    //Non-Contact Constrained Motion
    For each (constraint) {
        switch (constraint.type) {
            case hinge {
                SolveLinear()
                SolveAngular()
            }
            case point {
                SolveLinear()
                SolveAngular()
            }
            .. etc ..
        } //End switch
    } //End For each

    //Collision Response (For Contact Constraints)
    For each (iteration over contact constraints) {
        For each (contact point) {
            CalculateImpulses(body1, body2, contact point)
        }
    }
    For each (iteration over contact constraints) {
        For each (contact point) {
            CalculateFriction(body1, body2, contact point)
        }
    }

    //Integration
    IntegrateTransforms(FixedTimestep)
} //End InternalTimestep
2.8
Existing Physics Technologies
Many of the topics covered so far have been around since the 1980s. Baraff’s 1989 paper on
Analytical Methods for Dynamic Simulation of Non-penetrating Rigid Bodies [4] has been
referenced by a large number of papers since it was published. The mathematical techniques
date back further, but real-time physics has only become popular following the introduction of
machines that can compute the algorithms sufficiently quickly.
In the games industry this step is most noticeable in the arrival of a series of middleware
technologies for game developers, animators and researchers alike. This section gives a brief
description of the most prominent runners in the race to provide a complete physics solution.
2.8.1
Commercial
2.8.1.1
Havok Physics
Havok is probably the best known physics middleware provider in the games industry. Its
technologies, Havok Physics and Havok FX, have been used in a large number of game titles to date
and extend beyond games, providing solutions for modelling and other applications. Havok Physics
provides collision detection, vehicle dynamics, compact representation of large meshes, data
serialisation and constraint solving. Havok has recently been working with graphics card
manufacturers to provide physics solutions for the GPU. Although best known for physics, Havok
provides a family of products for games.
2.8.1.2
AGEIA PhysX (Formerly NovodeX)
AGEIA PhysX has recently become a prominent name in the games industry because of the
introduction of the physics processing unit (PPU). AGEIA’s approach to improving physics in games
is to utilise the physics hardware to allow developers to add more physics objects to their games.
They intend to bridge the gap between real-time rigid-bodies, deformable-bodies and particles
to provide cloth, liquids and particles that interact with world objects.
2.8.2
Open Source
2.8.2.1
Open Dynamics Engine
The Open Dynamics Engine is an open source library for providing rigid-body dynamics. It is
used in games and simulations because of its stability, features and platform independence. It is
implemented in C/C++ and is used in a wide range of projects, from robot simulators to vehicle
modelling.
The ODE has the following basic features:
• Articulated rigid-body structures.
• Hard contacts (Non-penetrating constraints).
• Collision detection.
• Joint types such as: ball-and-socket, hinge, slider, angular motor, etc.
• Collision primitives: sphere, box, cylinder, plane, ray, triangular mesh.
• Motion equations: Lagrange multiplier velocity based on Trinkle/Stewart and Anitescu/Potra
(Discussed briefly in [18]).
• Choice of time stepping methods (see 2.7.1).
• Contact and friction using Dantzig LCP solver.
The ODE is a very suitable choice for an engine because it has been under development since 2001.
It has the bonus that it is open source, under either the BSD license or the LGPL license.
2.8.2.2
Bullet Physics Library
The Bullet Physics Library is a collision detection and rigid-body dynamics library for games
and animations. It is an open source library under the Zlib license and was created by Erwin
Coumans [14]. It supports the COLLADA API, a standard for describing physics scenes, and
is integrated into Blender3D, an open source graphics creation utility. Written in C/C++, the
SDK can be compiled on various systems. Bullet has some of the following features:
• Continuous and discrete collision detection.
• Vehicle dynamics.
• Uses GJK algorithm.
• Uses impulse-based methods for collision response.
• Uses AABB bounding boxes in the broad-phase.
• Provides support for convex/concave meshes.
• Provides 6 degrees of freedom for constraints: hinge, etc.
• Has a broad, mid and narrow phase to collision detection.
The Bullet Physics Library is very well supported by the community. Many of the authors of
various papers referenced by this report are regular contributors to discussions on the physics
simulation forum. Like ODE, Bullet is another suitable choice for an engine because it is well
documented and uses recent techniques (impulse-based collision response, see 2.7.1.4).
2.9
Level of Detail for Physics
We will now move on to looking at the background for investigating physics improvements. Level
of detail is already commonly used in graphics and gaming. Player models in real-time strategy
games often have varying complexity dependent on the viewing distance from the camera. Techniques for changing the level of detail were described by Martin Reddy at SIGGRAPH 2002 [37].
Other information useful for research into level of detail includes Reddy’s PhD thesis on
Perceptually Modulated Level of Detail for Virtual Environments [36] and the book Level of Detail
for 3D Graphics [22]. Games developers already make choices about the physical level of detail
they simulate. Using too many complex models can cause performance problems, an aspect that this
report aims to address. There have been a few papers on physics level of detail. O’Sullivan and
Dingliana wrote a paper on “Collisions and perception”, an important field for deciding
how a user’s perception of physics can allow for performance improvements using level of detail
[33]. Berka wrote a thesis describing level of detail relating to motion and animation in Virtual
Reality. His work relates directly to work by Reddy and is a good basis for further research into
level of motion detail. Level of detail is discussed further in Chapter 3 of this report.
2.10
Hardware for Physics
In the past hardware manufacturers have produced architectures to suit specific types of calculation. In the race to optimise physics calculations for real-time, hardware solutions have
become of particular interest. They mainly focus on the “brute force” technique of solving the
methods derived from collision detection and response. Many of the required calculations can
be optimised into a single instruction, multiple data (SIMD) form with the final goal of running
the calculations in parallel. The physics community appears to be approaching the problem
from different perspectives. The recent introduction of physics processing units (PPU) has been
spurred on by a demand from the games industry for larger scale, more complex game physics.
The success of the technology is dependent on its efficient utilisation by the developers, algorithm
optimisers and ultimately the ability of the games to provide increased physics complexity while
maintaining real-time response.
At the same time the introduction of shader programs into the GPU pipeline has opened
the door to utilising the power of multiple parallel pipelines. These pipelines allow the same
shader function, designed by the developer, to be performed on each vertex or pixel of the
graphics scene in parallel. This makes the GPU suitable for vector calculations that are
independent of each other. Although multi-core CPUs can also be utilised to achieve parallel
calculations, this project focuses on the principle of a constrained CPU and hence assumes that
both cores are under high load when alternate hardware is used.
The technology described below is a reference for the techniques that have been investigated
regarding “parallel physics processing”. Although these techniques provide a potential solution
to the problem, this report only covers parallelism briefly. See section 3.3.1 for a description
of where hardware could be used.
2.10.1
GPU
2.10.1.1
Pipeline Architecture
Graphics pipelines on GPUs have the job of taking points in 3D space (known as vertices) and
performing the calculations on them to draw the polygons that join them. The GPU finally
calculates how this should be represented on the screen in terms of pixels. The pipeline is a
specialised unit dealing directly with certain graphics techniques such as textures. On most
recent graphics cards (within the past 3-4 years) there are two programmable stages of the
pipeline: the vertex processor and the fragment processor. The details of these are described later on
(see 2.10.1.2 and 2.10.1.3). The pipeline can be visualised as a series of calculations that are applied to
a stream of vertices in order to produce a frame. Owens describes how the graphics pipeline
can be viewed as a programmable stream processor. He discusses how the pipeline can be used for
general-purpose computation on GPUs, regularly referred to as GPGPU. By converting data
into a stream it is possible to use the numerous GPU pipelines to perform parallel computation.
By using the vertex and fragment processors at the relevant steps to manipulate the incoming
data, it is possible to perform specific calculations and finally retrieve the result at the
end.
The latest end-user GPUs on the market have 128 streaming processors⁶, while most recent
GPUs have between 12 and 48 programmable fragment processors and 5 to 8 vertex processors⁷. The
power of these processing units has the potential to be used for the physics calculations described
later (see 2.10.1.2 and 2.10.1.3). More details on GPGPU programming can be found in the
GPGPU section of GPU Gems 2 [21].
2.10.1.2
Vertex Processor
The “vertex processor” (often referred to as “vertex shader”) refers to the unit that performs the
calculations on vertex data. A “vertex shader program” is a program that runs on the vertex
processor and can modify the vertex data. Dependent on the architecture of the graphics card,
the processor may be able to retrieve texture data at this stage in the pipeline that can be
written to by the fragment processor later on. This allows vertex shader programs to work on
data passed from previous calculations, potentially useful for result dependent operations.
The vertex processor is used to group vertices and perform operations that can cull and clip
groups. This is of particular interest to physics calculations because it can be used to reduce the
final calculated data set that certain time consuming operations work on and hopefully speed up
the process. The GeForce 6 series architecture could perform operations at 32-bit floating point
precision, so there is still a high level of accuracy available for physics calculations [20]. After
this stage the results are passed to the rasterizer (see [20] for more details), which prepares the
data for the fragment processor.
2.10.1.3
Fragment Processor
The “fragment processor” (often referred to as “pixel shader”) is the unit that takes the fragment
data from the rasterizer and performs calculations on it. In graphics, fragments can be thought of
as “potential pixels” that undergo a series of tests by the fragment processor ultimately resulting
in them becoming the pixel data for a particular pixel on screen [20]. This allows “pixel shader
programs”, the programs that perform such tests, to repeat the same functions on multiple
fragments. Once again, fragment processors are able to access texture data (seen as the memory of
GPGPU computation) and perform hundreds of calculations in SIMD concurrently (one fragment
per fragment processor). In terms of physics this would be useful for ray-casting techniques that
perform operations on a per pixel basis.
6 Data courtesy of Tom’s Hardware [34]
7 Data courtesy of Tom’s Hardware [29]
2.10.1.4
Geometry Shader
Following a new generation of graphics hardware, the pipelines that run on the hardware have
been updated during the development process to match. Direct3D, a Microsoft standard graphics
API, is commonly used in many graphics applications to specify what the hardware should render.
The latest specification, Direct3D 10, has a number of features that are too detailed to discuss in
this report⁸. Instead we will focus on the important introduction of the Geometry Shader. This
is a new part of the pipeline that sits between the vertex shader and the rasterizer and allows
programmable operations on the vertex data from the vertex shader. The important
aspects that are now enabled are the abilities to calculate plane equations of grouped vertices
and also to locate the adjacent vertices of the group. This has a potentially useful application in
physics since calculating equations of planes and comparing vertices quickly is utilisable for the
narrow-phase of collision detection (2.3.2). The intent of this section is to highlight the relevance
of this new ability, but due to the scope of the project it may not be possible to investigate
geometry shaders further.
2.10.1.5
CUDA
During the progression of the project, NVIDIA released its new framework, CUDA, for performing
calculations on the GPU. CUDA takes advantage of the architecture of the 8 series GeForce cards
to provide a higher level of GPU abstraction. We can speculate that physics implementations
on CUDA are already underway so highlighting this development is important in the context of
this report.
2.10.1.6
Appropriate Physics Utilisations
At this point it is probably important to comment on Shader Model 3.0. Shader models are
standards that define the capability of graphics hardware. Shader Model 3.0 is important because
it has the following features:
• Multiple Render Targets - The output of the fragment shader can be sent to four render
targets. This is especially useful for particle physics where location and velocities can be
calculated simultaneously.
• Vertex Texturing - The vertex processor is able to read texture data. This means that
calculations performed on previous data can be passed back to future calculations before
the culling process. In terms of physics, operations that iterate can be performed and
tested using this feature.
The Havok technologies described in section (2.8.1.1) set a minimum standard on Shader Model
3.0 for use on GPUs. It is possible that these features of Shader Model 3.0 are the reason for
setting a standard for their software. It is now more likely that developers will use the new
architecture in the latest GeForce cards as the standard for providing physics.
Much work has been done on GPUs in the area of real-time physics. For his Masters project,
David Knott at the University of British Columbia discussed two methods of using the GPU
pipeline to perform collision interference on hardware [25]. The techniques, particle detection
and ray-casting, were implemented on the vertex and fragment shader respectively.
8 Details can be found in the following paper [8]
2.10.1.7
Limitations
The most noticeable limitation of the GPU hardware is the cost of reading results back from the GPU
to the CPU after the calculations. Some approaches to utilising GPUs account for this by
implementing most of their code on the GPU.
In order to obtain data from the GPU, the output of the fragment processor must be rendered
to texture. An operation can then be performed to read from the texture memory. The problem
is that when the operation occurs the GPU pipeline is flushed, so this can only be done at the
end of the calculation. This poses a problem for real-time feedback. Knott mentions the specifics
of the problem in his paper [25].
Another limitation that needs to be considered is the round-trip time of calculations sent to
the GPU. Buck explains that the time taken to prepare the CPU data, send it to the GPU,
process the data and retrieve it can often be longer than the time taken to perform the same
calculation on the CPU [9]. He suggests that in order to make the method effective the number of
operations performed on the GPU must be large enough to account for the cost of sending and reading
back. We can speculate that this limitation will be overcome in the new architecture, therefore
making batch processing of generic data easier.
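The round-trip argument can be illustrated with a simple cost model. The sketch below is only an illustration under stated assumptions: the fixed overhead, the per-operation costs and all function names are hypothetical and are not taken from Buck's work. It assumes a constant overhead for preparing, sending and reading back data, and a constant cost per operation on each processor.

```python
import math


def gpu_round_trip_time(n_ops, overhead, gpu_cost_per_op):
    """Total GPU time: fixed prepare/send/read-back overhead plus processing."""
    return overhead + n_ops * gpu_cost_per_op


def cpu_time(n_ops, cpu_cost_per_op):
    """Total CPU time for the same batch of operations."""
    return n_ops * cpu_cost_per_op


def break_even_ops(overhead, cpu_cost_per_op, gpu_cost_per_op):
    """Smallest batch size at which the GPU is no slower than the CPU.

    Returns None when the GPU is not faster per operation, in which case
    the overhead can never be recovered (an assumption of this model).
    """
    saving_per_op = cpu_cost_per_op - gpu_cost_per_op
    if saving_per_op <= 0:
        return None
    return math.ceil(overhead / saving_per_op)
```

Under this model, batches smaller than the break-even size are cheaper to leave on the CPU, which is the reason for sending work to the GPU in large batches.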
2.10.2
PPU
2.10.2.1
Architecture
The hardware implementation of the PPU has been pioneered by AGEIA (2.8.1.2). Not a large
amount is known about the technology other than its ability to perform large amounts of physics
calculations via the AGEIA PhysX System. AGEIA claim that the unit has 2 terabits per second
of internal read/write memory bandwidth, which is higher than the equivalent GPU bandwidth.
The other details they have revealed in their white paper are that the hardware has multiple cores,
a physics-specific core architecture and accessible memory [1]. It is quite likely that this processing
hardware can perform many parallel operations.
2.10.2.2
Limitations
The main limitation of using such a piece of hardware is that the internal architecture is not very
well known and tailoring a physics system to utilise it is a problem. The API to the hardware is
provided by the PhysX system itself, so it is specific to the PhysX standard.
The current hardware produced by manufacturers Asus, Dell and BFG works through the
PCI bus, not the current PCI Express ports. This appears to be a major limitation in terms
of throughput; however it is possible that in the future the hardware can use the PCI Express
interface and theoretically increase the bandwidth.
2.10.2.3
Appropriate Physics Utilisations
The PhysX system has been demonstrated providing a range of physics improvements from
rigid-bodies to particles. The general understanding is that it can be used for effects systems
of explosions, flying objects and liquids that interact in real-time with aspects of the game.
The PhysX PPU has been utilised by physics-based games. Cell Factor allows players to use
telekinetic abilities to pick up and throw objects at other players. The demanding physics in the
game requires the use of the processor to perform the actions of collecting hundreds of objects
together in a single location and moving them around. This is one direction gameplay physics may
take, the extent of which could provide even more physics-based games.
2.11
Report Terminology
For the duration of this report I will use the following terminology:
Geometric Object or Collision Shape The shape of the rigid body being represented in the
dynamics world. This can be the shape of the rigid body or a group of Geometric Objects.
They have geometric properties such as size, orientation and position, but no dynamic
properties such as velocity or mass.
Space Object A particular type of geometric object that represents the volume of a rigid body.
A space object doesn’t have to correspond with the geometric representation of a rigid
body but space objects usually encapsulate the rigid body. An example of a space object is
a bounding box.
Rigid Body An object in the physics world that has dynamic properties such as mass and
velocity. Rigid bodies cannot succumb to deformation.
Bounding Volume A space object that entirely encapsulates a rigid body. They can be any
shape, but are usually spheres, oriented bounding boxes or axis-aligned bounding boxes.
Stacking The action of placing many dynamic objects on top of each other. Stacking usually
refers to piles of objects but can also refer to pairs of objects, usually in resting contact.
Stacking is used in the context of stability of physics. If a simulator can successfully stack
objects without small “jittery” movements the simulator is relatively stable.
Collision Pairs A collision pair is the pair of objects A and B that the broadphase has decided
require further intersection testing.
Chapter 3
Investigation
“The real goal of physics is to come up with an equation that could explain the
universe but still be small enough to fit on a T-shirt”
Leon Lederman
3.1
Physics Engine Analysis
The process of constructing a physics engine is a long and complex task. It requires a good
understanding of vector mathematics, Newtonian laws and linear programming. Constructing
even a simple engine would have taken the entire duration of the project and even then only been
able to simulate a small subset of object types. In comparison, current rigid-body simulators can
simulate complex meshes, constrained objects and vehicle dynamics. Using an existing dynamics
engine was the logical choice for demonstrating a solution to sporadic cases. I chose to use an
existing simulator for the following reasons:
• To avoid the steep learning curve of creating a physics engine from scratch.
• Large set of existing functionality - Physics simulators already contain many different types
of collision primitives.
• Allow quick implementation of my proposed solution.
• To approach the problem from the perspective of a developer or researcher.
• To analyse a specific implementation related to the theory.
In this section we will compare our case study physics engine, Bullet, to the modular design
described by Erleben with the goal of identifying the cause of the collision overload problem
[18]. Bullet is a suitable choice for analysing because it is open source and an active project
with plenty of support from the “Physics Simulation Forum”[16]. Analysing Bullet is a task of
identifying structure, techniques used and relating it to theory from the books and research. The
reason for taking this approach is that an implementation uncovers grey areas in the research and
addresses them (otherwise it wouldn’t work!). Two well designed techniques that work efficiently
separately, but together diminish performance, are the kind of problem we are looking for. We will
revisit Bullet for the implementation of Scatter in Chapter 4.
[Figure 3.1. Panel titles: “Modular Design” and “Bullet Modular Design”. Labels in the diagram: Physics Simulation Loop, btDiscreteDynamicsWorld, Collision Detection, Collision Solver, Motion Solver, Constraint Solver, Time Control, Broadphase, Midphase, Narrowphase, Contact Determination, STC Analysis, Discontinuity Signal, Penetration Signal, AABBs, OptimiseBVH, GJK, SimIslands.]
Figure 3.1: The modular design concept by Erleben (a), applied to Bullet (b)
3.1.1
Case Study: Bullet Physics Library
Bullet is becoming one of the more well known rigid-body physics engines available. Research in
2006 by Seugling and Rolin evaluated a range of physics simulators and gave Bullet an average
score of 2.6 out of a maximum 5.0 [40]. In less than a year Bullet has increased its feature set and
now includes support for Collada (a standard for specifying dynamics), vehicles and
its own implementation of a sequential impulse-based solver. The reason why we will look at
Bullet as a case study is that it is a research tool that features many of the latest techniques
from the physics developers community. At the outset of the project, it was likely that whatever
solution was used would require the physics engine to be modified to incorporate it, which is why this
case study looks at the internal workings.
3.1.1.1
Modular Design
The design of Bullet is intended to be modular and not optimised for performance. This allows
the implementation of generic solvers that can be included when setting up Bullet. Bullet can
be split into two main areas:
Dynamics (btDynamicsWorld) - Deals with rigid bodies, solving motion, constraints and contact.
Collision Detection (btCollisionWorld) - Deals with the collisions, selection of algorithms and
can form a subsection of the dynamics.
For this subsection we will look at btDiscreteDynamicsWorld to analyse the simulation process.
Figure 3.1 shows how Bullet conceptually fits the modular design model by Erleben. The actual
implementation is mainly contained within the btDiscreteDynamicsWorld class. Each module can
be thought of as a set of function calls that solve a problem then return the result. Many
of the algorithms used by Bullet are hidden within the structure; for example, simulation islands
(described in subsection 3.1.1.2) are contained within btDiscreteDynamicsWorld and manipulate
the persistent data independently of other modules. The best place to intercept data is
the NearCallback function. By specifying “MyNearCallback” as an argument for the
CollisionDispatcher, developers are able to catch the simulator at the point where two objects
overlap their bounding volumes. This is utilised in Scatter and helps provide the information to
trigger the model switching (see section 4.5.2).
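The interception pattern can be sketched independently of Bullet. The following is a minimal illustration in Python, not Bullet's C++ API: the class `ToyDispatcher`, its methods and the callback signature are all hypothetical stand-ins. It shows only the general idea of a dispatcher handing each overlapping pair to a user-supplied callback before narrowphase testing.

```python
class ToyDispatcher:
    """Hypothetical stand-in for a collision dispatcher (not Bullet's class)."""

    def __init__(self, narrowphase):
        # narrowphase: function called with a pair that needs exact testing
        self._narrowphase = narrowphase
        self._near_callback = ToyDispatcher.default_near_callback

    @staticmethod
    def default_near_callback(pair, dispatcher):
        # Default behaviour: pass every overlapping pair straight on.
        dispatcher.run_narrowphase(pair)

    def set_near_callback(self, callback):
        # Install a user callback that sees each pair first.
        self._near_callback = callback

    def run_narrowphase(self, pair):
        self._narrowphase(pair)

    def dispatch_all(self, overlapping_pairs):
        # Called once per step with the pairs reported by the broadphase.
        for pair in overlapping_pairs:
            self._near_callback(pair, self)


def make_recording_callback(log):
    """Build a callback that records each pair, then defers to the default."""
    def my_near_callback(pair, dispatcher):
        log.append(pair)  # the point where a model switch could be triggered
        ToyDispatcher.default_near_callback(pair, dispatcher)
    return my_near_callback
```

A callback of this kind sees every pair whose bounding volumes overlap, so it can count or inspect pairs without changing what the narrowphase eventually tests.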
3.1.1.2
Algorithms
Bullet has a number of built-in implementations described below that form the basis of the
working application. It also uses a few techniques to further arrange data and find tests that
don’t need to be performed. The following is a description of the algorithms used and why they
are beneficial to Bullet and simulators in general.
• btSequentialImpulseSolver - An implementation of Mirtich’s impulse-based contact resolution technique, recognised as a suitable approach for real-time simulation. This is the
main constraint solver used, but by implementing the btConstraintSolver interface it is
possible to use other solvers.
• btGjkEpaSolver and btGjkPairDetector - An implementation of the GJK EPA algorithm and GJK pair detection (see section 2.2.2). These are the default methods used for
convex collision detection in the narrowphase. There is also the option of using SAT (Separating Axis Theorem) for convex objects. Either of these algorithms would be suitable for
narrowphase testing and the choice is down to context and preference.
• btSimulationIslandManager - Bullet uses the concept of simulation islands, a technique
to group bodies based on their constraints and activation state (whether or not an object
is sleeping). It uses a union-find technique to efficiently group bodies. It is possible that
simulation islands could be used to partition the work of the simulator. The idea is mentioned
briefly in subsection 3.3.1, but is not an area of focus in this report.
• btCompoundCollisionAlgorithm - One of a number of classes that implement the btCollisionAlgorithm interface. The difference is that the compound collision algorithm uses
OptimizedBVH (implemented using AABB trees, see section 2.3.1.3), which Bullet refers to
as the “Midphase”, to reduce the number of tests to be performed by the narrowphase on
compound objects. The presence of this in the implementation helped to form the concept
of the solution in section 3.4. Without compound collisions, producing the implementation
of Encapsulation Levels in Scatter would have been difficult.
• btPersistentManifolds - Bullet uses the idea of “Manifolds” to cache contact points
between frames. Manifolds are effectively created every time a new collision pair is tested
with a certain algorithm (btCompoundCollisionAlgorithm for example). They also contain
an implementation of collision filtering, reducing the number of contacts to only four. This
helps stability and reduces calculations. Quantifying manifolds is difficult because they are
continually created and cached, making it difficult to see which manifolds are in use.
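The grouping performed by simulation islands can be sketched with a standard union-find structure. This is a generic illustration, not Bullet's implementation: the function `build_islands`, its arguments and the treatment of inactive bodies (they are simply left out) are assumptions made for the example.

```python
class UnionFind:
    """Disjoint-set forest with union by size and path halving."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree to the larger
        self.size[ra] += self.size[rb]


def build_islands(num_bodies, links, inactive=frozenset()):
    """Group body indices into islands.

    links: pairs of body indices joined by a contact or a constraint.
    inactive: bodies to leave out (assumption: e.g. sleeping bodies).
    """
    uf = UnionFind(num_bodies)
    for a, b in links:
        if a in inactive or b in inactive:
            continue
        uf.union(a, b)
    islands = {}
    for body in range(num_bodies):
        if body not in inactive:
            islands.setdefault(uf.find(body), []).append(body)
    return sorted(islands.values())
```

Each island is independent of the others, which is why islands are a natural unit for partitioning the work of a simulator.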
3.1.1.3
Improved Performance
By running Scatter with Bullet integrated it is possible to step through the execution of the
physics. Comparing the state of the world at the specified frame to the internal work done in
the engine revealed the following performance improvements.
• Static and Kinematic objects are excluded from motion calculations - Culling
of static objects and kinematic objects happens in many stages of the simulator. Static
objects don’t move and hence don’t require updating; kinematic objects (objects which
are moved by the user and don’t respond to collisions, but only cause them) are likewise excluded.
• Short iterations for numerical solvers - Bullet’s sequential impulse solver uses 10
iterations by default to solve the contact and friction models.
• Midphase - Bullet uses what it refers to as a “Midphase” to improve the stage between
broadphase and narrowphase. The algorithm is only used for compounds and the implementation actually uses AABB trees. See van den Bergen for further information on AABB
trees [45].
3.1.2
Bottlenecks of Physics Simulators
Physics simulators are designed to fit a purpose and in that sense identifying bottlenecks is
related to the simulator design. A good technique can be badly implemented at the cost of
performance: for example, the sweep and prune method is a simple concept, but a non-trivial
problem to implement. Choosing a data structure to store the endpoints that can perform fast
swapping of sequential data is beneficial for the insertion sort approach used by Baraff. The
established techniques in simulators are those that are repeatedly used in different simulator
types: OBBs, AABBs, impulse-based contact resolution and LCP solvers to name a few.
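The sweep and prune idea can be illustrated on a single axis. The sketch below is a simplified, assumption-laden example and not the implementation used by Baraff or by Bullet: the class name and interface are hypothetical, endpoints are kept in a plain Python list, and a full implementation would repeat this on three axes and report only pairs that overlap on all of them.

```python
class SweepAndPrune1D:
    """Keeps interval endpoints sorted on one axis using insertion sort.

    Each swap of neighbouring endpoints that belong to different objects
    starts or ends an overlap, so the overlap set is updated incrementally.
    """

    def __init__(self, intervals):
        # intervals: dict mapping object id -> (min, max) on this axis
        self.intervals = dict(intervals)
        self.endpoints = []  # entries are (object id, is_max)
        for obj in self.intervals:
            self.endpoints.append((obj, False))
            self.endpoints.append((obj, True))
        self.overlaps = set()
        self._insertion_sort()

    def _value(self, endpoint):
        obj, is_max = endpoint
        return self.intervals[obj][1 if is_max else 0]

    def _insertion_sort(self):
        eps = self.endpoints
        for i in range(1, len(eps)):
            j = i
            while j > 0 and self._value(eps[j - 1]) > self._value(eps[j]):
                left, right = eps[j - 1], eps[j]
                if left[0] != right[0]:
                    pair = frozenset((left[0], right[0]))
                    if left[1] and not right[1]:
                        # a min moves below another object's max: overlap starts
                        self.overlaps.add(pair)
                    elif right[1] and not left[1]:
                        # a max moves below another object's min: overlap ends
                        self.overlaps.discard(pair)
                eps[j - 1], eps[j] = right, left
                j -= 1

    def move(self, obj, lo, hi):
        """Update one interval and restore sorted order."""
        self.intervals[obj] = (lo, hi)
        self._insertion_sort()

    def pairs(self):
        return sorted(tuple(sorted(p)) for p in self.overlaps)
```

When objects move only a little between frames the list is already nearly sorted, so few swaps are needed; this is the frame coherence that makes insertion sort attractive here, and it is also why the choice of endpoint container matters.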
3.1.2.1
Full Narrowphase Intersection Testing
The main bottleneck is in collision detection, specifically narrowphase intersection testing.
For convex polyhedra we need to find a separating plane, a pair of contact points and in some
simulators penetration depth. Some algorithms like GJK use distance to do this and can solve
penetration depth in the process. The complexity arises in the form of a search, whether it
is looking for half spaces (to determine if a point is inside another object) or solving Linear
Programming (LP) problems. The cost of performing these tests is such that physics engines
will attempt to avoid doing the calculations wherever possible. Simulators must test all objects
against each other because every frame the world changes with the potential for any object to
collide with any other. From a cold start (when there is no prior knowledge to help the test) this
step is undesirable, but using frame coherent data we can improve the time complexity. Simple
spatial analysis tells us that all objects are not likely to be in the same space (partitioning is a
potential area of performance improvement), so broadphase culling is used to eliminate collision
pairs that don’t overlap.
Bottlenecks appear to occur when narrowphase must be performed with:
• Little help from broadphase (most objects overlap)
• Low frame coherence (when separating planes or closest points must frequently be recalculated: rotating objects)
• Algorithm complexity of a collision pair is high (see subsection 3.1.2.5)
3.1.2.2
Unused Calculation Results
Frequently discarding calculation results is not a clear bottleneck, but could potentially
degrade performance if not identified in an implementation. The idea is that, because of the uncertainty of physics, attempts are made to predict motions and collisions which are later ignored or
recalculated. Bullet makes a prediction of motion before entering the collision detection stage.
This is to the benefit of all the objects that don’t collide or have constraints, since once their final
position is predicted there needn’t be further calculations. There is a degree of uncertainty as to
whether any objects will collide so predicting motion is a logical step. The best improvement
that could be made to this area is probabilistic analysis to decide whether to predict or allow for
the cost of performing calculations when needed.
3.1.2.3
Maintaining Structures
Techniques that use complex structures to quickly find information always have the overhead
of structural maintenance. Regeneration of structures is often slow and must be performed
as infrequently as possible. AABBs are an example of such a structure that must
be regenerated every frame. The implementation relies on the efficiency of AABB regeneration,
which is faster than OBB regeneration [45]. Inefficient maintenance can be a potential bottleneck.
3.1.2.4
Excessive Contact Points
The more contact points between two objects, the more work that needs to be
done in contact resolution after collision detection. It may seem obvious that this is a bottleneck in the contact
resolution stage, but often a small number of contact points will suffice. Large numbers of
contact points on surface collisions cause instability. Consider the following situation:
“Two objects are touching via a face-to-face collision. Selecting a set of contact points
could change drastically between frames, because there are so many potential points
of contact. Simulators work on the principle of resolving contact. Uneven sets of
points cause models to move then the effect is amplified as contact occurs elsewhere.”
The result is oscillations and instability. Bullet uses “manifolds” that coordinate the contact
points and reduce them to 4 points to resolve the problem. Whether excessive contact points
is a bottleneck depends on implementation, but a good description of techniques by
Moravanszky et al. in Game Programming Gems 4 is ensuring that developers are aware of the
problem [31].
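One way to picture contact reduction is a greedy selection that keeps the deepest contact and then adds the points that are most spread out. The sketch below is a hypothetical heuristic written for illustration only; it is not the method used by Bullet's manifolds nor the technique described by Moravanszky et al., and the function name and data layout are assumptions.

```python
import math


def reduce_contacts(contacts, max_points=4):
    """Reduce a list of contacts to at most max_points.

    contacts: list of (position, depth) where position is an (x, y, z)
    tuple and a larger depth means deeper penetration.
    Heuristic (assumption): keep the deepest point, then repeatedly add
    the point whose nearest kept point is furthest away.
    """
    if len(contacts) <= max_points:
        return list(contacts)
    remaining = list(contacts)
    deepest = max(remaining, key=lambda c: c[1])
    kept = [deepest]
    remaining.remove(deepest)
    while len(kept) < max_points:
        best = max(
            remaining,
            key=lambda c: min(math.dist(c[0], k[0]) for k in kept),
        )
        kept.append(best)
        remaining.remove(best)
    return kept
```

For a face-to-face contact sampled on a grid, a selection like this tends to keep the corners of the contact area, which gives the solver a stable and small support region.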
3.1.2.5
Complex Intersection Algorithms
On the subject of collision detection, the preference is to calculate interactions between convex
objects. For meshes we usually attempt to enclose them in convex hulls or convex primitives
that are simpler to test. Create Dynamics by John Ratcliff, which uses convex decomposition, is an
example of such a library. The models used in Scatter have been created using the decomposition
techniques. Concave object collisions are usually avoided in favour of compounds of convex
objects. A compound is an object composed of various primitives in any structure. Although it
may seem to be just a subset of primitives, compounds have the added bonus that they don’t
require inter-primitive testing of any of their shapes. Work by Guendelman et al. involving
non-convex rigid-bodies with stacking using signed distance functions and triangulated surfaces
could be developed into a feasible technique for real-time, but for now complex concave objects
are avoided.
3.1.2.6
Generally Avoiding Bottlenecks
It is apparent that some of the methods of avoiding bottlenecks are purely “good performance
programming techniques”: using data structures that are fit for purpose, and only doing the work
required unless the cost of doing it later as requested is too high. The areas that I have identified
can sometimes be attributed to a trade-off made by the designer. Accuracy is traded for speed
in the number of iterations performed by numerical solvers. The hot topic in algorithms seems to be
reusing data between frames. Stepping through frames of a simulator and using support planes
and cached points to “warm start” numerical solvers is one such example. Using performance
data structures for file output is just one example used in Scatter.
3.2
The Collision Detection Bottleneck
To recap, the collision overload problem occurred when a collection of objects collided at a
single point. The ideal method of identifying the cause would be to recreate the scenario in the
original game and profile the game software to analyse how much time is spent performing which
functions. The problem with this approach is that it requires access to the code structure of the
game and also a license to use it. In most cases this would be difficult to do and the outcome
would likely focus on the implementation of a specific physics simulator. This section views the
problem in relation to general physics engine algorithms.
From the research I established two probable causes of the drop in performance:
1. A sudden overlapping of broadphase bounding objects, requiring an increase in narrowphase
collision detection.
2. A method for breakable objects that results in a sudden increase in complexity.
To understand what the objects are doing when colliding we will start by looking at the broadphase. In the background we established that AABBs are the most frequently used form of
broadphase so I will refer to the case where AABBs are being used. Since broadphase collision
detection of $n$ AABBs is usually of time complexity $O(n \log^2 n + k)$ ¹ (where $k$ is the number of
overlapping pairs), we can assume that in the motivation each collision object in the world
uses a single AABB. If every object is close enough we will assume for the sake of the problem
that every AABB is overlapping, in which case $k = \frac{n(n-1)}{2}$ (the number of collision pairs for $n$ objects).
This is therefore a time complexity of $O(n(\log^2 n + n))$. This is probably a rare case since not
all $n$ objects are likely to overlap, but it is possible for a large number to do so. Physics
researchers recognise that the time to perform broadphase calculations is of lower significance
compared to performing narrowphase calculations. The significance of a large value for $k$ is
that even after the AABBs are calculated, $k$ pairs will still require narrowphase collision testing.
This means that the time complexity for an intersection query (assuming the use of AABBs)
is $O(\text{Broadphase}) + \dots + O(i\text{th phase}) + \dots + O(\text{Narrowphase})$, where the $i$th phase is any
additional phase between broad and narrow (Bullet has a midphase described in section 3.1.1.3).
The average time to compute an intersection query is described by van den Bergen [46]. For
a sequence of intersection tests $S_1, \dots, S_n$, let $f_i$ be the event that $S_i$ fails and $C_i$ be the average
time to perform $S_i$. The average time to perform an intersection query is:
\[ T_{avg} = \sum_{i=1}^{n} P[f_1 \dots f_{i-1}] \, C_i \]
where $P[f_1 \dots f_{i-1}]$ is the probability that tests $S_1, \dots, S_{i-1}$ all fail.
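The formula can be evaluated directly once the costs and failure probabilities are known. The sketch below is a small illustration with made-up numbers: the function name is hypothetical, and it assumes the input is given as conditional probabilities, that is, the probability that each test fails given that all earlier tests failed, so that their running product gives $P[f_1 \dots f_{i-1}]$.

```python
def average_query_time(costs, conditional_fail_probs):
    """Evaluate T_avg = sum_i P[f_1 ... f_{i-1}] * C_i.

    costs[i] is the average cost C_i of test S_i.
    conditional_fail_probs[i] is P[f_i | f_1 ... f_{i-1}] (assumed input
    form); the probability of reaching test S_i is the product of the
    earlier entries, and is 1 for the first test.
    """
    reach_probability = 1.0
    total = 0.0
    for cost, fail_prob in zip(costs, conditional_fail_probs):
        total += reach_probability * cost
        reach_probability *= fail_prob
    return total
```

With illustrative costs of 1, 4 and 16 for broadphase, midphase and narrowphase, a scene where each cheap test fails half of the time gives an average of 7, whereas a scene where the cheap tests always fail (every pair overlaps) gives 21, the full cost of all tests.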
The aim is to form a series of tests for which $C_i$, and the probability of a test failing under the
condition of the former tests having failed, are small. Both $P$ and $C_i$ can be estimated
using profiling tools. Relating back to the time complexity, if in the motivation a large proportion
of the objects have to take all intersection tests to find the result, the total query time of
an object increases. Since the efficiency of collision detection relies at the very least on the speed
of the broadphase, we can see that the overlapping of the AABBs of many objects will have a big impact
on time if the next phase is not efficient. Given a simple broadphase object and a complex
narrowphase it is likely that the number of surfaces that need to be tested will increase. Simple
overlap has the potential to cause the collision overload if the objects are complex enough. The
same effect is visible in Scatter, which uses Bullet, and is apparent in the results.
1 Using the output-sensitive algorithm presented by Six and Wood [41]
Another probable cause is related to techniques of handling breakable objects. Decomposing
a single mesh object into a collection of separate sub-meshes and physics objects will inevitably
increase the number of objects in the world, $n$. It is common in video games to break larger
objects into smaller objects; this is usually done by having a selection of “gibs” that models
break into on destruction [12]. If the models are physics objects the set of gibs can be specified
by the developer and can be a collection of smaller objects or fitting pieces. These objects
are traditionally precomputed, but it is becoming more common to use real-time deformable or
breakable models².
Figure 3.2: Breakable objects dividing into constituent elements.
Figure 3.2 shows the steps usually taken when dealing with breakable objects at the time of
destruction. First, the mesh is removed and replaced with the mesh of the gibs, with the
appropriate transform from the centre of the original mesh to the correct position and orientation.
The individual gibs are wrapped by the appropriate physics primitives. For complex gibs, these
can be precomputed. Left bunched together, it is likely that the gibs would intersect and collide
then explode in all directions in the contact resolution phase. This situation would be undesirable
for objects that need to appear to crack or fall apart. At this stage the gibs are therefore separated
far enough apart that they don’t immediately disperse. At the end of breaking a model, there
are more physics objects than before the break. Complex models can have any number of gibs,
but this is usually limited to avoid these situations. Assume that the original mesh has kj physics
primitives representing it. Now assume that it has g mesh gibs each of which is represented by
² For further reading see Real Matter for deforming soft bodies [38] and work by Bao et al. on fracturing of rigid materials [3].
$p_{ij}$ physics primitives, where $p_{ij}$ is at least one. The number of new primitives in the world per breakable is:
\[
\mathrm{Obj}_j(\mathrm{new}) = \sum_{i=1}^{g} p_{ij} - k_j \quad \text{and} \quad \mathrm{Obj}_j(\mathrm{before}) = k_j
\]
The reason we take the $k_j$ existing primitives into account is that we want to observe how many are added on breaking. In the case of Figure 3.2, the original jug has three primitive objects: two boxes and a sphere. If b is the number of breakable objects in the motivating collision and r is the number of regular objects (objects other than breakables, including static objects etc.), then the total number of physics primitives involved in the narrowphase after breaking is:
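\[
\begin{aligned}
np_{\mathrm{after}} &= \sum_{j=1}^{b} \Big( \mathrm{Obj}_j(\mathrm{new}) + \mathrm{Obj}_j(\mathrm{before}) \Big) + \sum_{l=1}^{r} k_l \\
&= \sum_{j=1}^{b} \Big( \sum_{i=1}^{g} p_{ij} - k_j + k_j \Big) + \sum_{l=1}^{r} k_l \\
&= \sum_{j=1}^{b} \sum_{i=1}^{g} p_{ij} + \sum_{l=1}^{r} k_l
\end{aligned}
\]
where
\[
np_{\mathrm{before}} = \sum_{j=1}^{b} k_j + \sum_{l=1}^{r} k_l \quad \text{and} \quad n = b + r
\]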
The worst case scenario would be when the maximum number of gibs that any breakable splits into is $g_{max}$, each gib is a single primitive ($p = 1$), the maximum detail of an original breakable mesh is $k_{max}$ and every object in the world that collides is this kind of object, hence $r = 0$ and $np_{before} = n \times k_{max}$. We can show that $np_{after} = n \times g_{max}$, so the number of primitives involved in the narrowphase increases in total by a factor of $g_{max}/k_{max}$. From Figure 3.2, we can see that the original mesh was 3 primitives and the resulting gibs are 7 primitives, so the factor increase would be $7/3$, or $2.\dot{3}$. The factor is constant and will more than double the number of primitives when colliding. Comparing all narrowphase primitives would then require the testing of $np(np-1)/2$ pairs, so we gain the approximate increase in pairs $pairs_{after} \approx pairs_{before} \times (g_{max}/k_{max})^2$.
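As a rough numerical check of the counts above, the following sketch evaluates the primitive totals and the pair counts for a hypothetical scene of ten jugs like the one in Figure 3.2. The function names and the scene itself are assumptions made for illustration and are not part of Scatter.

```python
def narrowphase_primitives(breakables, regular):
    """Return (np_before, np_after).

    breakables: list of (k_j, [p_1j, ..., p_gj]) tuples, one per breakable.
    regular:    list of k_l values, one per regular object.
    """
    regular_total = sum(regular)
    np_before = sum(k_j for k_j, _ in breakables) + regular_total
    np_after = sum(sum(gibs) for _, gibs in breakables) + regular_total
    return np_before, np_after


def pair_count(np_total):
    """Number of unordered primitive pairs, np(np - 1) / 2."""
    return np_total * (np_total - 1) // 2


# Hypothetical scene: ten jugs as in Figure 3.2 and no regular objects (r = 0).
jug = (3, [1] * 7)  # 3 primitives before, 7 single-primitive gibs after
np_before, np_after = narrowphase_primitives([jug] * 10, regular=[])
growth = pair_count(np_after) / pair_count(np_before)

print(np_before, np_after)                           # primitive totals
print(pair_count(np_before), pair_count(np_after))   # pairs to test
print(round(growth, 2), round((7 / 3) ** 2, 2))      # exact vs approximate factor
```

For this scene the exact growth in pairs (2415 against 435, roughly 5.55 times) sits close to the approximation $(7/3)^2 \approx 5.44$, and the two converge as the number of primitives grows.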
In the sporadic cases where most objects overlap and therefore have to be narrowphase tested, we observe that the complexity of the narrowphase collisions dictates the outcome. Gino van den Bergen admits that since worst cases are rare, physics designers “often abandon hard real-time requirements and shoot for optimal average timings” [46]. This indicates that it is possible to encounter situations where performance will drop, as is most likely the case in the original problem.
This example of breakables shows a situation where small changes to physics objects can result in large increases in the number of calculations. The next section looks at the areas that could minimise the impact of overlapping and discusses their suitability.
3.3 Solution Methods
This report looks at two areas of research that address the collision overload problem:
• Parallelisation of physics calculations - Identifying areas of the physics simulation loop where parallelism can occur and utilising specific hardware and libraries. Parallelism can be further divided into:
– Data level parallelisation - Where the physics objects are arranged in such a way that operations on them can be performed in parallel. This would require grouping objects that interact, to reduce the overhead of transferring data.
– Task level parallelisation - Where each stage of the physics pipeline is broken down into tasks, such as collision detection and contact resolution.
• Calculation reduction - Identifying situations where reducing the amount of calculation would improve performance. Calculation reduction covers:
– Model Reduction (Level of Detail) - Where the complexity of the representative model is reduced.
– Numerical Algorithm Improvement - Where new algorithms are designed that take fewer steps or reduce the amount of work required.
This report briefly mentions parallelising calculations in relation to Bullet, but we highlight this
area for the purpose of future research. The remainder of this Chapter will look at “Level of
Detail” as a solution.
3.3.1 Parallelising Calculations
The response to multi-core technology and parallelisation is an area of research interest at the time of writing. The background has mentioned a number of research articles relating to implementations on GPUs. With the arrival of technologies like CUDA (see section 2.10.1.5), GPU implementations of physics engines are a current topic (see section 2.8.1.1). The motivation for physics parallelisation appears to be driven by next-generation console development, and multiple cores have become an area of focus for physics developers. Erwin Coumans, the author of Bullet, is involved in a port of the Ageia PhysX engine to the PS3. Kokkevis et al. describe implementing physics on the CELL architecture [27]. They look at parallelising four parts of the physics loop across the PPU³ and SPUs of the CELL processor. The suggested areas are noted below:
• Narrowphase Collision Detection
• Narrowphase Contact Point Determination
• Constraint Preparation
• Constraint Solving
[Figure 3.3 diagram: two panels, “Bullet (Modular)” and “Bullet with possible parallelisation”. Both show Timer Control, Broadphase, Midphase, Narrowphase (Collision Detection; Contact Points & Penetration Depth), Constraint Solver (Constraint Preparation; Constraint Solve) and Motion Solver (Integrate Transforms); the second panel repeats the IT, CD, CP and CS blocks side by side.]
Figure 3.3: Possible data parallelisation in Bullet
Figure 3.3 shows how the techniques of Kokkevis et al. would appear in the Bullet architecture. It is possible that the collision detection and contact point determination in the narrowphase could be combined into one logical task, mainly because the GJK solver in Bullet detects contact points as part of its implementation. Task level parallelisation is difficult because each task requires a complete set of data from the previous task. For example, contact resolution can only be calculated once all contact points are known. Data level parallelisation, on the other hand, is suitable for physics. Using Bullet as an example, we could calculate the broadphase as a single task and then arrange the data by simulation islands, assigning islands to be run in parallel and only synchronising when data sets overlap.
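To make the idea of arranging data by simulation islands concrete, the sketch below groups objects into islands from a list of overlapping pairs using a union-find structure. This is only an illustration of the grouping step under our own assumptions (integer object ids, pairs supplied by a broadphase); it is not Bullet's island code, and `build_islands` is a hypothetical name.

```python
def build_islands(num_objects, overlapping_pairs):
    """Group object ids 0..num_objects-1 into islands of connected overlaps."""
    parent = list(range(num_objects))

    def find(x):
        # Follow parent links to the island root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every overlapping pair reported by the broadphase merges two islands.
    for a, b in overlapping_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    islands = {}
    for obj in range(num_objects):
        islands.setdefault(find(obj), []).append(obj)
    return sorted(islands.values())


# Objects 0-1-2 touch in a chain, 4 and 5 touch, and 3 is isolated.
print(build_islands(6, [(0, 1), (1, 2), (4, 5)]))
```

Each resulting island shares no contacts with any other, so in principle the islands could be handed to separate workers and only re-merged when a later broadphase pass reports a pair that spans two of them.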
The problem with this approach is that it still doesn’t directly address the collision overload problem caused by all contacts in the same group overlapping. Any improvements brought about by data parallelisation are likely to improve the performance of physics as a whole and not just collision overload. Parallelisation is a natural progression that improves performance indirectly and is therefore left as future work. I refer the reader to the various papers that have looked at parallelisation.
3.4 Level of Detail
This report proposes a level of detail technique called “Encapsulation Levels” to address the
problem of collision overload. We have discussed in the background how level of detail is applied
to graphics to improve performance. One reason why we would attempt to apply level of detail in
physics is to benefit from the user’s diminishing perception of physical detail in a scene, similar
to the way graphics benefits from visual perception. In a complex scene with many objects,
the individual interactions between the objects become less of a concern to the user when faced
with globally comprehending all the interactions. When we are not sure what to focus on we
³ Refers to the “Power Processing Unit” of the CELL architecture and not to the “Physics Processing Unit” of Ageia Technologies.
will continuously switch what we are looking at, effectively distracting us from the detail of
interactions.
Physical detail of objects is currently a concern, but for reasons of computational efficiency. It is infeasible to represent physics collision shapes at the same level of detail as graphics meshes, not least because of the complexity of concave collision detection. Instead, developers opt for simpler models to represent complex shapes. The decision is a trade-off between computational efficiency and user perception of detail. Developers and users would prefer to have more detailed objects but are limited by performance requirements.
Encapsulation Levels works by dynamically switching models in an attempt to provide an overall higher level of model detail without the performance degradation experienced in cases like collision overload. Switching to a simpler model can reduce the number of calculations required in these cases and hence improve performance. Unfortunately, switching models is a non-trivial problem. We need to establish when switching is likely to be of benefit, how quickly we can respond to a situation and what the requirements are for switching a model. This section details the problem of model switching, and in Chapter 4 we look at implementing an example in Scatter.
3.4.1 User Perception
In graphics, perception can be measured mathematically in terms of how many pixels represent
an object and hence what level of detail a user will see can be quantified. In physics this is a
little more difficult. We are concerned with the “look and feel” of objects and expectations of what
will happen. Consider the following example:
“In a physics simulator a user is given the task of arranging objects by moving them around. The user’s focus is directed at a small group of objects where the interactions with these objects are quite intricate. Placing a complex object with a rounded edge on a flat surface, we would expect the object to attempt to roll on the rounded edge. If instead of rolling it began to pivot we would perceive the motion as infeasible.”
This perception relies on the focus of the user. For many objects that are colliding continuously we
don’t require the same level of detail because reactions are so quick that the user wouldn’t notice.
O’Sullivan and Dingliana have written a paper on “Collision and Perception” that attempts to
address what a user can perceive and how performance can be improved [33]. They draw three
important conclusions from the work:
• Erroneous collisions in the user’s periphery are less likely to be detected.
• Anomalies that occur between homogeneous distractors are less obvious.
• Time delay between collision and response reduces the plausibility of the collision.
They conclude that it is possible to produce random collision responses that are as believable
as the accurate ones, mainly because as complexity increases humans rely on common-sense
judgements of dynamics that are inaccurate. This supports the idea that reducing model detail
during collision overload will have less of an effect on the user’s perception.
3.4.2 Encapsulation Levels
When formulating an idea for changing level of detail we need to address the following requirements:
1. Changing the level of detail must preserve system stability.
2. Changing the level of detail must be computationally feasible in real-time.
3. Changing the level of detail must not degrade the plausibility of collisions.
The concept of Encapsulation Levels comes from these requirements. The idea is the following:
“Encapsulation levels is when each object has a number n of discrete levels of detail. The base level “1” is the lowest amount of detail acceptable for representing the object. Level n is the highest level of detail and hence the ideal representation of the object. n can be a large number or as little as two. The requirement is that each higher level of detail is contained within the previous level of detail. This preserves the accuracy of the shape and allows us to make the following assumption. If level i, where 1 ≤ i < n, is not intersecting then levels (i + 1, ... , n − 1, n) are not intersecting either. We therefore don’t have to make any further calculations. If a model is at level i and is colliding, it is possible to switch to a higher level of detail without disrupting the system stability. To reduce the level of detail we need to make a single intersection test of the (i − 1)th level to determine whether we can switch to a lower level of detail whilst preserving stability.”
This concept satisfies the 1st requirement. To satisfy the 2nd and 3rd, we must apply two
restrictions:
The discrete levels of detail must be precomputed to avoid the overhead of
calculating on the fly as a switch is made:
This allows the models to be algorithmically generated provided they are contained
within each other. We are also able to design the level of detail by hand to get the
best “performance to accuracy ratio” of model switching.
Each level of detail must provide a plausible response to collisions in the
situation in which it is invoked:
An object colliding at high velocity could plausibly use a low level of detail, but an
object interlocking with another object would require a high level of detail to be
plausible.
Figure 3.4 shows how encapsulation levels work for complex objects.
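The containment requirement and the single test for reducing detail can be sketched with a deliberately simple shape family. In the sketch below every level is a sphere around the same centre, so containment reduces to the radii never growing from one level to the next. The class, the function names and the use of spheres are assumptions made purely for illustration; real levels would be precomputed collision shapes.

```python
import math
from dataclasses import dataclass


@dataclass
class LeveledObject:
    centre: tuple      # (x, y, z)
    radii: list        # radii[0] is level 1 (coarsest); radii[-1] is level n
    level: int = 1     # level currently used by the simulation

    def __post_init__(self):
        # Encapsulation requirement: each level lies inside the previous one.
        for outer, inner in zip(self.radii, self.radii[1:]):
            if inner > outer:
                raise ValueError("each level must be contained in the previous level")

    def radius(self, level=None):
        chosen = self.level if level is None else level
        return self.radii[chosen - 1]


def intersecting(a, b, level_a=None, level_b=None):
    """Sphere test between the given (or current) levels of two objects."""
    gap = math.dist(a.centre, b.centre)
    return gap < a.radius(level_a) + b.radius(level_b)


def can_switch_down(obj, others):
    """Single test of level i - 1 against the other objects' current shapes."""
    if obj.level == 1:
        return False  # already at the base level
    return not any(intersecting(obj, o, level_a=obj.level - 1) for o in others)


mug = LeveledObject((0.0, 0.0, 0.0), [2.0, 1.5, 1.0], level=3)
lamp = LeveledObject((2.8, 0.0, 0.0), [2.0, 1.0], level=2)
print(intersecting(mug, lamp))      # finest levels of both objects
print(can_switch_down(mug, [lamp])) # would level 2 of the mug be collision free?
```

Because the levels are nested, a miss at level i guarantees a miss at every finer level, which is why no further tests are needed once a coarse level is clear.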
3.4.3 Requesting a Level of Detail
Switching a level of detail can’t necessarily be performed as soon as it is required, because we must first test the associated safety conditions. We could try to switch up a level at any instant because level i + 1 is contained within level i by definition, and so cannot intersect when level i does not. Using safety conditions prevents us from causing chaos by switching at the wrong time. Consider the following safety conditions:
• “Only switch up a level if there are no contact points with other objects” - Figure 3.5 shows the result of not using this condition.
• “Only switch down a level if the (i − 1)th level of detail is collision free” - Figure 3.6 shows the result of not using this condition.
Figure 3.4: A diagram showing “Encapsulation Levels” of a lamp and a mug
Figure 3.5: The problems of increasing level of detail without safety conditions
Figure 3.6: The problems of decreasing level of detail without safety conditions
To enforce conditions like these we use the idea of making “requests”. When it is felt that a model switch would be beneficial, a request is made to change the level of detail. The request persists until either the model is switched or the request is retracted. The next section describes how we decide when to make requests.
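The lifetime of a request can be sketched as a small table of pending requests that is re-examined each step against the two safety conditions. Everything named here (`RequestBoard`, the `+1`/`-1` direction convention, the dictionaries passed to `process`) is a hypothetical stand-in chosen for illustration rather than the interface used in Scatter.

```python
class RequestBoard:
    """Holds level-of-detail requests until they are applied or retracted."""

    def __init__(self, num_levels):
        self.num_levels = num_levels
        self.pending = {}  # object id -> +1 (more detail) or -1 (less detail)

    def request(self, object_id, direction):
        if direction not in (+1, -1):
            raise ValueError("direction must be +1 or -1")
        self.pending[object_id] = direction

    def retract(self, object_id):
        self.pending.pop(object_id, None)

    def process(self, levels, contact_counts, lower_level_free):
        """Try each pending request once; return the ids that switched."""
        switched = []
        for object_id, direction in list(self.pending.items()):
            target = levels[object_id] + direction
            if not 1 <= target <= self.num_levels:
                del self.pending[object_id]  # no such level to switch to
                continue
            if direction == +1:
                # Safety condition: only switch up with no contact points.
                safe = contact_counts.get(object_id, 0) == 0
            else:
                # Safety condition: only switch down if level i - 1 is collision free.
                safe = lower_level_free.get(object_id, False)
            if safe:
                levels[object_id] = target
                del self.pending[object_id]
                switched.append(object_id)
        return switched


board = RequestBoard(num_levels=3)
levels = {7: 3}
board.request(7, -1)
print(board.process(levels, {7: 2}, {7: False}))  # unsafe: request stays pending
print(board.process(levels, {7: 2}, {7: True}))   # safe: object 7 drops a level
```

A request that fails its safety condition is simply left in the table, which matches the rule that it persists until the switch happens or the caller retracts it.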
3.4.4 Global, Group and Local Policies
Policies for switching models can be written for different levels of scope: global, group and local. Each policy relies on the heuristics available to it to make the best decision about when model switching would be of benefit. Heuristics such as proximity on a local level, or the “group size to group range ratio” (comparing the number of objects in a group to the space that the set of objects occupies) on a group level, can be used to request model switching. O’Sullivan and Dingliana note that adaptive detail (local level) is preferable to reducing the complexity of the whole scene (global level) to achieve the target performance. However, local adaptive detail doesn’t address the issue of high computational load. In these situations it can be preferable to reduce the global complexity to maintain a minimum frame rate. The condition is that the user’s perception of frame rate degradation is greater than the user’s perception of reduced physics plausibility.
Using policies we can decide when is the best time to switch. Figure 3.7 shows a flow diagram
describing when requests to switch can be made. Each level of scope of the diagram has access
to certain heuristics:
• On a local level:
– When in proximity to other objects inform a low level policy manager about a potential
collision.
– When velocity is over a threshold, inform a low level policy manager about a potential
collision.
• On a group level:
– When the number of objects exceeding local thresholds is high, inform a group level
policy manager about a potential collision.
– When the range of positions, compared to the size of the group is over a threshold
ratio, inform a group level policy manager about a potential collision.
• On a global level:
– When system performance has dropped below a threshold, inform the global level
policy manager to request a model reduction.
– When the number of overlapping groups exceeds a threshold, inform a global level
policy manager about the potential collision.
Analysis of these heuristics can help make better estimations of when switching is preferable.
A simple implementation would trigger a request every time a condition is met. Having a policy
manager could provide a way of tracking the values and making these decisions. We are interested
in the effects of simple triggers because they have the lowest overhead. For this reason, we test
our implementation for the base case and propose more detailed analysis for future work.
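The simple triggers described above can be sketched as three predicates, one per scope. The threshold values, the field names and the choice to express “a high number of objects over local thresholds” as a fraction of the group are all assumptions made for this illustration; they are not the values used in the evaluation.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    proximity: float = 1.0       # local: distance to the nearest neighbour
    speed: float = 10.0          # local: linear speed
    group_fraction: float = 0.5  # group: share of members over a local threshold
    min_fps: float = 30.0        # global: lowest acceptable frame rate


def local_trigger(nearest_distance, speed, t):
    """Fire when an object is close to another object or moving fast."""
    return nearest_distance < t.proximity or speed > t.speed


def group_trigger(members, t):
    """members: list of (nearest_distance, speed), one entry per group object."""
    if not members:
        return False
    flagged = sum(local_trigger(d, s, t) for d, s in members)
    return flagged / len(members) > t.group_fraction


def global_trigger(fps, t):
    """Fire when system performance has dropped below the threshold."""
    return fps < t.min_fps


t = Thresholds()
group = [(0.5, 0.0), (0.2, 3.0), (5.0, 1.0)]
print(local_trigger(0.5, 0.0, t), group_trigger(group, t), global_trigger(24.0, t))
```

Each predicate is a constant-time check per object, which is the property that makes simple triggers attractive as a base case before any policy manager is introduced.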
[Figure 3.7 flow diagram. Columns: Global, Group, Local. Knowledge sources: global (group overlap, physics performance, global frame rate, global profiling), group (group size, complexity, local thresholds, group profiling), local (proximity, collisions, velocity, mass). Decisions: “Global switch requested?”, “Group switch requested?”, “Over local threshold?”, “Request retracted?”, “Prior level is collision free?”. Actions: “Request a switch”, “Wait a specified time”, “Make a switch”.]
Figure 3.7: Global, group and local decision making for model switching
3.4.5 Investigation by Implementation
By implementing a version of Encapsulation Levels in Scatter we aim to evaluate the following:
• How can we quantify the gaps between levels of detail?
• What is the lowest level of detail we can use that provides plausible collisions?
• What is the performance improvement of the system whilst using the technique?
• What are the best heuristics for requesting model switching?
• What is the speed of response to a switch request?
Chapter 4
Implementation
“In theory, theory and practice are the same. In practice they aren’t even close”
Unknown
To support the investigation of encapsulation levels I developed an implementation using my own physics framework, known as Scatter. Scatter is a basic API that allows for quick prototyping of physics scenarios without the problems of having to incorporate a renderer. Scatter uses a modular design to abstract the physics simulator from the renderer, potentially allowing different renderers and simulators to be used to test the same scenario. The intention was to be able to recreate problems, observe them and interact with them. It was designed to emulate a simple game loop with the aim of prototyping performance-improving techniques. The following is the list of stages followed to finally implement encapsulation levels:
• Investigate renderers and physics APIs.
• Select a combination on which to base the design of Scatter (the renderer used was Irrlicht and the physics engine was Bullet).
• Build the framework with the requirements for prototyping physics techniques.
• Incorporate Encapsulation Levels into Scatter.
• Experiment with the result and identify areas that require improvement.
• Run the performance evaluation tests and analyse the results of implementation with and
without encapsulation levels and model switching.
The requirements set for Scatter are grouped by functionality:
• The System should:
– Show the difference between the physics representation and the renderer representation
of any world object.
– Be able to create “scenes” that allow the developer to write a test scenario.
• Performance Monitoring should be able to:
– Monitor the CPU load.
– Attribute performance of the application to each task (Profiling).
– Record the results for analysis (Output frame data).
– Provide visual runtime information (Profiling on screen and render-able debug).
– Run with a low overhead.
• Rendering should:
– Be able to provide graphical detail to emulate a game environment.
– Be able to give the impression of an actual game scenario.
– Be independent of the physics engine.
• Physics should:
– Allow modifications to the implementation to test physics algorithms (The use of
different collision solvers etc).
– Be able to run with and without the added algorithms for analysis purposes.
– Be able to apply forces to objects to invoke reactions of world objects.
4.1 The “World” Model
In the research leading up to the design of Scatter we have seen a range of physics APIs. Scatter uses the world model to integrate physics and rendering. Figure 4.1 shows how the model stores the physics and render objects in a “world”. Each object has attributes that provide information to each of the sub-systems, such as position (physics, renderer) and active (physics). Implementing interfaces gives objects functionality; for example, implementing the “Sound interface” would allow a developer to trigger sounds from the object. World objects in Scatter are render-able and physical for the purpose of demonstration.
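[Figure 4.1 diagram: an Application contains a World of World Objects. An example “Submarine Object” has attributes (Visible, Mesh, Lighting, Ambient, Sound, Radius, Mass, Shape) and implements the Renderable, Audible and Physical interfaces (Update Position, Get Position, Load Mesh), which connect it to the Sound Renderer, Physics Simulator and Video Renderer.]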
Figure 4.1: The world model used in a simulation application
4.2 Timing and Game Loops
The main loop of Scatter was designed to reflect a standard game loop. Structured as a Single-thread Uncoupled Model (terminology described by Valente et al., see section 2.6), Scatter was designed to synchronise on a fixed frequency, similar to the structure used in console development where the performance of the hardware is fixed. This allows Scatter to fix the system frame rate and hence the maximum rate at which the loop will update. In a variation on a Single-thread Uncoupled Model, the implementation uses clocks, timers and sub-timers to control the rate of update of each component. Figure 4.2 shows how the clocks, timers and sub-timers can be used to run the loop.
Clocks are used by the system as the most accurate timing source. The clock time is unmodifiable and the rate is fixed to real world time. The implementation SCWin32Clock, which is the default clock for the Windows platform, uses the operating system’s built-in performance timer. Timers are used for the main system timer running the loop. The idea is that timers can be paused and reset to control the execution rate of the whole simulation. Sub-timers, based on timers, conceptually have a rate at which they run faster or slower than the rate of their source. This allows each component to run at a different update rate, giving the loop a decoupled feel. The implementation is a simplified example, but is demonstrated by the ability of the user to “pause” the entire physics simulation.
Figure 4.2 also shows the structure of the loop and the different functions called from it. The blue shapes represent functions that use the “System timer” to update and synchronise; the green shape is the physics function, which runs on its own timer, allowing it to update at a different rate to the orange renderer. All timers are updated and controlled by the loop manager, which can monitor the different timers.
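The relationship between a clock, a timer and a sub-timer can be sketched as follows. The `Clock` here is advanced by hand so that the example is deterministic, whereas a real clock would read the operating system's performance timer; the class names and the `rate` and `paused` attributes are illustrative assumptions and are not the SCClock, SCTimer or SCSubTimer interfaces.

```python
class Clock:
    """Stand-in time source; advanced manually instead of reading real time."""

    def __init__(self):
        self._seconds = 0.0

    def tick(self, seconds):
        self._seconds += seconds

    def now(self):
        return self._seconds


class Timer:
    """Accumulates time from a source (a Clock or another Timer).

    With a rate other than 1.0 and a Timer as its source it behaves like a
    sub-timer that runs faster or slower than the timer it is based on.
    """

    def __init__(self, source, rate=1.0):
        self.source = source
        self.rate = rate
        self.paused = False
        self._last = source.now()
        self._elapsed = 0.0

    def update(self):
        current = self.source.now()
        if not self.paused:
            self._elapsed += (current - self._last) * self.rate
        self._last = current  # time that passes while paused is discarded

    def now(self):
        return self._elapsed


clock = Clock()
system = Timer(clock)                # main system timer
physics = Timer(system, rate=0.5)    # physics sub-timer at half speed
render = Timer(system)               # render sub-timer at full speed

for step in range(2):
    physics.paused = step == 1       # pause the physics for the second step
    clock.tick(1.0)
    for timer in (system, physics, render):
        timer.update()

print(system.now(), physics.now(), render.now())
```

Updating the system timer before its sub-timers matters in this sketch, since each sub-timer reads the accumulated time of its source when it updates.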
[Figure 4.2 diagram: the Scatter main loop. Loop flow: Update Timers, Update CPU, Process input, Update Physics, Update Profiling, Update Debug, Update Renderer, Update GUI, Update Frame Data, Render, Loop Synchronisation. Timers: System Clock, System Timer, Physics SubTimer, Render SubTimer. Other blocks: Loop Management, Scene, World, GUI, Environment, Frame Data.]
Figure 4.2: The main Scatter loop and timing system.
4.3 Built-in Profiling
Profiling in Scatter provides the best way of tracking the performance of the sub-components. Having a built-in profiler allows the components to read the data as it is recorded. SCFrameData provides a quick read interface that allows each component to read the data back, up to the length of the output buffer. The retrieved data can be used for analysing the variance of a certain function over time, with the goal of identifying patterns that indicate behaviours of the physics such as suspected “stacking” or the build-up to a large collision.
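A fixed-length buffer per profiled value is one way to provide this kind of read-back. The sketch below is a hypothetical stand-in for the idea, not the SCFrameData interface: the class name, the method names and the `"narrowphase_ms"` key are assumptions made for illustration.

```python
from collections import deque
from statistics import pvariance


class FrameData:
    """Keeps the most recent samples of each named value for read-back."""

    def __init__(self, buffer_length=120):
        self._length = buffer_length
        self._buffers = {}

    def record(self, name, value):
        # A deque with maxlen drops the oldest sample once the buffer is full.
        self._buffers.setdefault(name, deque(maxlen=self._length)).append(value)

    def recent(self, name, count=None):
        data = list(self._buffers.get(name, ()))
        if count is None:
            return data
        return data[max(0, len(data) - count):]

    def variance(self, name, count=None):
        data = self.recent(name, count)
        return pvariance(data) if len(data) > 1 else 0.0


frames = FrameData(buffer_length=3)
for milliseconds in (1.0, 2.0, 3.0, 9.0):
    frames.record("narrowphase_ms", milliseconds)

print(frames.recent("narrowphase_ms"))
print(frames.variance("narrowphase_ms"))
```

A sudden rise in the variance of a narrowphase timing over the last few frames is the kind of pattern that could hint at the build-up to a large collision, although the report leaves this analysis to future work.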
4.4 Scatter API
Scatter provides the following key classes for creating scenes and running the system:
• AFApplication - An example of an application using the API. AFApplication uses the main
loop and the timers to run the application.
• SCPhysics - The physics interface that allows the application to update the physics world.
• SCRenderer - The renderer interface that allows the application to run the renderer.
• SCBulPhysics - The Bullet implementation of SCPhysics class. This class wraps Bullet
allowing the application to set the gravity and other variables. The class handles all the
Bullet specific implementation like setting callbacks.
• SCIrrRenderer - The Irrlicht implementation of SCRenderer class. The class wraps the
running of Irrlicht and allows the application to register classes to be rendered.
• SCWorld - Contains the objects of the world and updates them during the loop.
• SCClock - The fixed rate timing source of the application.
• SCTimer - Uses the Clock as the base time to perform timing for the system. Can be
started and stopped.
• SCSubTimer - Uses the SCTimer as the base time to perform timing for the physics and
renderer. Can be started and stopped, and can be run at a different rate to the base time.
• SCWorldObject - An object in the world. Has aspects of both physics and renderer and
converts between the two when updated.
• SCBulPhysicsObject - The interface that gives an object physical properties.
• SCIrrRenderObject - The interface that gives an object render-able properties.
• SCHybridObject - Represents an object in the world that has more than one physics representation (Part of the encapsulation levels implementation). It is able to switch between
representations by requesting to change the level of detail. Hybrid objects will only be
able to switch if the world in which they belong is running the “Hybrid” implementation
(discussed in section 4.5.2).
• SCSphereObject - An example of one of the primitives for the world. The object wraps up
all implementation behind a simple constructor.
• SCScene - Used to set and update the different scenes running in the application. The
scene represents all the details about position of lighting, cameras, objects.
• SCFrameData - Used to store and read data between scenes that are logged to file. Used
to record all the heuristics in the application.
• SCDebugDrawer - Used to collect and draw the extra debug information. It takes debug
output from the physics to draw in the renderer.
4.5 Integrating Encapsulation Levels
Using Scatter as a basis for implementing the level of detail model switching, I made the following choices for implementation:
• To test the effectiveness of multiple switching, Scatter can use n levels of detail, each specified with an associated COLLADA physics data file (a standard for specifying physical attributes).
• This particular implementation of Encapsulation Levels is based on “proximity”. From the investigation we concluded that global switching requests would improve performance in rare cases, but can cause undesirable effects if applied to all world objects. This implementation is mainly concerned with local requests, but has been designed to allow global requests (see future work, section 6.5). Global requests can be performed at a world object level from within the SCWorld class.
• Add the functionality of “Encapsulation Levels” as an additional feature to Bullet by creating a new btHybridRigidBody and btHybridDynamicsWorld.
• All the functionality should be added between the SCBulPhysics and btHybridDynamicsWorld, keeping the implementation behind the Scatter API. Figure 4.3 shows the structure and flow of requests and switching.
4.5.1 Hybrid World
To perform switching Scatter uses “Hybrids”. The word, meaning “mixed composition”, refers to any part of Scatter or Bullet that deals with model switching. In Scatter the implementation is based around the idea of requesting and switching. There are three new components added to Scatter, enabling the implementation:
• SCHybridObject - An object in the Scatter API that allows the user to “add” multiple COLLADA objects from file, representing the levels from 1 to n.
• btHybridRigidBody - The internal representation in Bullet of a hybrid object. A modification of btRigidBody with the extra functionality of being able to store multiple collision shapes.
• btHybridDynamicsWorld - A modified version of btDiscreteDynamicsWorld that controls the additional testing and triggering associated with switching.
[Figure 4.3 sequence diagram spanning Scatter (SCWorld, SCHybridObject, SCBulPhysics) and Bullet Physics (btHybridDynamicsWorld, btHybridRigidBody). Labelled elements: Global, Group and Local Triggers; Request and Ack messages; “Register request caller”; “Lookup btHybridRigidBody”; “Lookup hybrid dynamics”; “Forward request”; “Internal Step” with Broadphase (Overlapping) and Narrowphase (No Contact); “AttemptSwitch”; “Switch Detail”; Success; “Find callback”; “Forward callback”; Callback.]
Figure 4.3: The flow of requests and switches across the Scatter/Bullet boundary.
Figure 4.3 shows how an SCHybridObject (the object representation in Scatter) sends a request to SCBulPhysics, which invokes a similar function in the btHybridDynamicsWorld that then adds a request to switch. If, during the normal execution of the physics loop, the btHybridDynamicsWorld decides the object is able to switch, it will perform the switch and inform the SCHybridObject via a callback. We can think of this as “SCHybridWorld objects cause requests, while btHybridRigidBody objects respond by switching.”
4.5.2 btHybridDynamicsWorld
The btHybridDynamicsWorld is the core of the implementation. Experimenting with Bullet callbacks revealed that classes in “Scatter space” (on the Scatter side of the Scatter/Bullet boundary) could successfully retrieve information like the number of manifolds, contact points and other internal Bullet values. What the callbacks couldn’t do was update information at different stages of the loop. Creating btHybridDynamicsWorld gave enough control to modify the execution and catch test conditions.
4.5.2.1 Additions To the Bullet Loop
The implementation added two stages to the Bullet loop: one to attempt to switch the models and one to catch the lower detail model tests. These function calls are made at the start of the loop and after the collision detection stage respectively. We want to be able to test a lower detail model and then detect whether it has collided. To do this we iterate over the O(n) manifolds to gather the information we need. Although this is a large number of manifolds (see the results in section 5.1.5), the work per manifold is small enough that we can justify the operation. It is possible to avoid iterating at all, but that would require full integration into Bullet, which we want to avoid when prototyping. Catching a successful lower detail model test allows us to switch the model for the simulation step.
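The second added stage, catching the lower detail model tests, amounts to a single pass over the manifolds after collision detection. In the sketch below each manifold is reduced to a hypothetical `(body_a, body_b, contact_count)` tuple standing in for Bullet's persistent manifolds, and `catch_lower_detail_tests` is an illustrative name rather than a function from the implementation.

```python
def catch_lower_detail_tests(manifolds, objects_under_test):
    """Return the tested objects whose lower detail model stayed contact free.

    manifolds:          iterable of (body_a, body_b, contact_count) tuples.
    objects_under_test: set of body ids currently simulated with level i - 1.
    """
    colliding = set()
    for body_a, body_b, contact_count in manifolds:  # one O(n) pass
        if contact_count == 0:
            continue  # the pair overlaps in the broadphase but does not touch
        if body_a in objects_under_test:
            colliding.add(body_a)
        if body_b in objects_under_test:
            colliding.add(body_b)
    return objects_under_test - colliding


manifolds = [(1, 2, 3), (2, 3, 0), (4, 5, 1)]
print(sorted(catch_lower_detail_tests(manifolds, {2, 3, 6})))
```

Objects returned by the pass passed their test and can keep the lower detail model; the rest revert and are tested again later.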
4.5.2.2 Collecting Local Heuristics
SCNearCallback and SCCollisionAddedCallback are used provide the heuristics for “TotalObjectsInNarrowPhase” and “TotalNewContactPoints”. Single run-time operations ensure that we
can monitor these functions efficiently. Keeping a track of contact points help to ensure that our
objects are contact free and not overlapping (for increasing level of detail) .
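As a rough illustration of why these heuristics are cheap to collect, the sketch below keeps two per-frame counters that are each updated by a single increment. The struct and the two free functions are hypothetical stand-ins for the Scatter callbacks, not their real signatures, and counting one unit per narrowphase pair is an assumption.

```cpp
#include <cassert>

// Hypothetical per-frame counters named after the two heuristics in the text.
struct LocalHeuristics {
    int totalObjectsInNarrowPhase = 0;
    int totalNewContactPoints = 0;

    // Reset at the start of every frame.
    void beginFrame() {
        totalObjectsInNarrowPhase = 0;
        totalNewContactPoints = 0;
    }

    // An object may only increase its level of detail when nothing touches it.
    bool contactFree() const { return totalNewContactPoints == 0; }
};

// Stand-in for SCNearCallback: invoked once per pair sent to the narrowphase.
inline void onNearPair(LocalHeuristics& h) { ++h.totalObjectsInNarrowPhase; }

// Stand-in for SCCollisionAddedCallback: invoked once per new contact point.
inline void onContactAdded(LocalHeuristics& h) { ++h.totalNewContactPoints; }
```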
4.5.2.3 Problems in Implementation
Although Bullet is open source, the classes were not designed for implementing techniques like model switching. The following is a list of problems encountered and how I overcame them:
• Manifolds are continuously modified throughout the execution (they are persistent data) - Manifolds are created on demand and removed at unpredictable times, so it became tricky to keep track of which were active and which were deleted. The implementation uses a technique that iterates over the manifolds with a low time complexity and tags the ones that are important.
• Avoiding the temptation to alter base classes - I provided a simple binary compatible test to detect the type of class and avoided having to add implementation to more Bullet classes than required. The result was easy run-time identification. It proved to be light-weight and non-invasive.
4.5.3 Successful Switching
The final implementation could successfully switch to a lower level of detail within a frame of detecting “no intersections”. Models that were unsuccessful at switching were tested again in future frames. Deciding how frequently to test is an issue for policy analysis, since aggressive testing can itself add to the load. For this system, I attempted aggressive testing to find the impact it had. The results showed that the impact was higher with higher complexity test models. I concluded that intermittent testing, approximately every second, would reduce the impact.
When objects cease to overlap, any requests made to switch are cancelled until they overlap again. This means that objects won’t accidentally drop more than two levels of detail without being in proximity, and that objects will only switch when prompted. The next chapter describes the results of testing the implementation.
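One way to express the testing policy described in this section is sketched below. SwitchRequest, updateRequest and kTestIntervalFrames are hypothetical names, and the interval of 60 frames assumes the fixed time-step of 1/60 seconds used in the evaluation, so that a retest happens roughly once per second.

```cpp
#include <cassert>

// Hypothetical bookkeeping for one hybrid object's request to drop a level.
struct SwitchRequest {
    bool pending = false;
    int  framesSinceTest = 0;
};

// Roughly one second between tests at a fixed time-step of 1/60 s (assumption).
constexpr int kTestIntervalFrames = 60;

// Returns true when the lower detail test model should be tested this frame.
inline bool updateRequest(SwitchRequest& r, bool aabbOverlapping) {
    if (!aabbOverlapping) {       // objects parted: cancel until they overlap again
        r.pending = false;
        r.framesSinceTest = 0;
        return false;
    }
    if (!r.pending) {             // first frame of overlap: raise the request and test
        r.pending = true;
        r.framesSinceTest = 0;
        return true;
    }
    if (++r.framesSinceTest >= kTestIntervalFrames) { // intermittent retest
        r.framesSinceTest = 0;
        return true;
    }
    return false;
}
```

Lowering kTestIntervalFrames towards 1 reproduces the aggressive testing that was tried first, at the cost of testing the extra model on every frame of overlap.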
Chapter 5
Evaluation
“You can’t prevent disasters, but you can diminish their frequency and severity”
Murphy’s Law of Risk
5.1 Performance Evaluation
To evaluate the performance of the Scatter implementation, I ran test scenarios to recreate the collision overload problem. The tests compare Scatter using the “Hybrid” implementation against a Scatter build without model reduction techniques. The aim was not only to evaluate the performance of the implementation but also to observe how situations like convergence and sudden impact manifest themselves in profiling output and in the load of the system. Feeding the results of the output back into Scatter allowed for fine tuning and improvement. It also opened up areas of investigation like “how to quantify the difference in level of detail”. Table 5.1 lists the two builds of Scatter, “ScatterDefault” and “ScatterHybrid”, both of which ran Scene 1 and Scene 2 as tests.
                                                    Scatter Build (Default)   Scatter Build (Hybrid)
Scene 1: Two hybrid objects colliding               Test 1                    Test 2
Scene 2: Game scenario: Space Collision Overload    Test 3                    Test 4
Table 5.1: Table to show the tests performed for each build of Scatter
5.1.1 Test Conditions
In these test results we are still interested in measuring the activity for a period of time after the collision, to observe the “aftermath”. For this reason, Scene 1 and Scene 2 are 1300 frames and 1600 frames, respectively (approximately 45 seconds and 1 minute in execution time). The graphs have frame number as the x-axis to avoid inconsistency when run on different systems. To make the scenarios comparable, each frame uses a fixed time-step of 1/60 seconds. All GUI data output was turned off for the duration of the tests, and the only information rendered beyond the normal view of the scene was the Bounding Boxes and Collision Shapes in wireframe. This extra debug information was useful in identifying the success of switch requests (flashing objects indicate a waiting request). The scenarios were run without user interaction so that sequential testing was consistent. The tests were run on the following test system with the following settings:
Machine                   Performance Laptop
Processor                 2.33 GHz Intel Core 2 Duo
Memory                    2 GB 667 MHz DDR2 SDRAM
Graphics Driver           ATI x1600 Mobility Radeon
Operating System          Windows Vista
Compiler                  Microsoft Visual C++ Compiler (Visual Studio 2005)
Compiler Optimisations    -O2 (Debug off)
Notes                     The application was run with administrator privileges to allow the collection of CPU performance data
Table 5.2: Test System Settings
5.1.2 Test Scenarios
5.1.2.1 Scene 1
Scene 1 was created to observe the basic interactions between two hybrid objects and how they
respond to colliding. The scene uses two complex compound objects to represent two spaceships,
which I will refer to as “Hunter” and “Fighter”. Hunter has a concave mesh (with a convex physics
representation) and was chosen because of the possibility of interlocking with other objects, a
situation that would provide useful information in terms of “user perception”.
The Fighter mesh is a shape that is difficult to fit a suitable convex decomposition to. It has
small wings that often sit outside the physics collision shape. The two objects are separated and
a field is applied that draws them to the centre of the scene. They make a single collision then
separate far enough that the AABBs stop overlapping. They are then drawn to each other and
make resting contact whilst slowly rotating about the centre of the scene. Figure 5.1 shows the
layout before the collision. The objects have two levels of detail, both starting at “level two”. The
trigger in this example is based on proximity (the overlapping of the AABBs). The expectation
was that the two objects would converge, switch models to the simpler form (level 1) and collide
using those models.
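The proximity trigger used in Scene 1 reduces to the standard overlap test between two axis-aligned bounding boxes. The sketch below is a self-contained illustration under that assumption; Vec3, Aabb, overlaps and shouldRequestSwitch are hypothetical names rather than Bullet or Scatter types.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

// Axis-aligned bounding box described by its lower and upper corners.
struct Aabb { Vec3 lower, upper; };

// Two AABBs overlap only if their extents overlap on all three axes.
inline bool overlaps(const Aabb& a, const Aabb& b) {
    return a.lower.x <= b.upper.x && b.lower.x <= a.upper.x &&
           a.lower.y <= b.upper.y && b.lower.y <= a.upper.y &&
           a.lower.z <= b.upper.z && b.lower.z <= a.upper.z;
}

// Proximity trigger: request the simpler model while the boxes overlap
// and the object is not already at the lowest level (level 1).
inline bool shouldRequestSwitch(const Aabb& a, const Aabb& b, int currentLevel) {
    return currentLevel > 1 && overlaps(a, b);
}
```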
5.1.2.2 Scene 2
If Scene 1 is the sandbox, then Scene 2 is the desert (the same thing, but larger). The purpose of the scene is to pull all objects into a central point so that all AABBs overlap, effectively making the broadphase testing redundant. To achieve this all objects are placed randomly on a radius around the central point so that the overlapping is progressive as they move inwards. They all converge on the central point, causing the degradation of the performance. After the initial collision the objects fly off in different directions, some with enough velocity to escape the field. The remaining objects continually collide and disperse until “the Sun” eventually reaches the centre point, indicating the end of the test. The motion can be thought of as an oscillation with damping.
Figure 5.1: Scene 1 : Left: the Hunter mesh. Right: the Fighter mesh
Figure 5.2: Scene 2: Spaceships (Hybrid Models), Asteroids (Basic Spheres) and the Sun (Large Sphere).
The hybrid objects in the world are two types of spaceship, Hunter and Fighter, both with
three distinct levels of detail. The other objects are small asteroids (non-hybrids), included to
distinguish hybrids from non-hybrids in the resulting graphs. Figure 5.2 shows the objects prior
to the point of convergence.
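The initial layout of Scene 2 could be generated along the lines of the sketch below. It assumes that “on a radius” means a fixed distance from the centre in a random direction, which the report does not state explicitly. Point3 and placeOnRadius are hypothetical names, and the fixed seed is an assumption made so that repeated runs start from the same layout.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point3 { double x, y, z; };

// Returns `count` points at distance `radius` from `centre`, each in a
// uniformly random direction.
inline std::vector<Point3> placeOnRadius(const Point3& centre, double radius,
                                         std::size_t count, unsigned seed) {
    std::mt19937 rng(seed);                       // fixed seed: repeatable layout
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::vector<Point3> points;
    points.reserve(count);
    while (points.size() < count) {
        // A normalised Gaussian vector is uniformly distributed over the sphere.
        double dx = gauss(rng), dy = gauss(rng), dz = gauss(rng);
        double len = std::sqrt(dx * dx + dy * dy + dz * dz);
        if (len < 1e-12) continue;                // avoid dividing by (almost) zero
        points.push_back({centre.x + radius * dx / len,
                          centre.y + radius * dy / len,
                          centre.z + radius * dz / len});
    }
    return points;
}
```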
5.1.3 Performance Measures and Expectations
The main performance measures used in the tests were the ones that showed patterns relating directly to the activity in the scene. The main example is the profiling of the physics, which clearly showed spikes in execution time where the objects overlapped. The list below shows the indicators used to measure performance on a per frame basis. All bullet points under “Total Physics” are sub-components of Bullet. The important ones are highlighted in bold.
• Profiling Function Calls (Execution time per frame) in milliseconds
– Total Physics - The time taken to perform all the stages of the physics update
∗ Updating AABBs - Time to recalculate the bounding volumes
∗ Prediction of Motion - Time to calculate motion before collision detection
∗ Collision Detection - Time to perform narrowphase collision detection
∗ Calculating Simulation Islands - Time to sort manifolds into simulation islands
∗ Non-contact Constraint Solving - Time to solve all constraints that are not in contact
∗ Constraint Solving - Time to solve all other constraints
∗ Integrating Transformations - Time to integrate all transforms, applying motion
∗ Updating Vehicles - Time to update vehicles
∗ Updating Activation - Time to update the activation
• Total Number of Manifolds ($m$) - Indicator of the total number of narrowphase tests performed (created for new tests)
• Total Narrowphase Collision Pairs ($cp_{nar}$) - Indicator of the number of collision pairs that require narrowphase testing.
– Containing a Hybrid ($cp_{hyb}$) - Indicates that one of the two objects in the pair is a hybrid object
– Not containing a Hybrid ($cp_{non}$) - Indicates that the pair is two non-hybrids.
• Total Number of Objects ($n$) - Indicates whether a successful switch is made (the number decreases as the old representation is removed)
• Total Number of Contact Points ($con$) - Indicates the time at which collisions begin and how many new contacts happen per frame
5.1.4 Expectations
To recap, manifolds in Bullet contain the contact points between collision pairs and perform contact filtering of those points. Multiple manifolds are created for every collision pair that is narrowphase tested. The expectation was that the total number of manifolds would be representative of the number of narrowphase tests performed. This number would increase with the complexity of the models as more sub-components of the compounds require testing. The expected result was a sharp spike in the graph at the point where most collisions are occurring.
Related to the manifolds, the total number of narrowphase collision pairs at time $t$ was expected to be approximately $(n_{overlap})^2$, where $n_{overlap}$ is the number of overlapping objects out of the total objects $n$. For scenes combining hybrid and non-hybrid models (such as Scene 2) the expectation was that the number of hybrid collision pairs would be $h(h + 2k)$, where $h$ is the total number of hybrid objects and $k$ is the number of non-hybrids. The number of non-hybrid collision pairs was expected to be $k^2$.
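These three expectations are consistent with one another if every object overlaps every other, so that $n_{overlap} = h + k$, and if pairs are counted as ordered pairs. Both conditions are assumptions made here for illustration rather than statements from the measurements:

```latex
\begin{align*}
(n_{overlap})^2 = (h + k)^2 &= h^2 + 2hk + k^2 \\
  &= \underbrace{h(h + 2k)}_{\text{pairs containing a hybrid}}
   + \underbrace{k^2}_{\text{non-hybrid pairs}}
\end{align*}
```

For example, a scene with $h = 20$ hybrids and $k = 11$ non-hybrids gives $20 \times 42 = 840$ hybrid pairs and $11^2 = 121$ non-hybrid pairs, a total of $961 = 31^2$. This is an upper bound on the growth rather than a prediction of the measured values, since it counts each pair in both orders and includes self-pairs.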
Note that although Scatter currently uses additional non-hybrids to test for intersection
of lower detail models, this is not likely to be the case in the future. In the results these
extra intersection tests manifest themselves as additional non-hybrid collision pairs. In the
implementation there is a level of filtering to avoid two representations of the same object being
tested. The number of objects was expected to increase as model switching was attempted and
decrease as the models were successfully switched (when the test objects were removed from the
scene).
5.1.5 Results
The following graphs give a good amount of detail regarding the performance of Scatter under the four tests. We will look initially at Scene 1 to understand the patterns caused by the simple scenario of two objects and how the processes of requesting and switching are visible in the results. We will then apply the understanding of Scene 1 to the more complicated graphs of Scene 2 to see how the patterns hold.
5.1.5.1 Test 1 and Test 2
For these tests the models used were composed of approximately 8 five boxes and 22 boxes for
the hunter mesh and fighter mesh respectively. For both tests 1 and 2, both scenes played out
identically until the point of overlap (between frame 140 and 180 in Figure 5.3). In the hybrid
build (test 2) the events can be described as follows:
Figure 5.3: Scene 1: Key frames of the Hybrid Build
• Both models heading towards each other (frame 80)
• Bounding Boxes overlap triggering a request to switch for both models (after frame 140)
• Hunter model switches successfully in the following frame
• Fighter model fails to switch but the request is acknowledged (visible in frame 180 as the
overlapping pink shape on the right)
• The objects collide and start travelling in different directions
• Bounding boxes stop overlapping and the switch of Fighter is successful (frame 200)
• Both objects are drawn back in by the field and overlap bounding boxes for the second
time (between 200 and 450)
• They eventually experience equal forces on each other and begin rotating slowly till the
end of the test
Figure 5.5 shows the full scene duration from frame 0 to frame 1400. The activity from the collision is focussed between frame 150 and frame 230. The persistent manifolds (in red) are associated with the left axis and the number of collision pairs in the narrowphase is associated with the right axis. Figure 5.6 provides a better view of the activity. The default build (test 1) shows 392 manifolds being created at the instant of the first collision. The number only drops after the objects separate and stop overlapping. The second collision causes the 392 manifolds to be added again. In the hybrid build (test 2) the number of manifolds is initially higher (424) than in the default build due to the temporary testing of Hunter level 1. The manifolds then drop and rise due to the testing of Fighter level 1 until the objects part. By this point (frame 200) both models are at level 1. This means that when they overlap again the number of manifolds is significantly lower (20). Tests 1 and 2 have shown the effect switching has on reducing manifolds for future computations; now we will look at the graphs for Scene 2.
5.1.5.2 Test 3 and Test 4
On first inspection we can see immediately that the scene is a lot more complex. There are 10 Asteroids, 10 Fighters, 10 Hunters and a single Sun. The expectation was that the increased level of detail of the models would not cause a problem until entering the narrowphase. This prediction was correct, and both test 3 and test 4 experienced a significant deterioration of frame rate as it took up to 500 ms to calculate the physics (only leaving enough time to generate approximately 2 frames per second). In these tests the scene ran the same until the first point of collision (frame 280 approx). Before that, a number of objects in the hybrid build had already attempted to switch models. Figure 5.4 shows that in frame 140 the Hunter object has already switched before the first impact. The Total Objects graph (Figure 5.8) supports this claim because it shows an increase in objects, implying that a number of the hybrids are attempting to switch. The key events are as follows:
Figure 5.4: Scene 2: Key frames of the Hybrid Build
• Objects start moving towards the centre point (frame 80)
• Adjacent objects start to overlap and cause a request (frame 140)
• Contact starts to occur up to a peak level of contact (between frame 360 and 380)
• Objects begin to disperse leaving space to allow some objects to switch (frame 420)
• There is a second peak of contact between frames 450 and 500, likely to be caused by collisions when dispersing
• Some objects continue to collide until the end of the scene where the sun eventually impacts
on the centre (frame 1600)
In the manifold graphs in Figure 5.7 (with the axes arranged in the same way as for Scene 1), the most striking observation is that the number of manifolds has reached 77874 for the default build and 66255 for the hybrid build. The number is approximately a combination of all overlapping objects tested with each other, $n^2$, but it is dependent on too many variables to calculate accurately. What the graphs show is that after the initial collision and the second collision the number of manifolds (in red in both graphs) trails off sharply between frames 550 and 600, indicating that most complex models have been switched. This is supported by the Total Objects graph, Figure 5.8, which indicates that the number of objects is the same as before the contact and hence the test levels must have switched out. Looking at the overall object graph we can see that, as in Scene 1, it takes a while (up to frame 1200) for all hybrids to successfully switch. This occurs at least 600 frames after the benefits of switching have taken place. These findings led to the idea of retracting switches (mentioned in the investigation). Using global analysis to decide when the performance of the system is acceptable, we can retract any unsuccessful requests. This is an example of a suitable global policy.
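A global retraction policy of this kind might look like the sketch below. It is a hypothetical illustration, not part of the tested implementation: PendingRequest and retractIfAcceptable are invented names, and treating “acceptable performance” as the physics time falling within a fixed per-frame budget is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one hybrid object's outstanding switch request.
struct PendingRequest {
    int  objectId;
    bool active;
};

// Once the physics step is back within budget, retract the remaining
// (unsuccessful) requests so that their test models can leave the scene.
// Returns the number of requests retracted.
inline int retractIfAcceptable(std::vector<PendingRequest>& requests,
                               double physicsMs, double budgetMs) {
    if (physicsMs > budgetMs) return 0;   // still overloaded: keep trying to switch
    int retracted = 0;
    for (PendingRequest& r : requests) {
        if (r.active) {
            r.active = false;
            ++retracted;
        }
    }
    return retracted;
}
```

With a policy like this the test objects would not need to remain in the scene long after the peak has passed, which is the situation observed up to frame 1200 in Scene 2.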
There are two further trends we can draw from the Total Objects graph. The first is that the number of contacts made is higher in the hybrid scene. For each unswitched hybrid object, the test model for switching has no contact resolution (because we don’t want it to collide until we are sure it can do so without causing instability, see section 3.4.3). The model will therefore penetrate other models, which, although it doesn’t cause a problem, is undesirable. The graph helped in identifying this case, which is rectifiable by filtering the contact point determination step for test models (reducing the purple peak in the graph).
The second of the two trends is shown in Figure 5.9: the number of objects in the lead up to the peak collision (frame 370) is still decreasing less than 25 frames before the major degradation. This means that model switching is continuously reducing the calculations up until the last possible point. This is a desirable outcome and shows the success of the implementation.
5.1.6 Observations
As well as the expected results, testing revealed a few unexpected ones. We know that the main bottleneck in collision detection is caused by the overlapping of bounding volumes. The results show that the key cause of the drop in performance in this set of tests was “Updating AABBs” (Figures 5.10 and 5.11). The updating is still part of the broadphase, but the cause seems to be the “midphase” of Bullet. Up until the point of overlap any hybrid object only requires an outer AABB recalculation. As overlap occurs the Bullet midphase requires the updating of the “optimisedBVH” described in the investigation. This supports the findings of the investigation that the time taken is dependent on the chance of failing an intersection test and therefore having to test further. It is unclear at this stage whether the performance drop is because of the Bullet implementation of “midphase” or inherent in the testing of complex AABB trees. This is a point for future study, but this report provides the ground work for furthering the research.
In summary, the peak representing the collision overload was smaller in all graphs for the hybrid build compared to the default build. This suggests that model reduction improved the situation in the lead up to the peak. The difference in the peaks is dependent on how big a change of detail there is. This gap is limited only by the requirement to keep the collisions “feasible” for the user. The results show that reducing a 22 box model to a single box model would improve the situation, but the observed collisions would look suspect. We therefore used a more detailed model (22 boxes to 8 boxes). The next stage for implementation would be to improve the response and efficiency with which models switch. The conclusion discusses using physics models that don’t encapsulate the mesh, which would provide the extra space needed to make the switch. The results show that even a simple implementation can bring improvements.
Figure 5.5: Scene 1: Manifolds and Narrowphase CPs. Top: Default Build. Bottom: Hybrid Build. [Plots: Persistent Manifolds (left axis, Number of Manifolds) and Total Collision Pairs in Narrowphase (right axis, Number of Collision Pairs) against Frame Number, frames 0 to 1400.]
Figure 5.6: Scene 1: Detail: Manifolds and Narrowphase CPs. Top: Default Build. Bottom: Hybrid Build. [Plots: Persistent Manifolds (left axis, Number of Manifolds) and Total Collision Pairs in Narrowphase (right axis, Number of Collision Pairs) against Frame Number, frames 140 to 230.]
Figure 5.7: Scene 2: Manifolds and Narrowphase CPs. Top: Default Build. Bottom: Hybrid Build. [Plots: Persistent Manifolds (left axis, Manifolds) and Total Collision Pairs in Narrowphase, Collision Pairs with Hybrid and Collision Pairs without Hybrid (right axis, Test Collision Pairs) against Frame Number, frames 0 to 1800.]
Figure 5.8: Scene 2: Total Objects and Contact Points (Both builds). [Plot: Total Objects (Hybrid Scene and Default Scene) and Total Contacts (Default Build and Hybrid Build) against Frame Number, frames 0 to 1600; labelled axis: Number of Objects.]
Figure 5.9: Scene 2: Peak Detail: Total Objects (Both builds). [Plot: Total Objects (Hybrid Scene and Default Scene), Number of Objects against Frame Number, frames 300 to 500.]
Figure 5.10: Scene 2: Physics profiling. Top: Default Build. Bottom: Hybrid Build. [Plots: Execution Time (ms) against Frame Number, frames 0 to 1600, for Total Physics, Update AABBs, Predict Motion, Collision Detection, Calculate Simulation Islands, Solve Non-contact Constraints, Solve Contact Constraints, Integrate Transforms, Update Vehicles and Update Activation.]
Figure 5.11: Scene 2: Peak Detail: Physics profiling. Top: Default Build. Bottom: Hybrid Build. [Plots: Execution Time (ms) against Frame Number, frames 350 to 400, for Total Physics, Update AABBs, Predict Motion, Collision Detection, Calculate Simulation Islands, Solve Non-contact Constraints, Solve Contact Constraints, Integrate Transforms, Update Vehicles and Update Activation.]
Chapter 6
Summary and Conclusion
“Games developers are like magicians, they mesmerise their audiences with fantastic
effects and beautiful worlds, but the real magic is how they simplify the illusion”
Ian Ballantyne, 2007
In this report we have discussed the background of physics simulation and looked in depth into
how a physics simulator functions. The investigation stage looked at the bottlenecks and the
areas of research that can help reduce the impact of collisions on performance. This report has
focused on “level of detail” as a technique to reduce the gap between performance of physics in
the standard case and in the rare cases of full overlapping. The proposed method of “Encapsulation Levels” and “Model Switching” has proved to be an initial improvement, reducing the impact prior to the collision and accelerating the return to stability afterwards. The complex
task of developing Scatter revealed the difficulties of prototyping with an existing simulator, but
resulted in a framework for future testing of new physics techniques and especially level of detail
techniques. Scatter provides a foundation for more research into topics such as the analysis
of global and group heuristics to influence local model switching decisions. This final chapter
discusses the conclusions of this project and the best direction for future research.
6.1 Performance
The success of dynamic model switching relies on the improvement of performance. I achieved
this with Scatter and have concluded the following:
• The technique successfully improves the performance in the lead up to the initial collision.
The switched objects are simpler, but still have a complex representation.
• Even after the initial collision(s), switching helps the simulator return to a stable state.
• The graphs show that the hybrid implementation improves the execution time of the initial
collision and improves any successive collisions.
• The recorded values don’t give a clear indication that a collision is about to occur. Analysing
the gradient of the graph could be a good indicator, however this information is deceptive.
It could produce false positives, so more research is required in this area.
• Switching is completed successfully in less than 3 frames (Both scene 1 and scene 2 demonstrate this).
• Model switching still occurs successfully during the start of degradation, showing it is not
too late to improve.
• The implementation takes over half the test time to perform all switch requests (bearing
in mind that all objects switch more than once). Retracting requests is the best way to
deal with this.
• Updating of the AABBs in the scene is the cause of the performance spike. It is likely that the implementation of optimised bounding volume hierarchies (BVH) in Bullet, which use AABB trees, is the cause of the spike. Further investigation is required to identify whether the problem is with the AABB tree technique or a shortcoming of the Bullet implementation for compound objects.
6.2 Scatter
Scatter provides the following features:
• A simple API for creating scenes to test in a physics simulator.
• A transparent wrapper for creating “Hybrid” objects and running them in a hybrid or
standard world.
• Profiling, CPU load and physics state variables such as “number of broadphase intersection
pairs” for each frame.
• Controls to pause physics and observe interactions with extra visual information:
– Bounding boxes
– Encapsulation levels
– Physics Collision Shapes
– Visual response to requests and switches
• Controls to manipulate objects in the scene for experimentation (to prompt certain responses).
Scatter is a fast framework for prototyping new physics concepts. It provides the output required to test performance without the hassle of implementing an input and rendering module. It can be used to quickly load meshes and COLLADA physics data, abstracting the renderer and the physics from the implementation. Future work includes the addition of other renderer and physics library implementations, such as Ogre and ODE respectively, that can be used to compare any scene with different combinations of technology. This feature would be useful for developers interested in comparing physics packages.
The implementation revealed that it would not have been possible to implement encapsulation levels from the game developer perspective. Integrating into Bullet allowed more control than the callbacks provided in the API (detecting the exact frame of contact in SCBulPhysics supports this statement).
6.3 Level of Detail
From the report we can draw the following conclusions about level of detail:
• It is an effective technique in improving the performance in rare cases.
• These “rare” cases are becoming more common in game physics.
• Developers require objects that are detailed but that also perform well (narrowing the gap between accuracy and performance). Dynamic level of detail can achieve this.
• There is a lot of research that can be gathered from graphics techniques and applied to
physics.
6.3.1 Encapsulation Levels and Model Switching
• Encapsulation Levels are necessary for stability in model switching.
• They improve the performance when switching to a lower level of detail despite the additional complexity of having to test a less complicated object in addition to the current level of detail. The efficiency of the test will determine the extent to which encapsulation is an effective trade-off for improving level of detail.
• Switching to a higher level of detail requires no intersection testing and very little effort,
but care must be taken to ensure the feasibility of the change (see Figure 3.5).
• In terms of local heuristics, proximity is a good trigger, provided that there is a margin around the largest level of detail that ensures switching can occur even when faced with high velocity objects. The expected requirement is that the margin $m$ should be greater than the distance travelled at the maximum velocity in a single time-step ($m > v_{max} \cdot t_{timestep}$). This condition is required for future implementations.
• Visually the impact of encapsulation levels depends on the step between the levels of detail:
– Using bounding boxes as collision objects is undesirable but the best for performance.
– Frequency of switching affects the users only if they can observe a change in the type
of collision.
– Very low detail models produce collisions that are obviously infeasible. These infeasible collisions are only acceptable under the user perception conditions explored in the
investigation.
– Visually, the difference between having the debug AABBs and collision shapes on and
off is significant because without the debug the users have their own, often inaccurate,
perception of the collisions.
– Reducing the size of the physics representation compared to the mesh improves the
model switching at the expense of visual overlapping of meshes.
• There is a trade-off between “user perception” and “performance” when selecting the step
size.
• The design of “encapsulation levels” is an attempt to improve performance before the point
of contact. The results show an improvement, but also that the technique may be suitable
for other more common scenarios like those found in physics-based games such as “Cell
Factor” where groups of objects are moved together. The response of the system in the
test scenario was to improve the performance as objects were given space to expand.
• Relaxing encapsulation levels is detrimental. Without encapsulation levels we can introduce unnecessary energy into the system. Small amounts of overlap can be tolerated, provided that stability is not a requirement.
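The margin requirement from the list above, that the margin must exceed the distance travelled at the maximum velocity in one time-step, is simple to check when a scene is set up. The sketch below is illustrative only; both function names are hypothetical.

```cpp
#include <cassert>

// Distance covered in one time-step by the fastest object in the scene.
inline double distancePerStep(double maxSpeed, double timeStep) {
    return maxSpeed * timeStep;
}

// True when the margin m satisfies m > v_max * t_timestep.
inline bool marginIsSufficient(double margin, double maxSpeed, double timeStep) {
    return margin > distancePerStep(maxSpeed, timeStep);
}
```

With the fixed time-step of 1/60 seconds used in the tests, an object moving at 30 units per second covers 0.5 units per step, so a margin of 0.6 units passes the check and a margin of 0.4 units does not. If two objects approach each other head on, their closing speed can be up to twice the maximum velocity, so a more conservative margin may be worth considering.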
6.4 Implementation
This project used Scatter to provide extra functionality to Bullet with the aim of modifying the implementation of Bullet as little as possible. Minimising the impact proved to be a huge challenge, as the physics technique would have been more suited to full Bullet integration. Full integration would have required the acceptance of the key classes, like primitive shape objects, and would have required a specific implementation of each of the many algorithms used. This would have made Bullet dependent on the encapsulation technique, which is undesirable for implementing prototype physics concepts. My approach was to create instances of the key classes, such as btHybridDynamicsWorld, which provided the best compromise and only required the modification of two Bullet classes: a single function added to btRigidBody and an extra field in btManifoldPoint.
Integration of Irrlicht was a key feature of Scatter, but development revealed the difficulty of joining two independently written systems. Although closely linked, physics and rendering fight over control of the simulated game world. Each engine wants to be in charge of the coordinate system, and this results in many transformations to convert between the two. Scatter is a physics based implementation, therefore rendering is updated from the physics. Table 6.1 shows a list of the equivalent primitive types in both Bullet and Irrlicht and how they compare. Conversions of orientation posed the biggest problem in Scatter, but in principle, this is possible in any pairing of middle-ware technologies. The overhead of conversion could outweigh the benefits, which may lead to developers writing their own implementation. It is unclear whether some projects are written around the tools they use or whether they are written for the tools. A closely linked renderer and physics simulator would be most beneficial. Work on multi-threaded game loops will reveal more definite answers.
Bullet Type  | Irrlicht Type          | Available Conversion(s)
-------------|------------------------|---------------------------------------------------------
btScalar     | f32, f64, i32, u32 etc | float to double, float to int, float to unsigned int etc
std::wstring | std::string            | std::wstring to std::string and w_char* to char*
btPoint3     | vector3df              | get and set methods
btVector3    | vector3df              | get and set methods
btQuaternion | vector3df              | Quaternion to Euler
btVector3    | SColor                 | Vector r,g,b to SColor a,r,g,b
Table 6.1: Conversion between Bullet and Irrlicht primitive types.
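To illustrate the two less obvious rows of Table 6.1, the following is a minimal sketch of a quaternion to Euler conversion and of an r,g,b vector to a,r,g,b colour conversion. It is not the code used in Scatter. The types Quat, EulerDeg and ColorARGB are hypothetical stand-ins for the library classes, and the rotation order (X applied first, then Y, then Z), the output in degrees and the assumption that colour components lie in the range 0 to 1 are all illustrative choices that would have to be checked against the conventions of the engines being paired.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-ins for the library types (not Bullet or Irrlicht classes).
struct Quat      { double x, y, z, w; };       // unit quaternion
struct EulerDeg  { double x, y, z; };          // rotations about X, Y, Z in degrees
struct ColorARGB { std::uint8_t a, r, g, b; }; // 8 bits per channel

const double kRadToDeg = 180.0 / 3.14159265358979323846;

// Convert a unit quaternion to Euler angles so that the rotation equals
// Rz(z) * Ry(y) * Rx(x). The axis order is an assumption of this sketch.
EulerDeg quatToEulerDeg(const Quat& q) {
    const double len2 = q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w;
    assert(std::fabs(len2 - 1.0) < 1e-6 && "expects a normalised quaternion");

    const double rx = std::atan2(2.0 * (q.w * q.x + q.y * q.z),
                                 1.0 - 2.0 * (q.x * q.x + q.y * q.y));
    // Clamp so that rounding error cannot push the asin argument outside [-1, 1].
    const double sinY = std::max(-1.0, std::min(1.0, 2.0 * (q.w * q.y - q.z * q.x)));
    const double ry = std::asin(sinY);
    const double rz = std::atan2(2.0 * (q.w * q.z + q.x * q.y),
                                 1.0 - 2.0 * (q.y * q.y + q.z * q.z));
    return { rx * kRadToDeg, ry * kRadToDeg, rz * kRadToDeg };
}

// Map one colour channel from [0, 1] to [0, 255], clamping out-of-range input.
std::uint8_t toByte(double c) {
    const double clamped = std::max(0.0, std::min(1.0, c));
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0));
}

// Build an opaque a,r,g,b colour from r,g,b components assumed to lie in [0, 1].
ColorARGB rgbToArgb(double r, double g, double b) {
    return { 255, toByte(r), toByte(g), toByte(b) };
}
```

A quarter turn about the Z axis, Quat{0, 0, sqrt(0.5), sqrt(0.5)}, converts to Euler angles of approximately (0, 0, 90). Near a Y rotation of plus or minus 90 degrees the X and Z angles are no longer uniquely determined, which is one reason why orientation is harder to convert than the other types in the table.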
6.5 Future Work
Future work will continue to address the improvements that level of detail can make to performance
and the ability to have dynamic object accuracy. Effort is still required to improve the efficiency
of level of detail. The work on parallelisation of calculations is ongoing, but can also be applied
to collision overload. The following is a list of future work identified in this report:
• Furthering the field of level of detail by investigating “Adaptive level of detail
models based on algorithmic generation”. The research would be applicable to model
switching and encapsulation levels in both pre-computation and real-time. The latter of
the two is considerably more complicated and would require an efficient implementation. The starting
point for this research would be an investigation into adaptive meshes in graphics
with the purpose of exploring “adaptive convex hulls for level of detail”.
• Application of “Encapsulation Levels” to deformable bodies. Encapsulation levels
have the potential to work with deformable bodies, but would require investigation into the
speed of regenerating the levels after deformation.
• Investigation into the performance of step-size between models. The trade-off
between performance improvement and user perception requires some quantitative analysis,
mainly because the area of user perception is very qualitative. The results of this work
could be a good indicator of “how much” a certain change could improve performance.
• Investigation into the heuristics that dictate requests to switch. This report
defined the areas of global, group and local for requesting model switching. The implementation focused on local requests but could easily be extended to work with global requests. The
precursor to performing this adaptation is analysing the heuristics. Proximity is a good
indicator but would require higher level information to be efficient.
• Implementing parallelisation techniques in Scatter to analyse the improvements. This report has given an example of where in Bullet parallelisation could occur.
Work by Kokkevis et al. with parallel physics on the CELL is a good starting point for
further work [27]. Kokkevis notes that parallelisation could hinder as well as help. Using
Scatter as a test-bed this could be implemented and tested.
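As an illustration of what a proximity heuristic for local switch requests might look like, the following is a minimal sketch and not the Scatter implementation. The Sphere type, the functions surfaceGap, countNearby and requestsSwitch, and the parameters maxGap and minNeighbours are all hypothetical. The sketch assumes each object is wrapped in a bounding sphere and raises a request when enough other objects lie within a chosen gap; what the request then does (which model to switch to) is left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical bounding sphere around one simulated object.
struct Sphere { double x, y, z, radius; };

// Distance between the two sphere surfaces; negative when they overlap.
double surfaceGap(const Sphere& a, const Sphere& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz) - (a.radius + b.radius);
}

// Count how many other spheres lie within maxGap of spheres[self].
std::size_t countNearby(const std::vector<Sphere>& spheres,
                        std::size_t self, double maxGap) {
    assert(self < spheres.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < spheres.size(); ++i) {
        if (i != self && surfaceGap(spheres[self], spheres[i]) <= maxGap) {
            ++count;
        }
    }
    return count;
}

// Local heuristic: request a model switch for one object when at least
// minNeighbours other objects are within maxGap of it.
bool requestsSwitch(const std::vector<Sphere>& spheres, std::size_t self,
                    double maxGap, std::size_t minNeighbours) {
    return countNearby(spheres, self, maxGap) >= minNeighbours;
}
```

The scan above visits every other object for each query, so evaluating it for all objects is quadratic in the number of objects. This is one concrete reading of why proximity alone would need higher level information, for example the overlapping pairs that a broadphase already produces, to be efficient.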
6.6 Discussion
This research demonstrates that dynamic level of detail will become a much more important area
of study in the future of physics simulations. With increasing feasibility of what can be calculated
in real-time we will always aim to push the boundaries. I envisage games and simulations where
developers will attempt to allow users to demolish entire buildings or even cities, but still be
able to chip corners off the individual bricks or cause the intricate cracking of glass in windows.
Scale appears less of an obstacle when “level of detail” is involved, but the idea of being able to
perform the complexity of calculations without it is just as exciting. With virtual environments
as seemingly accurate as the real world, what limits could there be except the laws of physics
themselves!
Bibliography
[1] AGEIA. Advanced gaming physics. White Paper, 2006.
[2] Robert Bridson, Ronald Fedkiw, and John Anderson. Robust treatment of collisions, contact and
friction for cloth animation. In SIGGRAPH 2002, volume 21, pages 594–603. ACM Press /
ACM SIGGRAPH, 2002.
[3] Zhaosheng Bao, Jeong-Mo Hong, J. Teran, and R. Fedkiw. Fracturing rigid materials. In
IEEE Transactions on Visualization and Computer Graphics, volume 13, pages 370–378,
2007.
[4] David Baraff. Analytical methods for dynamic simulation of non-penetrating rigid bodies.
In SIGGRAPH 89, volume 23 of Computer Graphics. Cornell University, Ithaca, NY 14853,
ACM Press, July 1989.
[5] David Baraff. Fast contact force computation for nonpenetrating rigid bodies. In SIGGRAPH, pages 23–34. Carnegie Mellon University, ACM Press, 1994.
[6] Gino van den Bergen. Efficient collision detection of complex deformable models using
AABB trees. Journal of Graphics Tools, 2(4):1–13, April 1998.
[7] Gino van den Bergen. A fast and robust GJK implementation for collision detection of convex
objects. Journal of Graphics Tools, 4, 1999.
[8] David Blythe. The Direct3D 10 system. Technical report, Microsoft Corporation, 2006.
[9] Ian Buck. Taking the Plunge into GPU Computing, chapter 32, pages 509–512. GPU Gems
2. Addison-Wesley, 1st edition, 2005.
[10] Yan Zhuang and John Canny. Real-time simulation of physically realistic global deformation.
Technical report, University of California, Berkeley, California, USA, 1999.
[11] Erin Catto. Iterative dynamics with temporal coherence. Game Developers Conference, 2005.
[12] Valve Developer Community. Information on prop data, 2006. Available from:
http://developer.valvesoftware.com/wiki/Prop_Data [cited 1st October 2006].
[13] Jong-Shi Pang, Richard E. Stone, and Richard W. Cottle. The Linear Complementarity Problem.
Academic Press, San Diego, California, USA, 1997.
[14] Erwin Coumans. Bullet collision detection and physics sdk. 2006. Available from:
http://www.continuousphysics.com/Bullet/BulletFull/main.html.
[15] Erwin Coumans. Bullet collision detection faq. 2006. Available from:
http://www.continuousphysics.com/mediawiki-1.5.8/index.php.
[16] Erwin Coumans. Physics simulation forum, 2007. Available from:
http://www.continuousphysics.com/Bullet/phpBB2/index.php.
[17] David H. Eberly. Game Physics. Interactive 3D Technology. Morgan Kaufmann, 500 Sansome Street, Suite 400, San Francisco, CA 94111, 2003.
[18] Kenny Erleben. Stable, Robust And Versatile Multibody Dynamics Animation. PhD thesis,
University of Copenhagen, Copenhagen, Denmark, 2004.
[19] Eduardo Tejada and Thomas Ertl. Large steps in GPU-based deformable bodies simulation.
Simulation Practice and Theory, 13(9):703–715, 2005.
[20] Emmett Kilgariff and Randima Fernando. The GeForce 6 Series GPU Architecture, chapter 30,
pages 471–491. GPU Gems 2. Addison-Wesley, 1st edition, 2005.
[21] Matt Pharr and Randima Fernando. GPU Gems 2: Programming Techniques for High-Performance
Graphics and General-Purpose Computation. GPU Gems. Addison-Wesley
Professional, 2005.
[22] David Luebke, Martin Reddy, Jonathan D. Cohen, Amitabh Varshney, Benjamin Watson, and Robert Huebner. Level of Detail for 3D Graphics. Morgan Kaufmann, 2003.
[23] David Baraff, Andrew Witkin, and Michael Kass. Physically based modelling. SIGGRAPH
Course, 2001.
[24] E. G. Gilbert, D. W. Johnson, and S. S. Keerthi. A fast procedure for computing the distance
between complex objects in three-dimensional space. IEEE Journal of Robotics and Automation, 4(2):192–203, 1988.
[25] David Knott. Cinder, collision and interference detection in real time using graphics hardware. Master’s thesis, University of British Columbia, 2003.
[26] Evangelos Kokkevis. Practical physics for articulated characters. Game Developers Conference, 2004.
[27] Vangelis Kokkevis, Steven Osman, and Eric Larsen. High-performance physics solver design
for next generation consoles. In Game Developers Conference, 2006.
[28] S. Gottschalk, M. C. Lin, and D. Manocha. OBBTree: A hierarchical structure for rapid interference detection. In SIGGRAPH, pages 171–180. ACM SIGGRAPH, 1996.
[29] Don Woligroski and Aaron McKenna. The best gaming video cards for the money: January
2007. Tom’s Hardware, January 2007. Available from: http://tomshardware.co.uk/.
[30] Brian Mirtich. Impulse-based Dynamic Simulation of Rigid Body Systems. PhD thesis,
University of California, Berkeley, California, USA, 1996.
[31] Adam Morvanszky and Pierre Terdiman. Games Programming Gem 4: Fast Contact Reduction for Dynamics Simulation, chapter 3, pages 253–263. Number 5. Charles River Media,
2004.
[32] Intel Software Network. Open source game development, 2007. Available from:
http://www.intel.com/cd/ids/developer/asmo-na/eng/254761.htm?page=1 [cited May 2007].
[33] Carol O’Sullivan and John Dingliana. Collisions and perception. Technical report, Image
Synthesis Group, Trinity College Dublin, 2001.
[34] Darren E. Polkowski. Geforce 8800: Here comes the dx10 boom. Tom’s Hardware, November
2006. Available from: http://tomshardware.co.uk/.
[35] X. Provot. Collision and self-collision handling in cloth model dedicated to design garment.
Graphics Interface, pages 177–89, 1997.
[36] Martin Reddy. Perceptually Modulated Level of Detail for Virtual Environments. PhD thesis,
University of Edinburgh, Edinburgh, Scotland, 1997.
[37] Martin Reddy. Visual perception and lod. Presentation, 2002.
[38] Alec R. Rivers and Doug L. James. Fastlsm: Fast lattice shape matching for robust real-time
deformation. Due for proceedings of SIGGRAPH 2007, 2007.
[39] M. Muller, L. McMillan, J. Dorsey, and R. Jagnow. Real-time simulation of deformation and fracture of stiff materials. In Eurographics Workshop on Computer Animation and Simulation,
pages 113–124, Manchester, UK, September 2001. Springer-Verlag New York, Inc.
[40] Axel Seugling and Martin Rolin. Evaluation of physics engines and implementation of a
physics module in a 3d-authoring tool. Master’s thesis, UMEA University, March 2006.
[41] H.W. Six and D. Wood. Counting and reporting intersections of d-ranges. In IEEE Transactions on Computers, volume 3, pages C–31:181–187, 1982.
[42] X. Tu and D. Terzopoulos. Artificial fishes: Physics, locomotion, perception, behavior. In
SIGGRAPH 94, pages 43–50, July 1994.
[43] James Tulip, James Bekkema, and Keith Nesbitt. Multi-threaded game engine design.
Technical report, Charles Sturt University, 2005.
[44] Luis Valente, Aura Conci, and Bruno Feijo. Real time game loop models for single-player
computer games. Oct 2005.
[45] Gino van den Bergen. Collision Detection in Interactive 3D Environments. Interactive 3D
Technology. Morgan Kaufmann, San Francisco, CA 94111, 2003.
[46] Gino van den Bergen. Collision Detection in Interactive 3D Environments, chapter 2, pages
56–57. Interactive 3D Technology. Morgan Kaufmann, San Francisco, CA 94111, 2003.