Making Games Sound as Good as They Look: Real-time Geometric Acoustics on the GPU
Sound in Modern Games
• Commercial middleware (FMOD, Wwise)
• Hardware or software rendering (DirectSound 3D,
OpenAL, EAX)
• Some additional extensions improve immersion, e.g.
calculating occlusion per source and filtering the source if it is occluded
• Existing systems are parametric rather than simulation-based
Parametric vs. Simulation
• Parametric model: parameters are
derived from game state (reverb
wet/dry, room size, source
distance, direction, etc.)
• Parameters then map to DSP
variables which interact with
raw source (recorded samples)
• Game state never directly interacts
with signal processing
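A minimal sketch of the parametric approach described above. The `dsp_params_from_state` helper, the `DspParams` fields, and every mapping constant are hypothetical illustrations, not taken from FMOD, Wwise, or OpenAL:

```python
from dataclasses import dataclass


@dataclass
class DspParams:
    gain: float        # linear gain applied to the recorded sample
    lowpass_hz: float  # cutoff of an occlusion low-pass filter
    reverb_wet: float  # wet/dry mix of the reverb send, 0..1


def dsp_params_from_state(distance_m, occluded, room_size_m):
    """Map game state to DSP parameters (hypothetical mapping, made-up constants)."""
    # Inverse-distance attenuation, clamped so that sources closer
    # than 1 m do not exceed unity gain.
    gain = 1.0 / max(distance_m, 1.0)
    # An occluded source is muffled by lowering the filter cutoff.
    lowpass_hz = 800.0 if occluded else 20000.0
    # Larger rooms get a wetter reverb mix, capped at 0.8.
    reverb_wet = min(0.8, room_size_m / 50.0)
    return DspParams(gain, lowpass_hz, reverb_wet)
```

The game state only selects numbers here; the level geometry never touches the signal path, which is the defining property of the parametric model.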
Parametric vs. Simulation
• Simulation model: game state
directly generates audio
• Advantages: audio experience
is more directly influenced by
game world, can be more
immersive if done right
• Disadvantages: can easily
sound “broken” if the model is used
outside its limitations, and this is not
simple to fix by ‘fudging’ parameters
S.T.A.L.K.E.R.™ : Call of Pripyat
• Basic positional audio (rendered through OpenAL)
• Almost no environment dependence
• Limited immersion, high performance
• Typical for games where focus is on graphics
ARMA 2: Operation Arrowhead
• 2010 (PC only), continuously updated (version from late 2012)
• Simulation focus, HRTF and “software” occlusion
• Rendering through OpenAL (although OpenAL cannot
process geometry)
• Focuses on sound as a gameplay mechanic
Quake 3 Implementation
Q3dm1
• Acoustic version shown
(3186 triangles)
• 4096 rays to generate
reflections
• 3 orders of reflection (specular and diffuse); HRTF
applied per reflection (12k)
• IIR based material
descriptions
Example: Used in Game Engine
(Quake 3 Arena)
For video: https://www.youtube.com/watch?v=TXUTgEmnD6U (please use headphones if possible!!)
Geometry Engine
“Backwards” Ray-tracing
• Start at listener
• Use specular and diffuse
reflection approximation
at each bounce
• Generate impulse
response
• 1 ray per thread
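The steps above can be sketched on the CPU as follows. This is an illustration under loud assumptions, not the talk's GPU kernel: the room is an axis-aligned box instead of a triangle mesh, only specular reflection is modelled (the diffuse approximation is omitted), the source is a sphere of hypothetical radius, and the `energy / distance` gain model is invented for the example. Each call to `trace_ray` corresponds to the work one GPU thread would do.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed


def trace_ray(origin, direction, room, source, source_radius, max_order, absorption):
    """Follow one ray from the listener through the box [0, room[i]] and return
    a (delay_s, gain) tap for every segment that passes through the source."""
    taps = []
    pos, d = list(origin), list(direction)  # direction must be a unit vector
    travelled, energy = 0.0, 1.0
    for _ in range(max_order + 1):  # segment 0 is the direct path
        # Distance to the nearest wall along the ray.
        t_wall, axis = math.inf, -1
        for i in range(3):
            if d[i] > 1e-12:
                t = (room[i] - pos[i]) / d[i]
            elif d[i] < -1e-12:
                t = -pos[i] / d[i]
            else:
                continue
            if t < t_wall:
                t_wall, axis = t, i
        # Closest approach of this segment to the source centre.
        to_src = [source[i] - pos[i] for i in range(3)]
        t_near = sum(to_src[i] * d[i] for i in range(3))
        if 0.0 <= t_near <= t_wall:
            miss_sq = sum(x * x for x in to_src) - t_near * t_near
            if miss_sq <= source_radius * source_radius:
                dist = travelled + t_near
                taps.append((dist / SPEED_OF_SOUND, energy / max(dist, 1.0)))
        # Advance to the wall, reflect specularly, apply the wall loss.
        pos = [pos[i] + d[i] * t_wall for i in range(3)]
        d[axis] = -d[axis]
        travelled += t_wall
        energy *= 1.0 - absorption
    return taps


def random_unit_vector(rng):
    """Uniform direction on the sphere by rejection sampling."""
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(x * x for x in v))
        if 1e-6 < n <= 1.0:
            return [x / n for x in v]


def impulse_response(listener, source, room, n_rays=4096, max_order=3,
                     sample_rate=44100, length=8192, seed=0):
    """Accumulate the taps of all rays into one sampled impulse response."""
    rng = random.Random(seed)
    ir = [0.0] * length
    for _ in range(n_rays):
        direction = random_unit_vector(rng)
        for delay, gain in trace_ray(listener, direction, room, source,
                                     0.5, max_order, 0.3):
            n = int(round(delay * sample_rate))
            if n < length:
                ir[n] += gain / n_rays
    return ir
```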
Geometry Engine
• Finding exact paths vs. sampling sound field
• Impulse response acts as filter
• Each ray has different frequency characteristics due to
HRTF (different incident direction)
• Unique among real-time systems
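To illustrate what "impulse response acts as filter" means, here is a minimal direct-form convolution of a dry signal with an impulse response. It is a plain-Python sketch for clarity; the function name is hypothetical and a real-time system would not convolve this way:

```python
def convolve(dry, impulse_response):
    """Filter a dry signal by convolving it with an impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for n, x in enumerate(dry):
        if x == 0.0:
            continue  # silent input samples contribute nothing
        for k, h in enumerate(impulse_response):
            # Every input sample launches a scaled copy of the response.
            out[n + k] += x * h
    return out
```

For example, `convolve([1.0, 0.5], [1.0, 0.0, 0.25])` returns `[1.0, 0.5, 0.25, 0.125]`: the original signal followed by an echo two samples later at a quarter of the amplitude.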
Example: Corner Bass
Reinforcement
• In the situation on the right, both listeners
are facing the source
• Listener 1 is near the center of the
room; its frequency response is
fairly flat
• Listener 2 is in the corner; its frequency
response has the low frequencies
boosted
• Why is this?
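One standard way to see the effect is an image-source argument: close to rigid walls, the reflected paths are almost as long as the direct path, so at low frequencies they arrive nearly in phase and add up. The sketch below uses that textbook model in 2-D with two perfectly rigid walls, which is an assumption made for illustration and not the talk's ray tracer; the function name and positions are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def corner_boost_db(listener, source, freq_hz):
    """Level at the listener relative to the direct sound alone, for two rigid
    walls along x = 0 and y = 0 (2-D image-source model)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    sx, sy = source
    # The real source plus its three mirror images in the two walls.
    images = [(sx, sy), (-sx, sy), (sx, -sy), (-sx, -sy)]

    def pressure(src):
        r = math.dist(listener, src)
        return cmath.exp(-1j * k * r) / r  # spherical spreading and phase

    direct = pressure(images[0])
    total = sum(pressure(s) for s in images)
    return 20.0 * math.log10(abs(total) / abs(direct))
```

With the listener 10 cm from both walls and the source at (4, 3) m, the four contributions at 50 Hz are nearly in phase and the boost approaches the 12 dB limit of four equal coherent arrivals. At higher frequencies the same path differences become comparable to the wavelength, so the contributions no longer add coherently.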
Performance
• Computationally expensive (cost scales with
Orders × Sources × Triangles × Rays)
• Triangle count can be fairly low (use the collision mesh
instead of the displayed mesh)
• A single-level BVH (bounding volume “hierarchy”) is
sufficient
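As a rough worked example of the scaling above, the brute-force number of ray-triangle tests can be computed directly. This is an upper bound that ignores any culling by the bounding volumes; the function name is hypothetical and the figures are the q3dm1 numbers quoted earlier in the talk:

```python
def intersection_tests(orders, sources, triangles, rays):
    """Brute-force ray-triangle test count per update (no BVH culling)."""
    return orders * sources * triangles * rays


# q3dm1 figures from the talk: 3 orders, 3186 triangles, 4096 rays, one source.
tests_per_update = intersection_tests(3, 1, 3186, 4096)
```

For a single source this comes to 39,149,568 tests per update, and the count grows linearly with every additional source.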
Optimizations (Geometry)
• We would rather not tolerate cost that scales linearly with the
number of sources (reasoning: everything else can be mitigated
in design by reducing quality)
• Solution: Dynamically allocate rays to sources
• Psychoacoustic justification: more chaotic sound scene =
less ability to discern individual sounds
Dynamic Ray Allocation
• Dynamic ray allocation raises two problems:
• First problem: amplitude
changes as rays are reassigned
• Second problem: slower
ray-tracing (warp
divergence between
bounding volumes)
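A minimal sketch of sharing a fixed ray budget between sources. The weighting (for example by loudness), the `min_rays` floor, and the `1 / count` per-ray gain are assumptions for illustration; the per-ray gain is one plausible way to keep a source's level steady when its ray count changes, not necessarily the approach used in the talk.

```python
def allocate_rays(weights, total_rays, min_rays=1):
    """Split a fixed ray budget between sources in proportion to their weights."""
    n = len(weights)
    if n * min_rays > total_rays:
        raise ValueError("ray budget too small for the number of sources")
    spare = total_rays - n * min_rays
    total_weight = sum(weights)
    if total_weight <= 0.0:
        shares = [spare / n] * n
    else:
        shares = [spare * w / total_weight for w in weights]
    counts = [min_rays + int(s) for s in shares]
    # Hand out rays lost to rounding, largest fractional share first.
    leftover = total_rays - sum(counts)
    order = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def per_ray_gain(counts):
    """Scale each ray so a source's summed amplitude does not depend on its ray count."""
    return [1.0 / c for c in counts]
```

With weights `[3.0, 1.0]`, a budget of 4096 rays, and a floor of 64 rays per source, the louder source receives 3040 rays and the quieter one 1056.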
Ray Sorting
• Re-sort rays by assigned source in each block (inspired
by Garanzha & Loop, 2010)
• Advantage: Gain spatial coherence when performing
occlusion testing
• Disadvantage: Ray source positions and directions are
less coherent (remember each source still needs
omnidirectional sampling of rays)
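The re-sorting step can be sketched as follows. This CPU version only shows the resulting order; on a GPU it would be a key-value sort inside each thread block, and sorting purely by source index is a simplification of the ray-sorting scheme cited above. The function name is hypothetical.

```python
def sort_rays_in_blocks(ray_sources, block_size):
    """Return ray indices reordered so that, within each block, rays assigned
    to the same source sit next to each other."""
    order = []
    for start in range(0, len(ray_sources), block_size):
        block = range(start, min(start + block_size, len(ray_sources)))
        # sorted() is stable, so rays of one source keep their relative order.
        order.extend(sorted(block, key=lambda i: ray_sources[i]))
    return order
```

For `ray_sources = [2, 0, 1, 0, 1, 1, 0, 2]` and a block size of 4, the result is `[1, 3, 2, 0, 6, 4, 5, 7]`: rays never leave their block, but neighbouring threads now test occlusion against the same source.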
GPU Audio Processing
• Each ray generates audio stream for
mix-down (4096 in example)
• Each stream is delayed and filtered according to the
materials it reflected off and its HRTF direction
• Parallel mix-down (optimal
performance depends on architecture)
Behavior at Reflection
• Typical commercial software uses
per-frequency band simulation
• The reflection coefficient
(1.0 − absorption) is a scalar for each
band (may use separate sets of
coefficients for diffuse reflections)
• Approximate the multi-band
simulation by representing
reflection behavior as a biquad
filter (at the cost of some degrees of
freedom)
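A minimal sketch of a material modelled as a single biquad. The talk does not say which filter shapes it uses; the low-pass design below follows the widely used Audio EQ Cookbook formulas and is an assumption, as are the names `Biquad` and `lowpass_material` and the idea of folding a broadband reflection gain into the numerator.

```python
import math


class Biquad:
    """Direct form I biquad: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2]
    - a1 y[n-1] - a2 y[n-2], with coefficients normalised so a0 = 1."""

    def __init__(self, b0, b1, b2, a1, a2):
        self.b0, self.b1, self.b2, self.a1, self.a2 = b0, b1, b2, a1, a2
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, x):
        y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
             - self.a1 * self.y1 - self.a2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def lowpass_material(cutoff_hz, sample_rate, reflection_gain, q=0.7071):
    """Hypothetical material: reflects low frequencies with `reflection_gain`
    and absorbs progressively more above `cutoff_hz`."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = reflection_gain * (1.0 - cos_w0) / 2.0 / a0
    b1 = reflection_gain * (1.0 - cos_w0) / a0
    return Biquad(b0, b1, b0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
```

One filter of this kind per bounce replaces a table of per-band scalars with five coefficients, which is where the loss of degrees of freedom comes from.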
GPU Mix-down
• Atomic operations on
Kepler GPUs (easy)
• Round-robin method on
pre-Kepler GPUs (similar
to parallel reduction)
• Only part of the work is
done on GPU (CPU
needed to mix down
shared memory regions)
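The parallel-reduction idea behind the mix-down can be sketched on the CPU as below. It shows only the pairwise summing pattern, not the talk's round-robin scheme or its use of shared memory; on a GPU each pair in a pass would be summed by a different thread. The function name is hypothetical.

```python
def mix_down(streams):
    """Sum equally long sample buffers with a pairwise (tree) reduction."""
    if not streams:
        raise ValueError("need at least one stream")
    buffers = [list(s) for s in streams]
    while len(buffers) > 1:
        half = (len(buffers) + 1) // 2
        # Each pass folds the upper half of the buffers onto the lower half.
        for i in range(len(buffers) - half):
            buffers[i] = [x + y for x, y in zip(buffers[i], buffers[i + half])]
        buffers = buffers[:half]
    return buffers[0]
```

For 4096 ray streams this takes 12 passes instead of 4095 sequential additions per sample.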
HRTF and Diffraction
• At low frequencies (below 1 kHz) the HRTF is essentially flat
• These are the frequencies which diffract the most
around architecture (wavelength at 250 Hz ~ 1.3 m)
• Precise positioning of diffracted sources is not very
important
• HRTF IIR filter coefficients derived from FIR experiments
(can use Prony’s method, but easier just to match
approximate amplitude and cutoff)
Integrating Artificial Reverb
• Simulation is useful for the first 3 or 4 orders (for reference,
on a GTX Titan using the q3dm1 map, 3 orders of simulation
take ~30% of the real-time budget)
• 3 orders generate about 500 ms of reverb time, but in
practice, T60 for a comparable room can be several
seconds (proportional to volume and inversely
proportional to absorption)
• Additional problem in that simulated reverberation gets
grainier as reverb time gets longer
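The proportionality mentioned above is Sabine's formula, T60 ≈ 0.161 · V / A, with V the room volume in m³ and A the total absorption in m² (surface area times absorption coefficient, summed over surfaces). A small sketch, with a hypothetical function name and example room:

```python
def sabine_t60(volume_m3, surfaces):
    """Sabine reverberation time in seconds.

    `surfaces` is a list of (area_m2, absorption_coefficient) pairs."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption
```

A 10 m × 8 m × 3 m room (240 m³, 268 m² of surface) with a uniform absorption coefficient of 0.1 gives a T60 of about 1.44 s, well beyond the roughly 500 ms the simulated orders provide.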
Integrating Artificial Reverb
• Can estimate artificial reverb
length with mean free path
• Intuitive to apply reverb at end
of DSP chain, but problematic in
some situations
• Resolve by applying reverb at
front of DSP chain (before HRTF
filtering)
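A sketch of estimating reverb length from the mean free path. It assumes the classical diffuse-field result that the mean free path is 4V/S, and that each reflection removes a fraction equal to the mean absorption coefficient (the Eyring model); whether the talk uses exactly this estimate is not stated, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def mean_free_path(volume_m3, surface_m2):
    """Average distance a ray travels between reflections in a diffuse field."""
    return 4.0 * volume_m3 / surface_m2


def reverb_time_from_mfp(volume_m3, surface_m2, mean_absorption):
    """Time for the energy to fall by 60 dB when every reflection keeps
    (1 - mean_absorption) of it and reflections are one mean free path apart."""
    reflections_needed = math.log(1e-6) / math.log(1.0 - mean_absorption)
    return reflections_needed * mean_free_path(volume_m3, surface_m2) / SPEED_OF_SOUND
```

For the 240 m³, 268 m² example room with a mean absorption of 0.1, the mean free path is about 3.6 m and the estimated decay time is about 1.37 s, which could then be used to set the length of the artificial reverb tail.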
Design Considerations…
• With a parametric system, it is easy to have creative input
• Make the sound more ‘exciting’ or more ‘chaotic’
• We don’t want the sound designer to become a level
designer
• Simulation may only be suitable for certain types of
games (realistic, rather than cinematic)
Thanks for Listening!
• For more information, look for the dissertation entitled:
Design of a Real-Time GPU Accelerated Acoustic
Simulation Engine for Interactive Applications (University
of Illinois Press, 2014)