Online Submission ID: 0
SynCoPation: Synthesis Coupled Sound Propagation
Figure 1: Our coupled sound synthesis-propagation technique has been integrated into the Unity™ game engine. We demonstrate the sound effects generated by our system on a variety of scenarios: (a) Cathedral, (b) Tuscany, and (c) Game scene. In the leftmost scene, the bowl sounds are synthesized and propagated in the cathedral; in the middle scene, the bell sounds are synthesized and propagated in an outdoor scene; in the last scene, the sounds of a barrel hitting the ground are synthesized and propagated.
Abstract
Sounds can augment the sense of presence and immersion of users
and improve their experience in virtual environments. Recent research in sound simulation and rendering has focused either on
sound synthesis or on sound propagation, and many standalone algorithms have been developed for each domain. We present a novel
technique for automatically generating aural content for virtual environments based on an integrated scheme that can perform sound
synthesis as well as sound propagation. Our coupled approach can
generate sounds from rigid bodies based on the audio modes and radiation coefficients, and interactively propagate them through the
environment to generate acoustic effects. Our integrated system
allows a high degree of dynamism: it can support dynamic sources, dynamic listeners, and dynamic directivity simultaneously. Furthermore, our approach can be combined with wave-based and geometric sound propagation algorithms to compute environmental effects.
We have integrated our system with the Unity game engine and
show the effectiveness of fully-automatic audio content creation in
complex indoor and outdoor scenes.
1 Introduction
Sound simulation algorithms predict the behavior of sound waves,
including generation of sound waves from vibrations and propagation of sound waves in the environment. Realistic sound simulation is important in computer games to increase the level of immersion and realism. Sound augments the visual sense of the player,
provides spatial cues about the environment, and can improve the
overall gaming experience. At a broad level, prior research in sound
simulation can be classified into two parts: synthesis and propagation. The problem of sound synthesis deals with simulating the physical processes (e.g. the vibration of a sound source) involved in the generation of sound. Sound propagation, on the other hand, deals
with the behavior of sound waves as they are emitted by the source,
interact with the environment, and reach the listener.
State-of-the-art techniques for sound simulation deal with the problem of sound synthesis and sound propagation independently.
Sound synthesis techniques model the generation of sound resulting from vibration analysis of the structure of the object [Zheng and
James 2009; Chadwick et al. 2009; Moss et al. 2010; Zheng and
James 2010]. However, in these techniques only sound propagation
in free-space (empty space) is modeled, and the acoustic effects generated by the environment are mostly ignored. Sound propagation techniques [Krokstad et al. 1968; Allen and Berkley 1979; Funkhouser et al. 1998; Raghuvanshi et al. 2010; Mehra et al. 2013; Yeh et al. 2013] model the interaction of sound waves in indoor and outdoor spaces, but assume pre-recorded or pre-synthesized audio clips as input. These assumptions can result in missing sound effects and generate inaccurate (or non-plausible) approximations of the underlying physical reality produced by the process of sound simulation. For example, consider the case of a kitchen bowl falling from a countertop: the change in the directivity of the bowl with the hit position, and the effect of this time-varying directivity on the propagated sound field in the kitchen, are mostly ignored by current simulation techniques. Similarly, for a barrel rolling down an alley, the sound consists of multiple frequencies, each with different radiation and propagation characteristics, which are mostly ignored by current sound simulation systems. Due to these limitations, artists and game audio designers have to manually design sound effects corresponding to these different kinds of scenarios, which can be very tedious and time-consuming.
In this paper, we present the first coupled synthesis and propagation system that models the entire process of sound simulation, starting from the surface vibration of objects, the radiation of sound waves from these surface vibrations, and the interaction of the resulting sound waves with the environment. Our technique models the surface vibration characteristics of an object by performing modal analysis using the finite element method. These surface vibrations are used as boundary conditions to a Helmholtz equation solver (using the boundary element method) to generate outward-radiating sound fields. These radiating sound fields are expressed in a compact basis using the single-point multipole expansion [Ochmann 1999]. Mathematically, this single-point multipole expansion corresponds to a single sound source placed inside the object. The sound propagation due to this source is performed using a numerical sound simulation technique (at low frequencies) or ray tracing (at high frequencies). We also describe techniques to accelerate ray-tracing algorithms based on path clustering and binning. Our approach performs end-to-end sound simulation from first principles and enables automatic sound effect generation for interactive applications, thereby reducing the manual effort and time spent by artists and game-audio designers.
The main contributions of our work on coupled sound synthesis-propagation include:

1. An integrated technique for accurately simulating the effect of time-varying directivity.

2. High accuracy achieved by correct phase computation and per-frequency modeling of sound vibration, radiation, and propagation.

3. An interactive runtime that handles a high degree of dynamism, e.g. dynamic surface vibrations, dynamic sound radiation, and sound propagation for dynamic sources and listeners.

We have integrated our technique with the Unity™ game engine and demonstrate the effect of coupled sound synthesis-propagation on a variety of indoor and outdoor scenarios, as shown in Fig. 1.

2 Related Work

In this section, we review some of the most closely related work on sound synthesis, radiation, and propagation techniques.

2.1 Sound Synthesis
CORDIS-ANIMA was perhaps the first system of damped springs and masses proposed for modeling surface vibrations to synthesize physically based sounds [Florens and Cadoz 1991]. Numerical integration using a finite element formulation was later presented as a more accurate technique for modeling vibrations [Chaigne and Doutaut 1997; O'Brien et al. 2001]. Instead of using numerical integration, [van den Doel and Pai 1996; van den Doel and Pai 1998] proposed computing the analytical vibration modes of an object, leading to considerable speedups and enabling real-time sound synthesis.
[van den Doel et al. 2001] introduced the first method to determine the vibration modes and their dependence on the point of impact for a given shape, based on physical measurements. Later, [O'Brien et al. 2002] presented a general algorithm to determine the vibration modes of arbitrarily shaped 3D objects by discretizing them into tetrahedral volume elements. They showed that the corresponding finite element equations can be solved analytically after suitable approximations. Consequently, they were able to model arbitrarily shaped objects and simulate realistic sounds for a few objects at interactive rates [O'Brien et al. 2002]. This approach requires an expensive pre-computation called modal analysis. [Raghuvanshi and Lin 2006a] used a simpler spring-mass system along with perceptually motivated acceleration techniques to recreate realistic sound effects for hundreds of objects in real time. In this paper, we use an FEM-based method to precompute the modes, similar to [O'Brien et al. 2002].
[Ren et al. 2012] presented an interactive virtual percussion instrument system that used modal synthesis as well as numerical sound propagation for modeling a small instrument cavity. Despite some apparent similarity, this work is quite different from our coupled approach: their combination of synthesis and propagation is not tightly coupled or integrated, and the volume of the underlying acoustic spaces is rather small compared to typical game scenes (e.g. the benchmarks shown in Fig. 1).
2.2 Sound Radiation

The Helmholtz equation is the standard way to model sound radiating from vibrating rigid bodies. The boundary element method (BEM) is widely used for acoustic radiation problems [Ciskowski and Brebbia 1991; von Estorff 2000], but it has a major drawback in terms of memory requirements, i.e. O(N²) memory for N boundary elements. An efficient technique known as the equivalent source method (ESM) [Fairweather 2003; Kropp and Svensson 1995; Ochmann 1999; Pavic 2006] exploits the uniqueness of the solutions to the acoustic boundary value problem. ESM expresses the solution field as a linear combination of simple radiating point sources of various orders (monopoles, dipoles, etc.) by placing these simple sources at variable locations inside the object and matching the total generated field with the boundary conditions on the object's surface, guaranteeing the correctness of the solution. [James et al. 2006] use the equivalent source method to compute the radiated field generated by a vibrating object.

2.3 Sound Propagation

Sound is a pressure wave described using the Helmholtz equation for a domain Ω:

∇²p + (ω²/c²) p = 0,  x ∈ Ω,    (1)

where p(x) is the complex-valued pressure field, ω is the angular frequency, c is the constant speed of sound in a homogeneous medium, and ∇² is the Laplacian operator. Boundary conditions are specified on the boundary of the domain ∂Ω using either the Dirichlet boundary condition, which specifies the pressure on the boundary, p = f(x) on ∂Ω; the Neumann boundary condition, which specifies the velocity of the medium, ∂p(x)/∂n = f(x) on ∂Ω; or a mixed boundary condition, which specifies Z ∈ ℂ so that Z ∂p(x)/∂n = f(x) on ∂Ω. We also need to specify the behavior of p at infinity, which is usually done using the Sommerfeld radiation condition [Pierce et al. 1981]:

lim_{r→∞} [∂p/∂r + i (ω/c) p] = 0,    (2)

where r = ||x|| is the distance of point x from the origin.
Different methods exist to solve this equation. Numerical methods solve for p by discretizing either the entire domain or only its boundary. Geometric techniques model p as a set of rays and propagate these rays through the environment.
2.3.1 Wave-based Sound Propagation
Wave-based or numerical sound propagation techniques solve the acoustic wave equation using numerical wave solvers. These methods capture the exact behavior of a propagating sound wave in a domain. Numerical wave solvers discretize space and time to solve the wave equation. Typical techniques include the finite difference time domain (FDTD) method [Yee 1966; Taflove and Hagness 2005; Sakamoto et al. 2006], the finite element method [Thompson 2006], the boundary element method [Cheng and Cheng 2005], the pseudo-spectral time domain method [Liu 1997], and domain decomposition [Raghuvanshi et al. 2009]. Wave-based methods have high accuracy and can simulate wave effects such as diffraction accurately at low frequencies. However, their memory and compute requirements grow as the third or fourth power of the frequency, making them impractical for interactive applications.
2.3.2 Geometric Sound Propagation
Geometric sound propagation techniques use the simplifying assumption that the wavelength of sound is much smaller than features in the scene. As a result, these methods are most accurate
for high frequencies and must model low-frequency effects like
diffraction and scattering as separate phenomena. Commonly used
techniques are based on image source methods [Allen and Berkley
1979; Borish 1984] and ray tracing [Krokstad et al. 1968; Vorländer
1989]. Recently, there has been a focus on computing realistic
acoustics in real time using algorithms designed for fast simulation.
These include beam tracing [Funkhouser et al. 1998], frustum tracing [Chandak et al. 2008], and ray-based algorithms [Lentz et al.
2007; Taylor et al. 2012] that compute low-order reflections. In addition, frame-to-frame coherence of the sound field can be utilized
to achieve a significant speedup [Schissler et al. 2014].
Edge diffraction effects can be approximated within GA frameworks using methods based on the uniform theory of diffraction (UTD) [Kouyoumjian and Pathak 1974] or the Biot-Tolstoy-Medwin (BTM) model [Svensson et al. 1999]. These approaches have been applied to static scenes and low-order diffraction [Tsingos et al. 2001; Antani et al. 2012a], as well as to dynamic scenes with first-order [Taylor et al. 2012] and higher-order diffraction [Schissler et al. 2014]. Diffuse reflection effects caused by surface scattering have been previously modeled using the acoustic rendering equation [Siltanen et al. 2007; Antani et al. 2012b] and radiosity-based methods [Franzoni et al. 2001]. Another commonly used ray-tracing technique, vector-based scattering, uses scattering coefficients to model diffusion [Christensen and Koutsouris 2013].
3 Our Algorithm
In this section, we give a brief background on various concepts used
in the paper and present our coupled synthesis-propagation algorithm.
3.1 Background
Modal Analysis: Sound is produced by small vibrations of objects. These vibrations, although invisible to the naked eye, are audible if the frequency of vibration lies in the range of human hearing (20 Hz - 20 kHz). Modal analysis is a well-known technique for modeling such sounds in rigid bodies. The small vibrations can be modeled using a coupled linear system of ODEs:
Kd + Cḋ + Md̈ = f,    (3)
where K, C, and M are the stiffness, damping, and mass matrices, respectively, and f represents the (external) force vector. For small damping, it is possible to approximate C as a combination of the mass and stiffness matrices: C = αM + βK. This facilitates the diagonalization of the above equation, which is represented as a generalized eigenvalue problem:
KU = ΛMU,    (4)
where Λ is the diagonal eigenvalue matrix and U contains the eigenvectors of K. Solving this eigenvalue problem enables us to write Eq. 3 as a system of decoupled oscillators:
q̈ + (αI + βΛ)q̇ + Λq = Uᵀf,    (5)

where U projects d into the modal subspace q, with d = Uq.

Acoustic transfer: The pressure p(x) at any point, obtained by solving Eq. (1), is called the acoustic transfer function. The acoustic transfer function gives the relation between the surface normal displacements at a surface node and the sound pressure at a given field point. A common method used in acoustics to evaluate these transfer functions is the boundary element method (BEM) discussed above.

Since we are solving Eq. (1) in the frequency domain, we have to solve the exterior scattering problem for each mode separately. This can be achieved using a fast BEM solver and specifying the Neumann boundary condition:

∂p/∂n = −iωρv  on S,    (6)
where S = ∂Ω (the boundary of the object), ρ is the fluid density, and v is the surface's normal velocity given by v = iω(n · û), where n · û is the modal displacement in the normal direction. This boundary condition links the modal displacements with the pressure at a point. Unfortunately, BEM is not fast enough for an interactive runtime, necessitating the use of fast, approximate acoustic transfer functions [James et al. 2006].
In order to approximate the acoustic transfer, we use a source simulation technique called the equivalent source method. We represent a sound source using a collection of point sources (called equivalent sources) and match the pressure values on the boundary of the object ∂Ω with the pressure on ∂Ω calculated using BEM. The main idea is that if we can fit the strengths of the equivalent sources to the boundary pressure, we can evaluate the pressure at any point in Ω using these equivalent sources.
Equivalent sources: The uniqueness of the acoustic boundary value problem guarantees that the solution of the free-space Helmholtz equation, along with the specified boundary conditions, is unique inside Ω. The unique solution p(x) can be found by expressing it as a linear combination of fundamental solutions. One choice of fundamental solutions is based on equivalent sources. An equivalent source q(x, y_i) of the Helmholtz equation subject to the Sommerfeld radiation condition, x ≠ y_i, is the solution field induced at any point x due to a point source located at y_i, and can be expressed as:
q(x, y_i) = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_ilm ϕ_ilm(x) = Σ_{k=1}^{L²} d_ik ϕ_ik(x),    (7)

where k is a generalized index for (l, m) and c_ilm is its strength. These fundamental solutions (ϕ_ik) are chosen to correspond to the field due to spherical multipole sources of order L (L = 1 being a monopole, L = 2 a dipole, and so on) located at y_i. Spherical multipoles are given as a product of two functions:

ϕ_ilm(x) = Γ_lm h_l⁽²⁾(k_i r_i) ψ_lm(θ_i, φ_i),    (8)
where (r_i, θ_i, φ_i) is the vector (x − y_i) expressed in spherical coordinates, h_l⁽²⁾(k_i r_i) is the spherical Hankel function of the second kind, k_i is the wavenumber given by ω_i/c, ψ_lm(θ_i, φ_i) are the complex-valued spherical harmonic functions, and Γ_lm is the normalizing factor for the spherical harmonics. The pressure at any point in Ω due to M equivalent sources located at {y_i}_{i=1}^{M} can be expressed as a linear combination:

p(y) = Σ_{i=1}^{M} Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_ilm ϕ_ilm(y).    (9)
We have to determine the L² complex coefficients c_ilm for each of the M multipoles. This compact representation of the pressure p(y) makes it possible to evaluate the pressure at any point of the domain in an efficient manner.
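For illustration, a single multipole basis function (Eq. 8) and the pressure sum (Eq. 9) can be evaluated with standard special-function routines. The following minimal C++ sketch assumes Boost.Math for the spherical Hankel and spherical harmonic functions, folds the normalization Γ_lm into the spherical harmonic routine, and stores the coefficients c_ilm of one source in a flat array indexed by k = l² + l + m; it shows the naive evaluation that our optimized runtime replaces.

#include <boost/math/special_functions/hankel.hpp>
#include <boost/math/special_functions/spherical_harmonic.hpp>
#include <complex>
#include <vector>

// ϕ_ilm(x) = h_l^(2)(k r) ψ_lm(θ, φ)   (Eq. 8, with Γ_lm absorbed into ψ_lm)
std::complex<double> multipoleBasis(int l, int m, double k, double r,
                                    double theta, double phi) {
    return boost::math::sph_hankel_2(l, k * r) *
           boost::math::spherical_harmonic(l, m, theta, phi);
}

// p(y) for a single equivalent source of order L (Eq. 9 with M = 1).
std::complex<double> evaluatePressure(const std::vector<std::complex<double>>& coeffs,
                                      int order, double k, double r,
                                      double theta, double phi) {
    std::complex<double> p(0.0, 0.0);
    for (int l = 0; l < order; ++l)
        for (int m = -l; m <= l; ++m)
            p += coeffs[l * l + l + m] * multipoleBasis(l, m, k, r, theta, phi);
    return p;
}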
Figure 2: Overview of our coupled synthesis-propagation pipeline. The bowl is used as an example of a modal object. The first stage comprises the modal analysis; the figures in red show the first two sounding modes of the bowl. We then form an offset surface around the bowl, calculate the pressure on this offset surface, place a single multipole at the center of the object, and approximate the BEM-evaluated pressure. In the runtime part of the pipeline, we use the multipole to couple with a propagation system and generate the final sound at the listener.

3.2 Coupled Algorithm
We now discuss our coupled synthesis-propagation algorithm. As shown in Fig. 2, we start with the modal analysis of the sounding object, which gives the modal displacements, modal frequencies,
and modal amplitudes. We use these mode shapes as a boundary condition to BEM to compute the pressure on an offset surface. Then we place a single equivalent source at the center of the object and approximate the pressure calculated using BEM. This gives us a vector of (complex) coefficients of the multipole strengths. At this stage (the SPME stage in the pipeline), we have computed the representation of an acoustic radiator, which serves as the source for the propagation in the runtime stage of the pipeline using either a geometric or a numerical sound propagator. Our method is agnostic to the type of sound propagator but, owing to the high modal frequencies generated in our benchmarks, we use a geometric sound propagation system to obtain interactive performance. The final stage of the pipeline takes the impulse response for each mode, convolves it with that mode's amplitude, and sums the results to give the final signal. We describe each stage of our pipeline below.
3.2.1 Sound synthesis

Given an object, we solve the displacement equation (Eq. 5) to get a discrete set of mode shapes d̂_i, their modal frequencies ω_i, and the amplitudes q_i(t). The vibration's displacement vector is given by:

d(t) = Uq(t) ≡ [d̂_1, ..., d̂_M] q(t),    (10)

where M is the total number of modes and q(t) ∈ ℝ^M is the vector of modal amplitude coefficients, expressed as a bank of sinusoids:

q_i = a_i e^{−d_i t} sin(2πf_i t + θ_i),    (11)

where f_i is the modal frequency (in Hz), d_i is the damping coefficient, a_i is the amplitude, and θ_i is the initial phase.
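As an illustration of Eq. (11), the modal amplitudes can be sampled as a bank of exponentially damped sinusoids; the following sketch is a direct transcription (the struct and function names are ours, not part of our implementation).

#include <cmath>
#include <vector>

struct Mode { double a, d, f, theta; };  // amplitude, damping, frequency (Hz), initial phase

// Sample q_i(t) = a_i e^{-d_i t} sin(2π f_i t + θ_i)  (Eq. 11) for every mode.
std::vector<std::vector<double>> sampleModalAmplitudes(const std::vector<Mode>& modes,
                                                       double duration, double sampleRate) {
    const double twoPi = 6.283185307179586;
    const int numSamples = static_cast<int>(duration * sampleRate);
    std::vector<std::vector<double>> q(modes.size(), std::vector<double>(numSamples));
    for (size_t i = 0; i < modes.size(); ++i)
        for (int s = 0; s < numSamples; ++s) {
            const double t = s / sampleRate;
            q[i][s] = modes[i].a * std::exp(-modes[i].d * t) *
                      std::sin(twoPi * modes[i].f * t + modes[i].theta);
        }
    return q;
}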
3.2.2 Sound radiation

Once we have the mode shapes and modal frequencies of an object, we compute the approximate acoustic transfer for the object similar to [James et al. 2006], but use a different equivalent source representation. We first compute a manifold and closed offset surface around the object. This defines a clear inside in which to place the multipole source, and it also serves as the boundary ∂Ω on which BEM solves the Helmholtz equation to obtain the pressure p̃ at the N vertices on the surface.

We then use a single-point multipole expansion to match the pressure values p̃ on ∂Ω. This is performed by fixing the position of the multipole and iteratively increasing the order of the multipole until the error is below a certain threshold ε. This step has to be repeated for each modal frequency, with the order generally increasing with the modal frequency.

Since we are using a geometric sound propagator in the runtime stage of our pipeline, using a single-point multipole (per mode) makes it possible to use just one geometric propagation source for all modes. Theoretically, each multipole should be represented as a different geometric propagation source, but since all the multipoles were kept at the same position during the BEM pressure evaluation on the offset surface, we can use just one geometric propagation source and use the modal frequency (ω_i) as the filter to scale the pressure at a point. This makes interactive runtime performance possible.

For a single-point multipole, Eq. (9) simplifies to:

p(y) = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_ilm ϕ_ilm(y).    (12)

Since no optimal strategies exist for the placement of the multipole source [Ochmann 1995], we chose the center of our modal object as the source location. This is in stark contrast to [James et al. 2006; Mehra et al. 2013], who used a hierarchical source placement algorithm to minimize the residual error. We maintain the same error thresholds as them, but simplify the problem by iteratively increasing the order L and checking the pressure residual ||r||₂ < ε, where r = p̃ − Ac, A is an N-by-L² multipole basis matrix, and c ∈ ℂ^{L²} is the complex coefficient vector.

Once we match the BEM pressure on the offset surface for each mode, we place one spherical sound source for the geometric propagation, for all the modes, at the same position as our multipoles.
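The fitting loop described above can be sketched as follows; the basis-matrix assembly and the least-squares solve are left as hypothetical helpers (e.g. a QR-based solve from any dense linear algebra library), since the sketch is only meant to show the order-increase strategy.

#include <complex>
#include <vector>

using cvec = std::vector<std::complex<double>>;
using cmat = std::vector<cvec>;               // N x L^2 multipole basis matrix A

// Hypothetical helpers, not part of our system:
cmat   buildBasisMatrix(int order, double k);                       // evaluates Eq. (8) at the offset-surface vertices
cvec   solveLeastSquares(const cmat& A, const cvec& pBEM);          // c = argmin ||p̃ - A c||_2
double residualNorm(const cmat& A, const cvec& c, const cvec& pBEM);

// Increase the multipole order L until the pressure residual drops below ε.
cvec fitSinglePointMultipole(const cvec& pBEM, double k, double eps,
                             int maxOrder, int& chosenOrder) {
    for (int L = 1; L <= maxOrder; ++L) {
        cmat A = buildBasisMatrix(L, k);       // N-by-L^2, one column per (l, m)
        cvec c = solveLeastSquares(A, pBEM);
        if (residualNorm(A, c, pBEM) < eps) { chosenOrder = L; return c; }
    }
    chosenOrder = maxOrder;
    return solveLeastSquares(buildBasisMatrix(maxOrder, k), pBEM);
}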
3.2.3 Sound propagation

Given a single-point multipole source, we can use either a wave-based or a geometric sound propagation scheme to propagate the source's radiation pattern into the environment. We describe existing techniques that can be used with our system.

Wave-based Propagation: Frequency-domain numerical methods like [Mehra et al. 2013] use the equivalent source method to compute the pressure field in a domain. They decompose the scene into well-separated objects and compute the per-object and the inter-object transfer functions. The per-object transfer function maps the incoming sound field incident on an object A to the outgoing field and is defined as:

f(Φ^in_A) = T_A Φ^in_A = Φ^out_A,    (13)
where Φ^in_A is the vector of multipoles representing the incident field on an object, Φ^out_A is the vector of outgoing multipoles representing the scattered field, and T_A is the scattering matrix containing the (complex) coefficients of the outgoing multipole sources. Similarly, the inter-object transfer function for a pair of objects A and B is defined as:

g^B_A(Φ^out_A) = G^B_A Φ^out_A = Φ^in_B,    (14)

where G^B_A is the interaction matrix and contains the (complex) coefficients for mapping an outgoing field from object A to object B. In general, G^B_A ≠ G^A_B. For more details on this technique, refer to [Mehra et al. 2013].

The single-point multipole source can be used to represent the incident field on an object for each modal frequency, which can then be approximated using the incoming multipoles Φ^in_A and used in Eq. 7 to get the per-object and the inter-object transfer functions.
Geometric Propagation: These methods make the assumption that the wavelength of sound is much smaller than the size of features in the scene and then treat sound as rays, frustums, or beams. Wave effects like diffraction are modeled separately using geometric approximations. We make use of the ray-based sound propagation system of [Schissler et al. 2014] to compute the paths that sound can travel through the scene. This system combines path tracing with a cache of diffuse sound paths to reduce the number of rays required for an interactive simulation. The approach begins by tracing a small number (e.g. 500) of rays uniformly in all directions from each sound source. These rays strike surfaces and are reflected recursively up to a specified maximum reflection depth (e.g. 50). The reflected rays are computed using vector-based scattering [Christensen and Koutsouris 2013], where the resulting rays are a linear combination of the specularly reflected rays and random Lambertian-distributed rays. The listener is modeled as a sphere the same size as a human head. At each ray-triangle intersection, the visibility of the listener sphere is sampled by tracing a few additional rays towards the listener. If some fraction of the rays are not occluded, a path to the listener is produced. A path contains the following output data: the total distance the ray traveled, r, along with the attenuation factor α due to reflection and diffraction interactions. Diffracted sound is computed separately using the UTD diffraction model [Tsingos et al. 2001]. Given the output of the geometric propagation system, we can evaluate the sound pressure as:

p(x) = Σ_{r∈R} p_r(x),    (15)

where p_r is the contribution from a ray r in a set of rays R. We model a multipole Ψ_i using the rays R as:

Σ_{i=1}^{M} Ψ_i(x) = p(x) = Σ_{r∈R} p_r(x),  x ∈ Ω,    (16)

where Ψ_i = Σ_{l=0}^{L−1} Σ_{m=−l}^{l} c_ilm ϕ_ilm(y) for i = 1, ..., M.

This coupling lets us calculate the pressure for a set of ray directions sampling a sphere uniformly: for a ray direction (θ, φ) traveling a distance r, its pressure is scaled by ψ(θ, φ), h_l⁽²⁾(kr), and α(ω_i), where α(ω_i) is the energy of a ray for a modal frequency ω_i. We use a geometric ray-tracing based system to get the paths and their respective energies.
Path Clustering: Although using a single geometric source reduces the number of rays considerably, in order to get the acoustic phenomena right we still need a considerable number of rays (≥ 15000), which makes the evaluation too slow for a modal sound source even with only a few sounding modes (M ≥ 20). We solve this problem by clustering the rays based on the angle between the rays and their respective time delays. We bin the IR (impulse response) according to a user-specified bin size t in seconds (Figure 3). Then, for each bin, we cluster the rays based on the binning angle ϑ. The binning algorithm is shown in Algorithm 1.

Figure 3: Path Clustering. The impulse response is binned by delay time (bin width δt) and the rays within each bin are clustered by direction.
Algorithm 1 PathBinning(t, ϑ)
1: maxNumberOfBins ← ceil(IR.length() / t)
2: bins.setSize(maxNumberOfBins)
3: for each ray r ∈ R do
4:     S_r ← r.direction()                 ▷ ray directions are normalized
5:     binIndex ← floor(r.delay() / t)
6:     bin ← bins[binIndex]
7:     matched ← false
8:     for each cluster ∈ bin do
9:         S_c ← cluster.direction()       ▷ cluster directions are normalized
10:        if S_c · S_r > cos(ϑ) then      ▷ angle between the two vectors is less than ϑ
11:            cluster.add(r); matched ← true; break
12:        end if
13:    end for
14:    if not matched then                 ▷ the ray did not fit any existing cluster
15:        newCluster ← Cluster(S_r); bin.add(newCluster); newCluster.add(r)
16:    end if
17: end for
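For concreteness, a minimal C++ transcription of Algorithm 1 is given below; Ray and Cluster are simplified stand-ins for the ray tracer's own types.

#include <array>
#include <cmath>
#include <vector>

struct Ray     { std::array<double, 3> dir; double delay; };      // dir is normalized
struct Cluster { std::array<double, 3> dir; std::vector<const Ray*> rays; };

static double dot(const std::array<double, 3>& a, const std::array<double, 3>& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Bin rays by delay (bin width t seconds) and cluster them by direction (angle ϑ).
std::vector<std::vector<Cluster>> pathBinning(const std::vector<Ray>& rays,
                                              double irLength, double t, double vartheta) {
    const double cosTheta = std::cos(vartheta);
    std::vector<std::vector<Cluster>> bins(static_cast<size_t>(std::ceil(irLength / t)));
    for (const Ray& r : rays) {
        auto& bin = bins[static_cast<size_t>(std::floor(r.delay / t))];
        bool matched = false;
        for (Cluster& c : bin)
            if (dot(c.dir, r.dir) > cosTheta) { c.rays.push_back(&r); matched = true; break; }
        if (!matched) bin.push_back(Cluster{r.dir, {&r}});
    }
    return bins;
}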
Auralization: The last stage of the pipeline is computing the listener response for all the modes. We compute this response by convolving the time-domain impulse response of each mode with that mode's amplitude. The final signal O(t) is:

O(t) = Σ_{i=1}^{M} q_i(t) ∗ IR_i(t),    (17)

where IR_i is the impulse response of the i-th mode, q_i(t) is the amplitude of the i-th mode, and ∗ is the convolution operator.
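A direct time-domain rendering of Eq. (17) is sketched below; our system uses streaming partitioned FFT convolution instead (Section 4), but the per-mode convolve-and-sum structure is the same.

#include <algorithm>
#include <vector>

// O(t) = Σ_i q_i(t) * IR_i(t)   (Eq. 17), with * denoting convolution.
std::vector<double> auralize(const std::vector<std::vector<double>>& q,     // q_i(t) per mode
                             const std::vector<std::vector<double>>& ir) {  // IR_i(t) per mode
    size_t outLen = 0;
    for (size_t i = 0; i < q.size(); ++i)
        outLen = std::max(outLen, q[i].size() + ir[i].size() - 1);
    std::vector<double> out(outLen, 0.0);
    for (size_t i = 0; i < q.size(); ++i)            // sum over modes
        for (size_t n = 0; n < q[i].size(); ++n)     // direct convolution of q_i with IR_i
            for (size_t k = 0; k < ir[i].size(); ++k)
                out[n + k] += q[i][n] * ir[i][k];
    return out;
}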
4 Implementation
In this section, we describe the implementation details of our system. All the runtime code was written in C++ and timed on a 16-core Intel Xeon E5-2687W @ 3.1 GHz desktop with 64 GB of RAM running Windows 7 64-bit. In the preprocessing stage, the offset surface generation and eigendecomposition code was written in C++, while the single-point multipole expansion was written in MATLAB.
Preprocessing: We use the finite element technique to compute the stiffness matrix K: it takes the tetrahedralized model, the Young's modulus, and the Poisson's ratio of the sounding object and computes the stiffness matrix for the object. We then perform the eigenvalue decomposition of the system using Intel's MKL library (DSYEV) and calculate the modal displacements, frequencies, and amplitudes in C++. The code to find the multipole strengths was written in MATLAB; the pressure on the offset surface was calculated with a fast BEM solver (FastBEM) based on the fast multipole method (FMM-BEM).
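For illustration, the step from the eigenvalues of Eq. (4) to the per-mode frequency and damping used in Eq. (11) can be written as follows; the sketch assumes the symmetric eigensolve has already been performed (e.g. with MKL's DSYEV as mentioned above) and that the Rayleigh damping coefficients α and β are given.

#include <cmath>
#include <vector>

struct ModeParams { double f; double d; };   // frequency in Hz, damping coefficient

// With Rayleigh damping C = αM + βK, each decoupled oscillator in Eq. (5) reads
//   q̈_i + (α + β λ_i) q̇_i + λ_i q_i = (Uᵀf)_i,
// so d_i = (α + β λ_i) / 2 and the damped frequency is sqrt(λ_i − d_i²) / (2π).
std::vector<ModeParams> modesFromEigenvalues(const std::vector<double>& lambda,
                                             double alpha, double beta) {
    const double twoPi = 6.283185307179586;
    std::vector<ModeParams> modes;
    for (double l : lambda) {
        const double d  = 0.5 * (alpha + beta * l);
        const double w2 = l - d * d;              // squared damped angular frequency
        if (w2 <= 0.0) continue;                  // overdamped: no audible mode
        modes.push_back({std::sqrt(w2) / twoPi, d});
    }
    return modes;
}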
Sound Propagation: We use a fast, state-of-the-art geometric ray tracer [Schissler et al. 2014] to get the paths for our pressure computation. This technique is capable of handling very high orders of diffuse and specular reflections (e.g. 10 orders of specular reflections and 50 orders of diffuse reflections) while maintaining interactive performance. As mentioned in the previous section, we cluster the rays in order to reduce the number of rays in the scene; but even with that, the pressure computation for each ray (i.e., the spherical harmonics and the Hankel functions) has to be heavily optimized to meet the interactive performance requirements.
Spherical Harmonic computation: The number of spherical harmonics computed per ray varies as O(L²), making naive evaluation too slow for an interactive runtime. We used a modified version of an available fast spherical harmonic code [Sloan 2013] to compute the pressure contribution of each ray. The available code computes only the real spherical harmonics, making extensive use of SSE (Streaming SIMD Extensions). We find the complex spherical harmonics from the real ones following a simple observation:

Y_l^m = (1/√2) (Re(Y_l^m) + ι Re(Y_l^{−m})),  m > 0,    (18)

Y_l^m = (1/√2) (Re(Y_l^m) − ι Re(Y_l^{−m})) (−1)^m,  m < 0.    (19)

Using this optimized code gives us a 2-3 orders of magnitude speed-up compared to existing spherical harmonic implementations, e.g., BOOST.
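A sketch of the recombination in Eqs. (18)-(19) is shown below; realSH is a hypothetical stand-in for the fast real spherical harmonic evaluation of [Sloan 2013].

#include <complex>

// Hypothetical: real spherical harmonic Re(Y_l^m) for a given direction.
double realSH(int l, int m, double theta, double phi);

// Complex Y_l^m recombined from the real basis (Eqs. 18-19); m = 0 is already real.
std::complex<double> complexSH(int l, int m, double theta, double phi) {
    const double invSqrt2 = 0.7071067811865476;
    if (m == 0) return std::complex<double>(realSH(l, 0, theta, phi), 0.0);
    if (m > 0)
        return invSqrt2 * std::complex<double>(realSH(l,  m, theta, phi),
                                               realSH(l, -m, theta, phi));
    const double sign = (m % 2 == 0) ? 1.0 : -1.0;   // (-1)^m
    return sign * invSqrt2 * std::complex<double>(realSH(l,  m, theta, phi),
                                                  -realSH(l, -m, theta, phi));
}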
Distance Clustering: Even after the significant speedup achieved in calculating the spherical harmonics, Hankel functions need to be computed for each ray, and their cost varies linearly with the order of the multipole. We solve this problem by clustering the paths, similar to what we did in the previous section, based on the distance traveled by them in the environment. Given a user-defined bin size δ_d and the length of the IR t in seconds, we cluster ray distances into N_Hankel = t/δ_d bins, requiring us to make an order of magnitude fewer computations. The Hankel functions are evaluated using BOOST.
Parallel computation of Mode pressure: Since each mode is independent of the others, the pressure computation for each one of them can be done in parallel. The lower modes generally require less time to evaluate than the higher ones, so we use a simple, scene-dependent load-balancing scheme to divide the work equally amongst all 16 cores. We used OpenMP for the parallelization on a multi-core system.
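A minimal sketch of this per-mode parallel loop is given below; computeModeIR is a hypothetical placeholder for the per-mode pressure/IR computation, and the dynamic schedule stands in for our scene-dependent load balancing.

#include <vector>

std::vector<float> computeModeIR(int mode);   // hypothetical per-mode pressure/IR evaluation

std::vector<std::vector<float>> computeAllModeIRs(int numModes) {
    std::vector<std::vector<float>> irs(numModes);
    // Higher modes need higher multipole orders and cost more, so let OpenMP hand out
    // modes one at a time instead of splitting the range statically.
    #pragma omp parallel for schedule(dynamic, 1)
    for (int i = 0; i < numModes; ++i)
        irs[i] = computeModeIR(i);
    return irs;
}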
Real-Time Auralization: The final audio for the simulations is rendered using a streaming partitioned convolution technique. All audio rendering is performed at a sampling rate of 48 kHz. We first construct an impulse response (IR) for each mode using the computed pressure for the paths returned by the propagation system, which incorporate the effects of the single-point multipole expansion. The IR is initialized to zero, and the pressure for each path is added to the IR at the sample index corresponding to the delay for that path. Once constructed, the IRs for all modes are passed to the convolution system for auralization, where they are converted to the frequency domain. During audio rendering, the time-domain input audio for each mode is converted to the frequency domain and then multiplied with the corresponding IR partition coefficients. The inverse FFT of the resulting sound is computed and accumulated using overlap-add in a circular output buffer. The audio device reads from the circular buffer at the current position and plays back the rendered sound.
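A sketch of the IR-construction step described above is given below: each path's pressure is accumulated at the sample index given by its delay (only the real part is kept here, purely for illustration).

#include <complex>
#include <vector>

struct PathContribution {
    double delay;                    // arrival time in seconds
    std::complex<double> pressure;   // pressure contribution of this path for one mode
};

// Build the time-domain IR of one mode at the given sampling rate (48 kHz in our system).
std::vector<float> buildModeIR(const std::vector<PathContribution>& paths,
                               double sampleRate, double irLengthSeconds) {
    std::vector<float> ir(static_cast<size_t>(irLengthSeconds * sampleRate), 0.0f);
    for (const PathContribution& p : paths) {
        const size_t idx = static_cast<size_t>(p.delay * sampleRate);
        if (idx < ir.size())
            ir[idx] += static_cast<float>(p.pressure.real());   // real part, for illustration
    }
    return ir;
}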
6 Conclusion
We present the first coupled sound synthesis-propagation algorithm that can generate realistic sound effects for computer games and virtual reality. We describe an approach that integrates prior methods for modal sound synthesis, sound radiation, and sound propagation. The radiating sound fields are represented in a compact basis using a single-point multipole expansion. We perform sound propagation from this source basis using fast ray tracing to compute the impulse responses and convolve them with the modes to generate the final sound at the listener. The resulting system has been integrated into the Unity game engine, and we highlight its performance on many indoor and outdoor scenes. Overall, this is the first system that successfully combines these methods and can handle a high degree of dynamism in terms of source radiation and propagation in complex scenes.
Our approach has some limitations. It is limited to rigid objects and modal sounds. Moreover, the time complexity tends to increase with the mode frequency. Our single-point multipole expansion approach can result in very high multipole orders. The geometric sound propagation algorithm may not be able to compute low-frequency effects (e.g. diffraction) accurately in all environments. Moreover, the wave-based sound propagation algorithm involves a high precomputation overhead and is limited to static scenes. Currently, we do not perform any sort of mode compression, resulting in a lot of closely spaced modes being generated. We could use compression algorithms [Raghuvanshi and Lin 2006b; Langlois et al. 2014] as a means to reduce the number of modes and thus reduce the overhead of the pressure computation. Our preprocessing stage takes a long time, where most of the time is spent doing the eigendecomposition of the stiffness matrix.
There are many avenues for future work. In addition to overcoming these limitations, we would like to apply our approach to more complex indoor and outdoor environments and to generate other sound effects for complex objects in large environments (e.g. a bell ringing over a
large, outdoor valley). We would like to explore approximate solutions to accelerate the free-space acoustic transfer computation. It would also be useful to include directional sources and to accelerate the computations using iterative algorithms such as Arnoldi iteration [ARPACK].
References
Allen, J. B., and Berkley, D. A. 1979. Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America 65, 4 (April), 943-950.

Antani, L., Chandak, A., Taylor, M., and Manocha, D. 2012. Efficient finite-edge diffraction using conservative from-region visibility. Applied Acoustics 73, 218-233.

Antani, L., Chandak, A., Savioja, L., and Manocha, D. 2012. Interactive sound propagation using compact acoustic transfer operators. ACM Trans. Graph. 31, 1 (Feb.), 7:1-7:12.

ARPACK. http://www.caam.rice.edu/software/ARPACK/.

Borish, J. 1984. Extension to the image model to arbitrary polyhedra. The Journal of the Acoustical Society of America 75, 6 (June), 1827-1836.

Chadwick, J. N., An, S. S., and James, D. L. 2009. Harmonic shells: a practical nonlinear sound model for near-rigid thin shells. In ACM SIGGRAPH Asia 2009 papers, ACM, New York, NY, USA, SIGGRAPH Asia '09, 119:1-119:10.

Chaigne, A., and Doutaut, V. 1997. Numerical simulations of xylophones. I. Time domain modeling of the vibrating bars. J. Acoust. Soc. Am. 101, 1, 539-557.

Chandak, A., Lauterbach, C., Taylor, M., Ren, Z., and Manocha, D. 2008. AD-Frustum: Adaptive frustum tracing for interactive sound propagation. IEEE Trans. Visualization and Computer Graphics 14, 6, 1707-1722.

Cheng, A., and Cheng, D. 2005. Heritage and early history of the boundary element method. Engineering Analysis with Boundary Elements 29, 3 (Mar.), 268-302.

Christensen, C., and Koutsouris, G. 2013. Odeon manual, chapter 6.

Ciskowski, R. D., and Brebbia, C. A. 1991. Boundary element methods in acoustics. Computational Mechanics Publications, Southampton; Boston.

Fairweather, G. 2003. The method of fundamental solutions for scattering and radiation problems. Engineering Analysis with Boundary Elements 27, 7 (July), 759-769.

FastBEM. Making efficient high-fidelity acoustic modeling a reality! http://www.fastbem.com/fastbemacoustics.html.

Florens, J. L., and Cadoz, C. 1991. The physical model: modeling and simulating the instrumental universe. In Representations of Musical Signals, G. D. Poli, A. Piccialli, and C. Roads, Eds. MIT Press, Cambridge, MA, USA, 227-268.

Franzoni, L. P., Bliss, D. B., and Rouse, J. W. 2001. An acoustic boundary element method based on energy and intensity variables for prediction of high-frequency broadband sound fields. The Journal of the Acoustical Society of America 110, 3071.

Funkhouser, T., Carlbom, I., Elko, G., Pingali, G., Sondhi, M., and West, J. 1998. A beam tracing approach to acoustic modeling for interactive virtual environments. In Proc. of ACM SIGGRAPH, 21-32.

James, D. L., Barbič, J., and Pai, D. K. 2006. Precomputed acoustic transfer: output-sensitive, accurate sound generation for geometrically complex vibration sources. In ACM SIGGRAPH 2006 Papers, ACM, New York, NY, USA, SIGGRAPH '06, 987-995.
Kouyoumjian, R. G., and Pathak, P. H. 1974. A uniform geometrical theory of diffraction for an edge in a perfectly conducting surface. Proceedings of the IEEE 62, 11, 1448-1461.

Krokstad, A., Strom, S., and Sorsdal, S. 1968. Calculating the acoustical room response by the use of a ray tracing technique. Journal of Sound and Vibration 8, 1 (July), 118-125.

Kropp, W., and Svensson, P. U. 1995. Application of the time domain formulation of the method of equivalent sources to radiation and scattering problems. Acta Acustica united with Acustica 81, 6, 528-543.

Langlois, T. R., An, S. S., Jin, K. K., and James, D. L. 2014. Eigenmode compression for modal sound models. ACM Transactions on Graphics (TOG) 33, 4, 40.
Lentz, T., Schröder, D., Vorländer, M., and Assenmacher, I. 2007. Virtual reality system with integrated sound field simulation and reproduction. EURASIP Journal on Advances in Signal Processing 2007 (January), 187-187.

Liu, Q. H. 1997. The PSTD algorithm: A time-domain method combining the pseudospectral technique and perfectly matched layers. The Journal of the Acoustical Society of America 101, 5, 3182.

Mehra, R., Raghuvanshi, N., Antani, L., Chandak, A., Curtis, S., and Manocha, D. 2013. Wave-based sound propagation in large open scenes using an equivalent source formulation. ACM Trans. Graph. (Apr.).

Moss, W., Yeh, H., Hong, J.-M., Lin, M. C., and Manocha, D. 2010. Sounding liquids: Automatic sound synthesis from fluid simulation. ACM Trans. Graph. 29, 3, 1-13.

O'Brien, J. F., Cook, P. R., and Essl, G. 2001. Synthesizing sounds from physically based motion. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 529-536.

O'Brien, J. F., Shen, C., and Gatchalian, C. M. 2002. Synthesizing sounds from rigid-body simulations. In The ACM SIGGRAPH 2002 Symposium on Computer Animation, ACM Press, 175-181.

Ochmann, M. 1995. The source simulation technique for acoustic radiation problems. Acustica 81, 512-527.

Ochmann, M. 1999. The full-field equations for acoustic radiation and scattering. The Journal of the Acoustical Society of America 105, 5, 2574-2584.

Pavic, G. 2006. A technique for the computation of sound radiation by vibrating bodies using multipole substitute sources. Acta Acustica united with Acustica 92, 112-126(15).

Pierce, A. D., et al. 1981. Acoustics: an introduction to its physical principles and applications. McGraw-Hill, New York.

Raghuvanshi, N., and Lin, M. C. 2006. Interactive sound synthesis for large scale environments. In SI3D '06: Proceedings of the 2006 symposium on Interactive 3D graphics and games, ACM Press, New York, NY, USA, 101-108.

Raghuvanshi, N., and Lin, M. C. 2006. Interactive sound synthesis for large scale environments. In Proceedings of the 2006 symposium on Interactive 3D graphics and games, ACM, 101-108.

Raghuvanshi, N., Narain, R., and Lin, M. C. 2009. Efficient and accurate sound propagation using adaptive rectangular decomposition. Visualization and Computer Graphics, IEEE Transactions on 15, 5, 789-801.

Raghuvanshi, N., Snyder, J., Mehra, R., Lin, M. C., and Govindaraju, N. K. 2010. Precomputed wave simulation for real-time sound propagation of dynamic sources in complex scenes. SIGGRAPH 2010 29, 3 (July).

Ren, Z., Mehra, R., Coposky, J., and Lin, M. C. 2012. Tabletop ensemble: touch-enabled virtual percussion instruments. In Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, ACM, 7-14.

Sakamoto, S., Ushiyama, A., and Nagatomo, H. 2006. Numerical analysis of sound propagation in rooms using the finite difference time domain method. The Journal of the Acoustical Society of America 120, 5, 3008-3008.

Schissler, C., Mehra, R., and Manocha, D. 2014. High-order diffraction and diffuse reflections for interactive sound propagation in large environments. In Proc. of ACM SIGGRAPH.

Siltanen, S., Lokki, T., Kiminki, S., and Savioja, L. 2007. The room acoustic rendering equation. The Journal of the Acoustical Society of America 122, 3 (September), 1624-1635.

Sloan, P.-P. 2013. Efficient spherical harmonic evaluation. Journal of Computer Graphics Techniques (JCGT) 2, 2 (September), 84-83.

Svensson, U. P., Fred, R. I., and Vanderkooy, J. 1999. An analytic secondary source model of edge diffraction impulse responses. Acoustical Society of America Journal 106 (Nov.), 2331-2344.

Taflove, A., and Hagness, S. C. 2005. Computational Electrodynamics: The Finite-Difference Time-Domain Method, Third Edition, 3 ed. Artech House Publishers, June.

Taylor, M., Chandak, A., Mo, Q., Lauterbach, C., Schissler, C., and Manocha, D. 2012. Guided multiview ray tracing for fast auralization. IEEE Transactions on Visualization and Computer Graphics 18, 1797-1810.

Thompson, L. L. 2006. A review of finite-element methods for time-harmonic acoustics. The Journal of the Acoustical Society of America 119, 3, 1315-1330.
Tsingos, N., Funkhouser, T., Ngan, A., and Carlbom, I. 2001. Modeling acoustics in virtual environments using the uniform theory of diffraction. In Proc. of ACM SIGGRAPH, 545-552.

van den Doel, K., and Pai, D. K. 1996. Synthesis of shape dependent sounds with physical modeling. In Proceedings of the International Conference on Auditory Displays.

van den Doel, K., and Pai, D. K. 1998. The sounds of physical shapes. Presence 7, 4, 382-395.

van den Doel, K., Kry, P. G., and Pai, D. K. 2001. FoleyAutomatic: physically-based sound effects for interactive simulation and animation. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM Press, New York, NY, USA, 537-544.

von Estorff, O. 2000. Boundary elements in acoustics: advances and applications, vol. 9. WIT Press/Computational Mechanics.

Vorländer, M. 1989. Simulation of the transient and steady-state sound propagation in rooms using a new combined ray-tracing/image-source algorithm. The Journal of the Acoustical Society of America 86, 1, 172-178.

Yee, K. 1966. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media. IEEE Transactions on Antennas and Propagation 14, 3 (May), 302-307.

Yeh, H., Mehra, R., Ren, Z., Antani, L., Manocha, D., and Lin, M. 2013. Wave-ray coupling for interactive sound propagation in large complex scenes. ACM Trans. Graph. 32, 6, 165:1-165:11.

Zheng, C., and James, D. L. 2009. Harmonic fluids. ACM Trans. Graph. 28, 3, 1-12.

Zheng, C., and James, D. L. 2010. Rigid-body fracture sound with precomputed soundbanks. In SIGGRAPH '10: ACM SIGGRAPH 2010 papers, ACM, New York, NY, USA, 1-13.