A Mathematical Framework for Efficient Closed-Form Single Scattering
Vincent Pegoraro
Mathias Schott
Philipp Slusallek
Saarland University & M2CI, Germany
University of Utah & SCI Institute, U.S.A.
Saarland University & DFKI, Germany
ABSTRACT
Analytic approaches to efficiently simulate accurate light transport
in homogeneous participating media have recently received attention in the graphics community, and although a closed-form solution to the single-scattering air-light integral has been derived for
a generic representation of 1-D angular distributions, its high order
of computational complexity alas limits its practical applicability.
In this paper, we introduce alternative algebraic formulations of the
solution that entirely preserve its closed-form nature while effectively reducing its order of complexity. The analytic derivations
yield a significant decrease in the computational cost of the evaluation scheme, and the substantial gains in performance achieved by
the method make high-quality light transport simulation considerably more applicable to both real-time and off-line rendering.
Index Terms: I.3.7 [Computer Graphics]: Three-Dimensional
Graphics and Realism—Color, shading, shadowing, and texture
1 INTRODUCTION
Due to the wide variety of elements that they model, the ubiquity
of participating media spans a broad range of application areas, not
only in entertainment such as the motion-picture and video-game
industry, but also in automotive and architectural design as well
as in safety-oriented research. There consequently exist considerable needs for methods efficiently simulating accurate light transport within those environments. Regarding single scattering in homogeneous media, a closed-form solution to the air-light integral
has recently been derived for a generic polynomial representation
of azimuthally-symmetric anisotropic phase functions and angular
distributions of punctual light sources. In spite of its mathematical
exactitude, the theoretical result is however restrained in practice by
its high order of complexity with respect to the degree of the input
function, which entails consequential computational costs.
In this paper, we introduce novel alternative formulations of the
closed-form solution to the air-light integral which provide a framework for effectively reducing the computational complexity of the
evaluation scheme from O(N^6) to as low as O(N^4) or O(N^3), depending on the configuration, given an angular distribution of degree N − 1, inducing increasingly important speed-ups as the size of
the problem grows, and in the range of 2× to 15× in our practical
experiments. The approach solely builds on identifying mathematical patterns and exploiting algebraic identities rather than relying
on ad-hoc approximations to simplify the expressions. The analytic
derivations therefore entirely preserve both the generality and the
accuracy of the closed-form solution and allow reference images to
be generated with significant gains in efficiency. The substantial decrease in computational cost achieved by the method consequently
not only facilitates the evaluation of the quality of the many approximations to the single-scattering model that have been previously
proposed, but also greatly broadens the applicability of high-fidelity
light transport simulation to real-time as well as off-line rendering.
This document first provides an overview of the related work
in Section 2 as well as some theoretical background on the air-light
integral in Section 3. Our computational-complexity reduction
framework is then introduced in Section 4 followed by a mathematical analysis of the resulting orders of complexity in Section 5 and
implementation details in Section 6. Finally, results evaluating the
performance characteristics of the method are presented in Section
7 before discussing future work orientations in Section 8.
2 RELATED WORK
As surveyed by Cerezo et al. [5], rendering participating media
has been the focus of a broad range of approaches in the computer graphics community. Regarding the single-scattering model
of light transport, Max [15] first proposed to evaluate the air-light
integral using adaptive quadrature. The technique was subsequently
followed by numerous ray-marching methods including the work
of Nishita et al. [17], Makarov [14] and Engelhardt and Dachsbacher [8] as well as various volume-slicing approaches introduced
by Dobashi et al. [6, 7] and Imagire et al. [11]. Although simple
and practical, such numerical methods however rely on Riemann
summation inherently prone to under-sampling artifacts.
Aside from the image-based post-processing technique of
Mitchell [16], many analytic approaches, pioneered by the work
of Blinn [4], have alternatively been proposed. The latter was the
precursor to various models for directional lights as presented by
Willis [24], Hoffman and Preetham [10] and Riley et al. [22]. Several approximate methods have also been introduced for punctual
light sources including the series expansion of Lecocq et al. [13]
and its extension to handle volumetric shadows by Biri et al. [3],
the numerical tabulation approach of Sun et al. [23] and its coupling
with ray-marching to determine visibility by Wyman and Ramsey
[25], as well as the RBF-based method of Zhou et al. [26].
Recently, Pegoraro and Parker [18] presented a closed-form solution to the air-light integral in homogeneous media for isotropic
phase functions and punctual light sources, while the extension of
the approach to anisotropic functions relied on a semi-analytic dual formulation [19, 20]. To address the limitations of the latter approximation, Pegoraro et al. [21] then derived a closed-form solution for
general 1-D angular distributions represented as a polynomial of the
cosine angle, which is the native definition of most phase functions
and many light sources. While being mathematically exact, the high
order of complexity of the theoretical result entails consequential
computational costs, alas limiting its practical applicability.
Building on the concepts introduced by Pegoraro et al., this paper
extends their work and proposes alternative algebraic formulations
of the closed-form solution. The analytic derivations entirely preserve the exact nature of the evaluation scheme while effectively reducing its order of complexity, therefore yielding significant gains
in performance. As shadow-volumes-based methods [12, 3, 2] focusing on determining the interval boundaries of volumetric shadows (so as to render light shafts) are orthogonal to our approach
which focuses on evaluating the integral on a given interval, both
techniques may be transparently integrated together in a rendering
system and are consequently not discussed any further herein.
3 AIR-LIGHT INTEGRAL
The radiance $L$ at a given position $x_a$ along a ray of direction $\vec{\omega}$
through a participating medium is defined as the sum of the medium
radiance $L_m$ which accounts for the contribution of the medium itself, and the reduced radiance $L_r$ accounting for the contribution of
objects in the background at position $x_b$
\[ L(x_a, \vec{\omega}) = L_m(x_a, x_b, \vec{\omega}) + L_r(x_a, x_b, \vec{\omega}). \tag{1} \]
Figure 1: Diagram illustrating the terms involved in the computation
of the air-light integral.
Considering a homogeneous absorption coefficient $\kappa_a$ and scattering coefficient $\kappa_s$ such that the extinction coefficient $\kappa_t = \kappa_a + \kappa_s$
is constant, the reduced radiance is then defined as
\[ L_r(x_a, x_b, \vec{\omega}) = e^{-\kappa_t (x_b - x_a)} \, L(x_b, \vec{\omega}). \tag{2} \]
Given a point light source of intensity $I$ located at position $\vec{p}_l$,
parameterized by the angle $\vartheta$ with the unit direction $\vec{v}_l$, and illuminating a non-emissive medium of phase function $\Phi$, the medium
radiance for a view ray of origin $\vec{p}_e$ and direction $\vec{v}_e$ reads [17]
\[ L_m(x_a, x_b, \vec{\omega}) = \int_{x_a}^{x_b} e^{-\kappa_t (x - x_a)} \, \kappa_s \, \Phi\big(\theta(x, \vec{\omega})\big) \, \frac{e^{-\kappa_t d(x, \vec{\omega})} \, I\big(\vartheta(x, \vec{\omega})\big)}{d^2(x, \vec{\omega})} \, dx \tag{3} \]
where $d(x, \vec{\omega}) = \sqrt{h^2 + (x - x_h)^2}$ is the distance from the light to a
point $x$ along the ray, with $h$ being the distance from the light to the
ray and $x_h$ its projection coordinate onto it as illustrated in Figure
1. Expanding the terms then readily gives [19]
\[ L_m(x_a, x_b, \vec{\omega}) = \kappa_s e^{\kappa_t x_a} \int_{x_a}^{x_b} \frac{e^{-\kappa_t \left( x + \sqrt{h^2 + (x - x_h)^2} \right)}}{h^2 + (x - x_h)^2} \cdot \Phi\!\left( \frac{\pi}{2} + \arctan\!\left( \frac{x - x_h}{h} \right) \right) I\!\left( \arccos\!\left( \frac{d_{el}\, x + d_{lel}}{\sqrt{h^2 + (x - x_h)^2}} \right) \right) dx \tag{4} \]
where the light parameters read $d_{el} = \vec{v}_e \cdot \vec{v}_l$ and $d_{lel} = (\vec{p}_e - \vec{p}_l) \cdot \vec{v}_l$.
The substitutions $\Phi(\theta) = \Phi_c(\cos(\theta))$ and $I(\vartheta) = I_c(\cos(\vartheta))$
along with the change of variable
\[ v(x) = \frac{x - x_h}{h} + \sqrt{1 + \left( \frac{x - x_h}{h} \right)^2} \tag{5} \]
then allow the integrand to be written as a function of only 4 parameters [19] and the medium radiance becomes
\[ L_m(x_a, x_b, \vec{\omega}) = \frac{\kappa_s}{h} e^{\kappa_t (x_a - x_h)} \cdot 2 \int_{v_a}^{v_b} \frac{e^{-Hv}}{v^2 + 1} \, \Phi_c\!\left( -\frac{v^2 - 1}{v^2 + 1} \right) I_c\!\left( \frac{d_{el}(v^2 - 1) + 2 d_c v}{v^2 + 1} \right) dv \tag{6} \]
where $d_c = \frac{d_{el}\, x_h + d_{lel}}{h} = \vec{v}_h \cdot \vec{v}_l$ with $\vec{v}_h = \frac{(\vec{p}_e - \vec{p}_l) + x_h \vec{v}_e}{h}$ being the unit-length projection vector of the light source onto the ray given $x_a$ at
the origin, and $H = \kappa_t h$ the optical distance between them, while
the bounds of integration read $v_a = v(x_a)$ and $v_b = v(x_b)$.
Assuming an arbitrary phase function defined by an $N_\Phi$-term
polynomial of the cosine angle $\mu = \cos(\theta)$ such that
\[ \Phi_c(\mu) = \sum_{n=0}^{N_\Phi - 1} c_\Phi(n) \, \mu^n \tag{7} \]
and similarly formulating the angular distribution of light sources
as a polynomial of degree $N_I - 1$
\[ I_c(\varsigma) = \sum_{n=0}^{N_I - 1} c_I(n) \, \varsigma^n \tag{8} \]
where $\varsigma = \cos(\vartheta)$, the medium radiance may be expressed as [21]
\[ L_m(x_a, x_b, \vec{\omega}) = \frac{\kappa_s}{h} e^{\kappa_t (x_a - x_h)} \cdot 2 \sum_{n=0}^{N-1} c(n) \sum_{k=0}^{2n} d(n, k) \int_{v_a}^{v_b} \frac{e^{-Hv}}{(v^2 + 1)^{n+1}} \, v^k \, dv. \tag{9} \]
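Whenever the light source is isotropic, the coefficients are
defined as $N = N_\Phi$, $c(n) = c_\Phi(n)$ and $d(n, k) = \big(1 - (k \bmod 2)\big)\, d_\Phi(n, k)$ with
\[ d_\Phi(n, k) = (-1)^{\frac{k}{2}} \binom{n}{\frac{k}{2}} \tag{10} \]
while if the phase function is isotropic, they read instead $N = N_I$,
$c(n) = c_I(n)$ and $d(n, k) = d_I(n, k)$ with
\[ d_I(n, k) = \sum_{\substack{l = k \bmod 2 \\ l \mathrel{+}= 2}}^{n - |n - k|} (-1)^{n - \frac{k + l}{2}} \, (2 d_c)^l \, d_{el}^{\,n - l} \binom{n}{l} \binom{n - l}{\frac{k - l}{2}} \tag{11} \]
where $l \mathrel{+}= 2$ indicates increments of two in the iterator. Also, if
both are anisotropic, then $N = N_\Phi + N_I - 1$, $c(n) = 1$ and
\[ d(n, k) = \sum_{m = \max\{0,\, n - N_I + 1\}}^{\min\{N_\Phi - 1,\, n\}} c_\Phi(m) \, c_I(n - m) \cdot \sum_{\substack{l = \max\{0,\, 2\lceil \frac{k}{2} \rceil - 2(n - m)\} \\ l \mathrel{+}= 2}}^{\min\{2m,\, 2\lfloor \frac{k}{2} \rfloor\}} d_\Phi(m, l) \, d_I(n - m, k - l). \tag{12} \]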
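The coefficients of Equations 10 and 11 can be checked numerically against the substitutions they encode. The following is a minimal Python sketch, not the authors' implementation: the function names (`d_phi`, `d_light`, `polynomial`, `expanded`) and the sample values of `v`, `d_c`, `d_el` and the polynomial coefficients are illustrative assumptions. It compares a direct evaluation of the polynomial at the substituted argument with the double sum over $d(n, k)\, v^k / (v^2 + 1)^n$.

```python
from math import comb

def d_phi(n, k):
    """Equation 10 with the (1 - (k mod 2)) parity mask applied."""
    if k % 2:
        return 0
    return (-1) ** (k // 2) * comb(n, k // 2)

def d_light(n, k, d_c, d_el):
    """Equation 11: l runs from (k mod 2) to n - |n - k| in steps of two."""
    total = 0.0
    for l in range(k % 2, n - abs(n - k) + 1, 2):
        sign = (-1) ** (n - (k + l) // 2)
        total += (sign * (2.0 * d_c) ** l * d_el ** (n - l)
                  * comb(n, l) * comb(n - l, (k - l) // 2))
    return total

def polynomial(coeffs, x):
    """Evaluate sum_n coeffs[n] * x^n (Equations 7 and 8)."""
    return sum(c * x ** n for n, c in enumerate(coeffs))

def expanded(coeffs, d, v):
    """Evaluate sum_n c(n) sum_k d(n, k) v^k / (v^2 + 1)^n."""
    total = 0.0
    for n, c in enumerate(coeffs):
        inner = sum(d(n, k) * v ** k for k in range(2 * n + 1))
        total += c * inner / (v * v + 1.0) ** n
    return total

# Hypothetical sample inputs, chosen only for the check.
coeffs = [0.5, 0.3, -0.2, 0.1]
v, d_c, d_el = 0.7, 0.4, -0.6
mu = -(v * v - 1.0) / (v * v + 1.0)
sigma = (d_el * (v * v - 1.0) + 2.0 * d_c * v) / (v * v + 1.0)

phase_direct = polynomial(coeffs, mu)
phase_expanded = expanded(coeffs, d_phi, v)
light_direct = polynomial(coeffs, sigma)
light_expanded = expanded(coeffs, lambda n, k: d_light(n, k, d_c, d_el), v)
```

Both pairs of values should agree to floating-point precision, which is a convenient regression test before moving on to the integral itself.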
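The integral in Equation 9 may then be solved in closed form
[21] using the antiderivative reported in Equation 13 (see separate
figure) where $n \in \mathbb{N}$ and $m \in \mathbb{N}^*$ while the $E$ function reads
\[ E(a, v, j) = \frac{1}{2} \left( \frac{\imath^j}{e^{\imath a}} \operatorname{Ei}(av + \imath a) + \frac{e^{\imath a}}{\imath^j} \operatorname{Ei}(av - \imath a) \right) = (-1)^{\lfloor \frac{j}{2} \rfloor} \, i_{(1 - (j \bmod 2))}(a, v) \tag{14} \]
with the imaginary unit $\imath^2 = -1$ while the real and imaginary parts
of the complex-valued exponential integral function Ei [1] define
\[ i_0(a, v) = \sin(a) \, \Re\, \operatorname{Ei}(av + \imath a) - \cos(a) \, \Im\, \operatorname{Ei}(av + \imath a) \tag{15} \]
\[ i_1(a, v) = \cos(a) \, \Re\, \operatorname{Ei}(av + \imath a) + \sin(a) \, \Im\, \operatorname{Ei}(av + \imath a). \tag{16} \]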
Equation 13 (separate figure), the antiderivative introduced by Pegoraro et al. [21]:
\[ \int \frac{e^{av}}{(v^2 + 1)^m} \, v^n \, dv = \frac{1}{2^{m-1}} \sum_{l=0}^{m-1} \frac{1}{2^l} \binom{m - 1 + l}{m - 1} \Bigg[ \sum_{k=0}^{\min\{n,\, m-1-l\}} \binom{n}{k} \frac{a^{m-1-l-k}}{(m - 1 - l - k)!} \, E(a, v, m - n - l + k) \]
\[ \quad - \, e^{av} \sum_{k=0}^{\min\{n,\, m-1-l\}} \binom{n}{k} \sum_{j=1}^{m-1-l-k} \frac{a^{m-1-l-k-j} \, (j - 1)!}{(m - 1 - l - k)! \, (v^2 + 1)^j} \sum_{\substack{i = (m-n-l+k-j) \bmod 2 \\ i \mathrel{+}= 2}}^{\le j} (-1)^{\frac{m-n-l+k-j+i}{2}} \binom{j}{i} v^i \]
\[ \quad + \, \frac{e^{av}}{a} \sum_{k=0}^{\le n-m+l} \binom{n}{k} \sum_{j=0}^{n-m+l-k} \frac{(n - m + l - k)!}{j! \, (-a)^{n-m+l-k-j}} \sum_{\substack{i = (-m+l+k-j) \bmod 2 \\ i \mathrel{+}= 2}}^{\le j} (-1)^{\frac{-m+l+k-j+i}{2}} \binom{j}{i} v^i \Bigg] \tag{13} \]
Akin to the original method [21], this work focuses on the evaluation of the medium radiance, and surface shading calculation is
consequently assumed to be dominated by extinction phenomena.
Given a bidirectional reflectance distribution function (BRDF) $\beta$,
the background radiance may then be computed as follows
\[ L(x_b, \vec{\omega}) = \beta(\vec{\omega}, \vec{\omega}_i, \vec{n}_b) \; \vec{\omega}_i \cdot \vec{n}_b \; \frac{e^{-\kappa_t d(x_b, \vec{\omega})} \, I_c(-\vec{\omega}_i \cdot \vec{v}_l)}{d^2(x_b, \vec{\omega})} \tag{17} \]
where $\vec{n}_b$ is the surface normal at $x_b$ and $\vec{\omega}_i$ is a unit-length vector
directed towards the light source.
Due to the $O(N^4)$-complexity of the antiderivative introduced by
Pegoraro et al. [21] for an N-term input angular distribution, evaluating the medium radiance as given in Equation 9 exhibits an overall
cost of the order of $O(N^6)$. In the subsequent sections, an analysis
of how the asymptotic behavior of the solution can be notably altered is carried out so as to substantially decrease its computational
cost while preserving the closed-form nature of the approach.
4 COMPLEXITY REDUCTION FRAMEWORK
This section presents our computational-complexity reduction
framework for the closed-form solution to the single-scattering air-light integral. The expression of the medium radiance is first analyzed and an alternative formulation introduced. So as to decrease
the order of complexity of the antiderivative, a compact formulation of the latter is subsequently provided before explaining how
the reuse of recurrent mathematical entities can be exploited to further improve the asymptotic behavior of the solution.
4.1 Complexity Reduction of the Medium Radiance
The expression of the medium radiance as given in Equation 9 directly follows from the formulation of the product of the phase
function and light distribution derived by Pegoraro et al. [21]
\[ \Phi_c\!\left( -\frac{v^2 - 1}{v^2 + 1} \right) I_c\!\left( \frac{d_{el}(v^2 - 1) + 2 d_c v}{v^2 + 1} \right) = \sum_{n=0}^{N-1} c(n) \sum_{k=0}^{2n} d(n, k) \frac{v^k}{(v^2 + 1)^n}. \tag{18} \]
In order to alter the asymptotic cost of the formulation, the numerator and denominator of each
summand in Equation 18 may be multiplied by $(v^2 + 1)^{N-1-n}$ such
that all terms have a common denominator. Expanding the corresponding multiplier in the numerator using the binomial theorem,
the above product is rewritten as
\[ \Phi_c\!\left( -\frac{v^2 - 1}{v^2 + 1} \right) I_c\!\left( \frac{d_{el}(v^2 - 1) + 2 d_c v}{v^2 + 1} \right) = \frac{1}{(v^2 + 1)^{N-1}} \sum_{n=0}^{N-1} c(n) \sum_{k=0}^{2n} d(n, k) \, v^k \sum_{i=0}^{N-1-n} \binom{N - 1 - n}{i} v^{2i}. \tag{19} \]
Substituting $j = k + 2i$ and rearranging the terms then yields
\[ \Phi_c\!\left( -\frac{v^2 - 1}{v^2 + 1} \right) I_c\!\left( \frac{d_{el}(v^2 - 1) + 2 d_c v}{v^2 + 1} \right) = \frac{1}{(v^2 + 1)^{N-1}} \sum_{j=0}^{2(N-1)} b(j) \, v^j \tag{20} \]
where we define
\[ b(j) = \sum_{n=0}^{N-1} c(n) \sum_{\substack{k = \max\{j \bmod 2,\; j - 2(N-1-n)\} \\ k \mathrel{+}= 2}}^{\min\{2n,\, j\}} d(n, k) \binom{N - 1 - n}{\frac{j - k}{2}}. \tag{21} \]
Substituting Equation 20 into Equation 6, the medium radiance
finally becomes
\[ L_m(x_a, x_b, \vec{\omega}) = \frac{\kappa_s}{h} e^{\kappa_t (x_a - x_h)} \cdot 2 \sum_{j=0}^{2(N-1)} b(j) \int_{v_a}^{v_b} \frac{e^{-Hv}}{(v^2 + 1)^N} \, v^j \, dv. \tag{22} \]
In essence, the above result increases the depth of the computation of the d(n, k) coefficients by 1 while decreasing that of the computation of the integral by 1 relative to Equation 9. It follows that
the overall complexity of the evaluation scheme will be reduced (respectively increased) whenever the order of complexity of calculating the d(n, k) coefficients is strictly smaller (respectively greater)
than that of the integral minus 1 as detailed in Section 5. While the
theoretical complexity is unchanged if both terms are equal, gains
in performance might still be obtained as illustrated in Section 7.
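As an illustration of Section 4.1, the sketch below builds the $b(j)$ coefficients of Equation 21 for an anisotropic phase function and checks that the common-denominator form of Equation 20 matches the nested form of Equation 18. It is a hedged Python example rather than the paper's code: `b_coefficients`, `product_eq18`, `product_eq20` and the sample coefficients are names and values assumed for the purpose of the check.

```python
from math import comb

def d_phi(n, k):
    """Equation 10 with its parity mask (isotropic light source case)."""
    return 0 if k % 2 else (-1) ** (k // 2) * comb(n, k // 2)

def b_coefficients(c, d):
    """Equation 21: one coefficient b(j) per power v^j, j = 0..2(N-1)."""
    N = len(c)
    b = []
    for j in range(2 * (N - 1) + 1):
        total = 0.0
        for n in range(N):
            k_min = max(j % 2, j - 2 * (N - 1 - n))
            for k in range(k_min, min(2 * n, j) + 1, 2):
                total += c[n] * d(n, k) * comb(N - 1 - n, (j - k) // 2)
        b.append(total)
    return b

def product_eq18(c, d, v):
    """Nested form: sum_n c(n) sum_k d(n, k) v^k / (v^2 + 1)^n."""
    return sum(cn * sum(d(n, k) * v ** k for k in range(2 * n + 1))
               / (v * v + 1.0) ** n for n, cn in enumerate(c))

def product_eq20(b, N, v):
    """Common-denominator form: sum_j b(j) v^j / (v^2 + 1)^(N-1)."""
    return sum(bj * v ** j for j, bj in enumerate(b)) / (v * v + 1.0) ** (N - 1)

# Hypothetical 4-term phase function polynomial.
c_phi = [0.5, 0.3, -0.2, 0.1]
b = b_coefficients(c_phi, d_phi)
lhs = product_eq18(c_phi, d_phi, 1.3)
rhs = product_eq20(b, len(c_phi), 1.3)
```

Note that the $b(j)$ coefficients depend on the physical parameters only through $c(n)$ and $d(n, k)$, so in this sketch they are computed once per integral and then shared by all $2N - 1$ antiderivative evaluations of Equation 22.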
4.2 Complexity Reduction of the Antiderivative
The formulation of the antiderivative from Equation 13 is now analyzed. Defining the following function for algebraic convenience
\[ I(j) = \frac{1}{2} \left( \imath^j + \frac{1}{\imath^j} \right) = (-1)^{\lfloor \frac{j}{2} \rfloor} \big( 1 - (j \bmod 2) \big) \tag{23} \]
and expanding the various levels of nested sums, the solution may
be reformulated as
\[ \int \frac{e^{av}}{(v^2 + 1)^m} \, v^n \, dv = \frac{1}{2^{m-1}} \Big( F_1(a, v, m, n) + e^{av} F_2(a, v, m, n) + e^{av} F_3(a, v, m, n) \Big) \tag{24} \]
with $F_1$, $F_2$ and $F_3$ given in Equations 25, 26 and 27 respectively.
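The two forms of $I(j)$ in Equation 23, and the periodicity $I(j \pm 2k) = (-1)^k I(j)$ that the later derivations rely on, are easy to confirm numerically. The short Python check below is only a sketch; the function names are assumptions made for illustration.

```python
def i_closed(j):
    """Right-hand side of Equation 23 (floor division handles negative j)."""
    return (-1) ** (j // 2) * (1 - j % 2)

def i_complex(j):
    """Left-hand side of Equation 23, evaluated with complex arithmetic."""
    return (0.5 * (1j ** j + 1j ** (-j))).real

# Largest deviation between both forms over a range of signed indices.
max_error = max(abs(i_closed(j) - i_complex(j)) for j in range(-8, 9))

# Periodicity used to decouple the nested sums: I(j +/- 2k) = (-1)^k I(j).
periodic = all(i_closed(j + 2 * k) == (-1) ** k * i_closed(j)
               and i_closed(j - 2 * k) == (-1) ** k * i_closed(j)
               for j in range(-6, 7) for k in range(0, 5))
```

Since $I(j)$ vanishes for odd $j$, half of the inner-loop iterations in Equations 26 and 27 contribute nothing, which is what the parity-stepped iterators of Equation 13 express explicitly.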
The terms of Equations 25 to 27 read
\[ F_1(a, v, m, n) = \sum_{l=0}^{m-1} \frac{1}{2^l} \binom{m - 1 + l}{m - 1} \sum_{k=0}^{\min\{n,\, m-1-l\}} \binom{n}{k} \frac{a^{m-1-l-k}}{(m - 1 - l - k)!} \, E(a, v, m - n - l + k) \tag{25} \]
\[ F_2(a, v, m, n) = \sum_{l=0}^{m-1} \frac{1}{2^l} \binom{m - 1 + l}{m - 1} \sum_{k=0}^{\min\{n,\, m-1-l\}} \binom{n}{k} \sum_{j=1}^{m-1-l-k} \frac{-(j - 1)! \; a^{m-1-l-k-j}}{(m - 1 - l - k)! \, (v^2 + 1)^j} \sum_{i=0}^{j} \binom{j}{i} I(m - n - l + k - j + i) \, v^i \tag{26} \]
\[ F_3(a, v, m, n) = \sum_{l=0}^{m-1} \frac{1}{2^l} \binom{m - 1 + l}{m - 1} \sum_{k=0}^{\le n-m+l} \binom{n}{k} \sum_{j=0}^{n-m+l-k} \frac{(n - m + l - k)!}{j!} \, \frac{a^{-1}}{(-a)^{n-m+l-k-j}} \sum_{i=0}^{j} \binom{j}{i} I(-m + l + k - j + i) \, v^i \tag{27} \]
As the theoretical complexity of the antiderivative evaluation is
dominated by $F_2$ and $F_3$, reducing the order of $F_1$ will not directly
impact the former but will certainly affect the practical efficiency of
the computation. To this end, we start by permuting the inner and
outer loops, and rearrange the terms to formulate $F_1$ as in Equation 28
\[ F_1(a, v, m, n) = \sum_{l=0}^{m-1} \frac{a^l}{l!} \sum_{k=\max\{0,\, m-1-l-n\}}^{m-1-l} \frac{1}{2^k} \binom{m - 1 + k}{m - 1} \binom{n}{m - 1 - l - k} E(a, v, 2m - n - 1 - l - 2k). \tag{28} \]
Exploiting the identity $E(a, v, j \pm 2k) = (-1)^k E(a, v, j)$, it is
then possible to express $F_1$ in terms of the following coefficients
\[ C(m', n', l') = \sum_{i=\max\{0,\, l'-n'\}}^{l'} \left( -\frac{1}{2} \right)^{i} \binom{m' + i}{m'} \binom{n'}{l' - i} = \sum_{j=0}^{\min\{l',\, n'\}} \left( -\frac{1}{2} \right)^{l'-j} \binom{m' + l' - j}{m'} \binom{n'}{j}. \tag{31} \]
While the order of complexity of $F_1$ as given in Equation 32 remains
$O(N^2)$, Section 4.3 will discuss how this new formulation may be
exploited to ultimately reduce the former order.
Repeatedly permuting the inner and outer loops in a similar fashion, $F_2$ may then be expressed as in Equation 29
\[ F_2(a, v, m, n) = \sum_{l=0}^{m-2} a^l \sum_{k=1}^{m-1-l} \frac{-(k - 1)!}{(v^2 + 1)^k \, (l + k)!} \sum_{j=\max\{0,\, m-1-l-k-n\}}^{m-1-l-k} \frac{1}{2^j} \binom{m - 1 + j}{m - 1} \binom{n}{m - 1 - l - k - j} \sum_{i=0}^{k} \binom{k}{i} v^i \, I(\alpha - 2j + i) \tag{29} \]
where we use the
short-hand notation $\alpha = 2m - n - 1 - l - 2k$. Making the two inner
loops independent by exploiting the identity $I(j \pm 2k) = (-1)^k I(j)$,
it follows that $F_2$ may ultimately also be expressed in terms of the $C$
coefficients as shown in Equation 33 which effectively reduces its
order of complexity from $O(N^4)$ to $O(N^3)$.
Applying such loop permutation to $F_3$ as well and rearranging
the terms, the latter may be formulated as in Equation 30
\[ F_3(a, v, m, n) = \sum_{l=0}^{n-1} \frac{1}{a^{l+1}} \sum_{k=0}^{n-1-l} \frac{(l + k)!}{k!} \, v^k \sum_{j=\max\{0,\, m-n+l+k\}}^{m-1} \frac{1}{2^j} \binom{m - 1 + j}{m - 1} \sum_{i=0}^{n-m-l-k+j} \binom{n}{i} \binom{n - m + j - i}{l + k} I(-n - l + k + 2i). \tag{30} \]
Here again exploiting the fact that $I(j \pm 2k) = (-1)^k I(j)$ and the binomial coefficient identity which, for all $I \ge 0$ and all $N \ge 1$, gives [9]
\[ \sum_{j=0}^{K} (-1)^j \binom{N + I}{j} \binom{K + I - j}{I} = \sum_{j=0}^{K} (-1)^j \binom{N}{j} = (-1)^K \binom{N - 1}{K} \tag{35} \]
the inner loop may then be collapsed and merged with the second
inner-most loop so as to also express $F_3$ in terms of the $C$ coefficients. This leads to the formulation given in Equation 34 where
$M = 1$ and which effectively reduces the order of complexity of $F_3$
from $O(N^4)$ to $O(N^3)$.
In addition, further analysis of the properties of the above coefficients reveals that $C(m', n', l') = 0$ whenever $l' = m'$, $n'$ is
odd and $n' < 2m'$. Examination of the parity of the iterators
shows that the first 2 conditions hold in the formulation of $F_3$ as
given in Equation 34 from which follows that $C \ne 0$ only when
$n - 1 - l - k \ge 2(m - 1)$. Therefore, the inner loop only needs to iterate so long as $k \le \min\{n - 1 - l,\, n - 2m + 1 - l\} = n - 2m + 1 - l$
since $m \ge 1$. Moreover, because $k$ has the same parity as $n - l$,
$k$ can never equal $n - 2m + 1 - l$ and the upper bound may consequently be replaced by $n - 2m - l$. Given that $k \ge 0$, the inner
loop can consequently only be entered if $n - 2m - l \ge 0$, from
which follows that the outer loop only needs to iterate so long as
$l \le \min\{n - 1,\, n - 2m\} = n - 2m$ since $m \ge 1$. Overall, the formulation of $F_3$ as given in Equation 34 may then be accordingly
optimized by redefining $M = 2m$. Analyzing the value of the iterators in Equation 9 or 22, it can finally be inferred that the resulting
parameters in Equation 34 always satisfy $n - 2m \le -2$ in the context of the specific problem of concern, therefore inducing the loops
therein to never be entered such that $F_3$ systematically evaluates to
zero when solving the air-light integral.
Overall, a more compact formulation of the antiderivative was
ultimately obtained by exploiting algebraic identities and extracting the underlying recurrent mathematical terms contained therein.
The new expressions result in a reduction of the depth of the nested
loops, effectively decreasing the order of complexity of the evaluation of the antiderivative from $O(N^4)$ to $O(N^3)$.
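The properties of the $C$ coefficients used in this section can be verified with exact rational arithmetic. The sketch below is an illustrative Python check, with hypothetical function names, of three statements from the text: the two summation forms of Equation 31 agree, $C(m', n', l')$ vanishes when $l' = m'$ with $n'$ odd and $n' < 2m'$, and the binomial identity of Equation 35 holds.

```python
from fractions import Fraction
from math import comb

def c_first(m, n, l):
    """First form of Equation 31, summing over i."""
    return sum(Fraction(-1, 2) ** i * comb(m + i, m) * comb(n, l - i)
               for i in range(max(0, l - n), l + 1))

def c_second(m, n, l):
    """Second form of Equation 31, summing over j = l - i."""
    return sum(Fraction(-1, 2) ** (l - j) * comb(m + l - j, m) * comb(n, j)
               for j in range(min(l, n) + 1))

def identity35_lhs(K, N, I):
    """Left-most sum of Equation 35."""
    return sum((-1) ** j * comb(N + I, j) * comb(K + I - j, I)
               for j in range(K + 1))

# Both forms agree on a small index range.
forms_agree = all(c_first(m, n, l) == c_second(m, n, l)
                  for m in range(5) for n in range(9) for l in range(5))

# C vanishes when l' = m', n' is odd and n' < 2m'.
zeros_hold = all(c_first(m, n, m) == 0
                 for m in range(1, 6) for n in range(1, 2 * m, 2))

# Equation 35 for I >= 0 and N >= 1.
identity_holds = all(identity35_lhs(K, N, I) == (-1) ** K * comb(N - 1, K)
                     for K in range(6) for N in range(1, 6) for I in range(5))
```

Using `Fraction` avoids any floating-point ambiguity when testing for exact zeros; a production implementation would more likely store the coefficients as floating-point values.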
4.3 Antiderivative Evaluation with Coefficients Reuse
Equations 32 to 34, which Section 4.2 refers to, express $F_1$, $F_2$ and $F_3$ in terms of the $C$ coefficients:
\[ F_1(a, v, m, n) = \sum_{l=0}^{m-1} \frac{a^l}{l!} \, C(m - 1, n, m - 1 - l) \, E(a, v, 2m - n - 1 - l) \tag{32} \]
\[ F_2(a, v, m, n) = \sum_{l=0}^{m-2} a^l \sum_{k=0}^{m-2-l} \frac{k!}{(v^2 + 1)^{k+1} \, (l + 1 + k)!} \, C(m - 1, n, m - 2 - l - k) \sum_{\substack{j = (n+1+l) \bmod 2 \\ j \mathrel{+}= 2}}^{k+1} (-1)^{\frac{2m-n-1-l-2k+j}{2}} \binom{k + 1}{j} v^j \tag{33} \]
\[ F_3(a, v, m, n) = \sum_{l=0}^{n-M} \frac{1}{a^{l+1}} \sum_{\substack{k = (n+l) \bmod 2 \\ k \mathrel{+}= 2}}^{n-M-l} v^k \, (-1)^{\frac{n-2m+l-k}{2}} \, \frac{(l + k)!}{k!} \, C(m - 1, n - 1 - l - k, m - 1) \tag{34} \]
In order to further decrease the order of complexity of the antiderivative, we note that a manipulation similar to the one introduced in Section 4.1 can also be applied to Equation 33. More
specifically, multiplying the numerator and denominator of each summand by $(v^2 + 1)^{m-2-l-k}$ allows the
exponents of both $v^2 + 1$ and $a$ to solely depend on a common iterator. Expanding the corresponding multiplier in the numerator using
the binomial theorem and rearranging the terms, the formulation of
$F_2$ may then be rewritten as
\[ F_2(a, v, m, n) = \sum_{l=0}^{m-2} \frac{a^l}{(v^2 + 1)^{m-1-l}} \sum_{\substack{k = (n+1+l) \bmod 2 \\ k \mathrel{+}= 2}}^{2(m-2-l)+1} D(m, n, l, k) \, v^k \tag{36} \]
where the $D(m, n, l, k)$ coefficients are given in Equation 37
\[ D(m, n, l, k) = \sum_{j=0}^{\min\{m-2-l,\; 2(m-2-l)+1-k\}} \frac{j!}{(l + 1 + j)!} \, C(m - 1, n, m - 2 - l - j) \sum_{i=\max\{0,\, \lceil \frac{k-j-1}{2} \rceil\}}^{\min\{m-2-l-j,\, \lfloor \frac{k}{2} \rfloor\}} (-1)^{\lfloor \frac{2m-n-1-l+k-2j-2i}{2} \rfloor} \binom{m - 2 - l - j}{i} \binom{j + 1}{k - 2i}. \tag{37} \]
As both the $C$ and $D$ coefficients are entirely defined by their
integer indices whose range is solely determined by the maximum
allowable degree of the angular distribution, they are intrinsically
independent of the actual physical parameters of the problem and
their computation may consequently be decoupled from that of the
air-light integral itself. It follows that both can be a priori tabulated once only for a given range of indices and then reused multiple times while conducting a series of integral evaluations with
various physical configurations. This allows the coefficients to be
accessed in constant time during rendering, in turn reducing the order of complexity of computing $F_1$ as given in Equation 32 from
$O(N^2)$ to $O(N)$, that of $F_2$ from $O(N^3)$ in Equation 33 to $O(N^2)$ in
Equation 36, and that of $F_3$ in Equation 34 from $O(N^3)$ to $O(N^2)$.
Overall, tabulating the coefficients results in yet another reduction
of the depth of the nested loops involved in the formulation of the
antiderivative, effectively providing a means of decreasing the computational complexity of its evaluation from $O(N^3)$ to $O(N^2)$.
It is also crucial to emphasize that tabulating the coefficients
does not introduce any approximation and consequently entirely
preserves the closed-form nature of the solution. Indeed, the actual results of the precomputation are simply read back during the
integral evaluation process based on their integer indices without
involving interpolation or extrapolation of any kind.
For an N-term polynomial representation of the product of the
phase function and light distribution, the indices of the $D(m, n, l, k)$
coefficients are constrained within
\[ \begin{array}{rcccl} 1 & \le & m & \le & N \\ 0 & \le & n & \le & 2(N - 1) \\ 0 & \le & l & \le & N - 2 \\ 0 & \le & k & \le & 2(N - 2) + 1. \end{array} \tag{38} \]
With respect to the $C(m', n', l')$ coefficients, analyzing the behavior of their indices yields the following bounds
\[ \begin{array}{rcccl} 0 & \le & m' & \le & N - 1 \\ 0 & \le & n' & \le & 2(N - 1) \\ 0 & \le & l' & \le & N - 1. \end{array} \tag{39} \]
Noting that $C(m', n', l') = 1$ whenever $l' = 0$ for which explicitly
precomputing the result is unnecessary, the table of $C(m', n', l')$ coefficients may therefore be appended as an additional layer in the
$k$ dimension to the table of $D(m, n, l, k)$ coefficients. It follows that
for up to N-term polynomials, the overall size of the table is given
as $N \times (2N - 1) \times (N - 1) \times (2N - 1)$. Although precomputing and
storing a 4-dimensional table might seem prohibitive, this is actually not a concern in practice since very low resolutions typically
accommodate most applications as illustrated in Section 7.
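To make the tabulation idea concrete, the following Python sketch precomputes the $C$ coefficients over the index bounds of Equation 39 and evaluates the table-size formula given above. It is an assumption-laden illustration: the nested-list layout, the function names and the 4-byte element size are choices made here, not a description of the authors' texture layout.

```python
from math import comb

def c_coefficient(m, n, l):
    """Equation 31 (first form), evaluated in floating point."""
    return sum((-0.5) ** i * comb(m + i, m) * comb(n, l - i)
               for i in range(max(0, l - n), l + 1))

def build_c_table(N):
    """Table indexed as table[m'][n'][l'] over the bounds of Equation 39."""
    return [[[c_coefficient(m, n, l) for l in range(N)]
             for n in range(2 * N - 1)]
            for m in range(N)]

def table_size(N):
    """Number of stored values: N x (2N-1) x (N-1) x (2N-1)."""
    return N * (2 * N - 1) * (N - 1) * (2 * N - 1)

# Integer-indexed lookups replace the inner summations at run time.
table = build_c_table(4)
lookup = table[2][3][1]

# Storage for up to 16-term polynomials, assuming 4-byte floats.
values_16 = table_size(16)
megabytes_16 = values_16 * 4 / 2 ** 20
```

For $N = 16$ this gives 230640 values, that is roughly 0.88 MB in single precision, in line with the figures quoted in Section 7. Because lookups use integer indices only, no interpolation error is introduced.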
5 ANALYSIS OF THE COMPUTATIONAL COMPLEXITY
To provide insightful characteristics of the computational cost of
the different formulations, Table 1 summarizes the order of complexity of each individual calculation scheme assuming that the various factorial and binomial terms are computed via incremental update. As anticipated, the overall order of complexity of the schemes
is always monotonically decreasing (not necessarily strictly) when
evaluating the indefinite integral using the original formulation
from Pegoraro et al. [21], the reduced formulation from Section 4.2,
and the formulation with coefficients reuse from Section 4.3 respectively. From a theoretical perspective, the asymptotic behavior of
the integration schemes listed in that order consequently improves
in a consistent manner. With respect to anisotropic phase functions or light sources (top 2 numbers for each of the 6 schemes),
the reduced medium radiance formulation from Section 4.1 also
contributes to improving the asymptotic behavior over the original
formulation [21]. When both distributions are anisotropic though
(bottom number for each scheme), only then the original formulation of the medium radiance exhibits more desirable characteristics.
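One informal way to read these orders, offered as a sketch rather than as the authors' own derivation, is to count loop iterations. Let $T_d$ denote the assumed cost of computing one $d(n, k)$ coefficient and $T_I$ the cost of one antiderivative evaluation; both symbols, and the per-coefficient costs listed below, are assumptions obtained by counting the nested sums in Equations 9 to 12, 21 and 22.

```latex
% T_d : assumed cost of one d(n,k) coefficient
% T_I : cost of one antiderivative evaluation
\begin{aligned}
T_{\text{Eq. 9}}  &= O\!\left(N^2 \,(T_d + T_I)\right), \\
T_{\text{Eq. 22}} &= O\!\left(N^3 \, T_d + N \, T_I\right), \\
T_d &= O(1)\ \text{(Eq. 10)}, \quad O(N)\ \text{(Eq. 11)}, \quad O(N^3)\ \text{(Eq. 12)}, \\
T_I &= O(N^4)\ \text{[21]}, \quad O(N^3)\ \text{(Section 4.2)}, \quad O(N^2)\ \text{(Section 4.3)}.
\end{aligned}
```

For example, with an anisotropic phase function and coefficients reuse, this count gives $O(N^2 (1 + N^2)) = O(N^4)$ for Equation 9 and $O(N^3 + N \cdot N^2) = O(N^3)$ for Equation 22. With both distributions anisotropic, the $N^3 \, T_d$ term of Equation 22 dominates at $O(N^6)$ regardless of the antiderivative scheme, which is consistent with the original medium radiance formulation being preferable in that case.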
Table 1: Orders of complexity of the various computational schemes
for an N-term polynomial angular distribution. The results are reported when evaluating the antiderivative using (from left to right) the
original formulation from Pegoraro et al. [21] (based on Equation 13),
the reduced formulation from Section 4.2 (based on Equations 24,
32 and 33), and the scheme with coefficients reuse from Section 4.3
(based on Equations 24, 32 and 36); while computing the medium
radiance using (from top to bottom) the original formulation from Pegoraro et al. [21] (based on Equation 9) and the reduced formulation
from Section 4.1 (based on Equation 22). For each of the 6 individual schemes (shown in the 6 colored cells), the orders are reported
for (from top to bottom) an anisotropic phase function only (based on
Equation 10), an anisotropic light distribution only (based on Equation 11), and the product of an anisotropic phase function and an
anisotropic light distribution (based on Equation 12).
Lm \ ∫        Φc      Ic      Ori. [21]   Red. (4.2)   C-r (4.3)
Ori. [21]     Ani.    Iso.    O(N^6)      O(N^5)       O(N^4)
              Iso.    Ani.    O(N^6)      O(N^5)       O(N^4)
              Ani.    Ani.    O(N^6)      O(N^5)       O(N^5)
Red. (4.1)    Ani.    Iso.    O(N^5)      O(N^4)       O(N^3)
              Iso.    Ani.    O(N^5)      O(N^4)       O(N^4)
              Ani.    Ani.    O(N^6)      O(N^6)       O(N^6)
AirLightIntegral()
1. Compute Lm for the given cΦ(n) and cI(n) coefficients;
2.    Compute bounds of integral as in Equation 5;
3.    Compute Lm as in Equation 9 or 22;
4.       Compute d(n, k) coefficient as in Equation 10/11/12;
5.       Compute integral using Equations 24, 32 and 33|36;
6. Compute Lr as in Equations 2 and 17;
7. Compute L as in Equation 1;
[Figure 3 plot residue. Ordinate: log2(computationalTime). Legend entries, given as Lm scheme − ∫ scheme ⇒ value shown in the legend:
First panel: Ori. [21] − Ori. [21] ⇒ 5.79; Ori. [21] − Red. (4.2) ⇒ 4.74; Ori. [21] − C-r (4.3) ⇒ 3.85; Red. (4.1) − Ori. [21] ⇒ 4.84; Red. (4.1) − Red. (4.2) ⇒ 3.80; Red. (4.1) − C-r (4.3) ⇒ 2.88.
Second panel: Ori. [21] − Ori. [21] ⇒ 5.79; Ori. [21] − Red. (4.2) ⇒ 4.74; Ori. [21] − C-r (4.3) ⇒ 3.85; Red. (4.1) − Ori. [21] ⇒ 4.84; Red. (4.1) − Red. (4.2) ⇒ 3.79; Red. (4.1) − C-r (4.3) ⇒ 3.40.]
6 IMPLEMENTATION
Despite the cumbersomeness of the intermediate mathematical
steps involved in the derivation, the expressions ultimately resulting from the analysis are actually simpler than those of the initial
solution by Pegoraro et al. [21]. This eases the process of implementing the method which really only entails literally converting
the few relevant equations into code, as identified in the algorithm
outline provided in Figure 2. To facilitate portability onto graphics
hardware, we used a platform-independent implementation of the
exponential integral [19] in our experiments while the Boost C++
libraries may alternatively be used on the CPU.
Regarding efficiency, the observations made about the original
solution [21] similarly hold and redundant computation may be
easily avoided by noting that the terms i0 and i1 in Equation 14
are independent of the value of the iterators from Equation 32 and
may consequently be evaluated once only before entering the loops.
Likewise, all the power and factorial terms as well as the binomial
coefficients involved can be efficiently computed via pre-iterative
initialization and constant-time incremental update throughout the
iterations. To this end, applying basic manipulations of the type
$\sum_l^m \frac{a^l}{b^{m-l}} = \frac{1}{b^m} \sum_l^m a^l b^l$ to Equation 36 then allows the external factor
to be ultimately generated as a by-product of the iterative process.
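The incremental-update strategy described above can be sketched as follows. This Python example is illustrative only: `sum_incremental` and `running_terms` are hypothetical helper names, and the code demonstrates the general pattern (one multiplication per iteration instead of a fresh power, factorial or binomial evaluation) rather than the actual loops of Equations 32 and 36.

```python
from math import comb, factorial

def sum_direct(a, b, m):
    """Reference: sum_{l=0}^{m} a^l / b^(m-l), recomputing powers each time."""
    return sum(a ** l / b ** (m - l) for l in range(m + 1))

def sum_incremental(a, b, m):
    """Same sum as (1 / b^m) * sum_l (a*b)^l with constant-time updates."""
    acc, term, b_pow = 0.0, 1.0, 1.0
    for l in range(m + 1):
        acc += term            # term holds (a*b)^l
        term *= a * b
        if l < m:
            b_pow *= b         # b_pow ends up as b^m, a by-product of the loop
    return acc / b_pow

def running_terms(a, n, count):
    """Yield (a^l / l!, binom(n, l)) for l = 0..count-1 via incremental update."""
    power_over_factorial, binomial = 1.0, 1
    for l in range(count):
        yield power_over_factorial, binomial
        power_over_factorial *= a / (l + 1)
        binomial = binomial * (n - l) // (l + 1)

direct = sum_direct(0.8, 1.7, 6)
incremental = sum_incremental(0.8, 1.7, 6)
terms = list(running_terms(1.3, 7, 10))
```

The binomial update uses exact integer arithmetic, so it naturally yields zero once $l$ exceeds $n$.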
Figure 2: Algorithm outline showing the equations involved in evaluating the air-light integral using our complexity reduction framework.
Figure 3: Computational time required to evaluate the air-light integral against the number of terms N in the input polynomial angular
distribution of degree N − 1, plotted on a logarithmic scale with base
2 (abscissa: log2(N = polynomialDegree + 1)). The results are shown for an anisotropic phase function (left) and
an anisotropic light source (right) for each of the 6 individual evaluation schemes introduced in Table 1 (see color legend). A difference
of 1 unit along the ordinates corresponds to a factor of 2 in computational cost, and the slope of each curve defines in the limit, i.e. on
the far right, the order of complexity of the corresponding scheme.
7 RESULTS
As Table 1 suggests, the benefits of the alternative formulations introduced in this paper are more prominent when either the phase
function or the light distribution is anisotropic, while a notable but
less substantial decrease in computational complexity may still be
observed when both are anisotropic. In this regard, the analysis to
follow consequently focuses on the former cases.
In order to validate the theoretical orders of complexity reported
in the latter table, measurements of the computational time when
increasing the degree of a general polynomial anisotropic distribution with exclusively non-zero coefficients were carried out while
arbitrarily setting the remaining physical parameters as they do not
impact the asymptotic behavior of the evaluation schemes. The results for an anisotropic phase function and an anisotropic light distribution are respectively reported in Figure 3 which additionally
illustrates the relative performance of the various schemes.
To evaluate the practical performance characteristics of the
computational-complexity reduction framework, the method was
additionally implemented both in an off-line CPU-based rendering
system as well as in a GPU-based fragment shader using OpenGL
and Cg. All measurements were done on a Windows XP 64-bit machine with a 2.66 GHz Intel Core 2 Quad CPU and an NVIDIA
GeForce GTX 280. In all cases, the air-light integral was evaluated
independently for each color channel, i.e. 3 times per fragment.
The table of coefficients was sized to accommodate up to 16-term polynomials, which typically exceeds the needs of most applications. The 230640 values barely amount to 0.88 MB using a
single-precision floating-point representation and are precomputed
in less than 70 milliseconds on a single CPU core. Due to the low
computational cost entailed by the predetermined number of iterations involved, it is actually possible to carry out the precomputation step on the fly either upon starting the application or dynamically whenever the user changes the degree of the angular representation, as an alternative to statically tabulating and storing the coefficients on disk for future reuse. In the GPU-based implementation,
the resulting look-up table is loaded into a texture buffer indexed in
the fragment shader by integer coordinates with filtering turned off.
7.1 Low-Degree Angular Distributions on the GPU
Figure 4 illustrates the results obtained with our GPU implementation for various anisotropic functions. These reference images
were rendered in closed form at a fraction of the cost of the original formulation of the solution using our computational-complexity
reduction framework which yields optimal overall speed-ups of
3.28×, 2.34×, 4.51× and 3.66× for an Eddington phase function
(also called linear anisotropic), a Rayleigh phase function, and a
spotlight and light ball both of degree 4, respectively.
Figure 4: The Sibenik cathedral filled with anisotropic dust modeled with a forward-scattering Eddington phase function (left) and a
Rayleigh phase function (right). The results were computed in closed
form and rendered in real-time on current-generation graphics hardware using our new formulations of the solution.
Table 2: Break-down of the performance characteristics of the computational schemes for low-degree angular distributions, measured
in frame rates (in frames per second) on the GPU at a resolution of
768×768. For each of the 6 individual schemes introduced in Table 1
(shown in the 6 colored cells), the results are reported for an Eddington phase function (degree 1), a Rayleigh phase function (degree 2),
and a spotlight and light ball both of degree 4, from top to bottom
respectively. The overall speed-up achieved by the optimal computational scheme (highlighted) compared to the original formulation (in
red) is 3.28×, 2.34×, 4.51× and 3.66× for each anisotropic distribution respectively, while the intermediate speed-ups contributed by
the individual formulations are indicated between the corresponding
cells (progressively from left to right, and from top to bottom).
Lm \ antiderivative  Distribution  Ori. [21]  Speed-up  Red. (4.2)  Speed-up  C-r (4.3)
Ori. [21]            Eddington     102 FPS    3.25×     332 FPS     0.98×     326 FPS
Ori. [21]            Rayleigh      58.1 FPS   1.24×     72.3 FPS    1.47×     106 FPS
Ori. [21]            Spotlight4    5.43 FPS   2.36×     12.8 FPS    1.91×     24.5 FPS
Ori. [21]            Light ball4   4.37 FPS   1.40×     6.11 FPS    1.85×     11.3 FPS
Speed-up (↓)         Eddington     1.01×                1.01×                 1.01×
Speed-up (↓)         Rayleigh      1.34×                1.35×                 1.28×
Speed-up (↓)         Spotlight4    0.96×                0.95×                 0.94×
Speed-up (↓)         Light ball4   1.01×                1.57×                 1.42×
Red. (4.1)           Eddington     103 FPS    3.25×     335 FPS     0.98×     328 FPS
Red. (4.1)           Rayleigh      77.8 FPS   1.25×     97.4 FPS    1.40×     136 FPS
Red. (4.1)           Spotlight4    5.22 FPS   2.34×     12.2 FPS    1.89×     23.0 FPS
Red. (4.1)           Light ball4   4.42 FPS   2.17×     9.57 FPS    1.67×     16.0 FPS
Figure 5: A typical science-fiction scenario depicting a city drawn in
haze illuminated by the focused spotlights of an alien drone. This
reference image was rendered in closed form at a fraction of the cost
of the original solution [21] using our complexity reduction framework.
A break-down of the contributions of each individual component
of the complexity reduction framework is provided in Table 2 which
reports the frame rates obtained on the GPU for the air-light calculation only, with the 6 computational schemes described in Section 5. Regarding the antiderivative, the reduced formulation
presented in Section 4.2 always achieves by itself speed-ups over the
original formulation [21], consequently leading to systematic gains
in efficiency. Due to the extremely low degree of the representation of the Eddington phase function, the coefficients-amortization
formulation from Section 4.3 actually entails minor slow-downs
over the reduced formulation of Section 4.2 in that very specific
case. However, given the latter marginal efficiency loss compared
to the substantial speed-ups that it alone achieves in all other scenarios, the former precomputational approach is overall highly beneficial whenever memory access is not a limitation. With respect to
anisotropic phase functions, the reduced formulation of the medium
radiance presented in Section 4.1 yields negligible speed-ups for
the Eddington phase function as well as more considerable performance gains for the Rayleigh phase function. Due to the degenerate
nature of the spotlight representation allowing the external loop in
Equation 9 to skip all terms but the single one for which the cI (n)
coefficient is non-zero, there is no benefit in using Equation 22 in
that specific case and the original formulation of the medium radiance [21] is consequently preferable in such a setting. However, for
more general light sources such as a light ball, the reduced medium
radiance formulation from Section 4.1 becomes highly beneficial as
it then alone achieves notable speed-ups.
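As a small cross-check of the discussion above, the sketch below recomputes the optimal scheme and the overall speed-up for each distribution from the frame rates listed in Table 2. The dictionary layout, the short scheme labels and the helper name `best_scheme` are illustrative choices, not part of the original implementation.

```python
# Frame rates (FPS) transcribed from Table 2. Keys are
# (medium radiance formulation, antiderivative formulation), where
# "Ori" = original [21], "Red" = reduced (Sections 4.1 / 4.2) and
# "C-r" = the Section 4.3 formulation. Labels are illustrative only.
FPS = {
    "Eddington": {
        ("Ori", "Ori"): 102, ("Ori", "Red"): 332, ("Ori", "C-r"): 326,
        ("Red", "Ori"): 103, ("Red", "Red"): 335, ("Red", "C-r"): 328,
    },
    "Rayleigh": {
        ("Ori", "Ori"): 58.1, ("Ori", "Red"): 72.3, ("Ori", "C-r"): 106,
        ("Red", "Ori"): 77.8, ("Red", "Red"): 97.4, ("Red", "C-r"): 136,
    },
    "Spotlight4": {
        ("Ori", "Ori"): 5.43, ("Ori", "Red"): 12.8, ("Ori", "C-r"): 24.5,
        ("Red", "Ori"): 5.22, ("Red", "Red"): 12.2, ("Red", "C-r"): 23.0,
    },
    "Light ball4": {
        ("Ori", "Ori"): 4.37, ("Ori", "Red"): 6.11, ("Ori", "C-r"): 11.3,
        ("Red", "Ori"): 4.42, ("Red", "Red"): 9.57, ("Red", "C-r"): 16.0,
    },
}

def best_scheme(rates):
    """Return the fastest scheme and its speed-up over the original one."""
    baseline = rates[("Ori", "Ori")]
    scheme = max(rates, key=rates.get)
    return scheme, rates[scheme] / baseline

results = {name: best_scheme(rates) for name, rates in FPS.items()}
for name, (scheme, speedup) in results.items():
    print(f"{name}: best scheme {scheme}, overall speed-up {speedup:.2f}x")
```

The selected schemes match the observations made in the text: the original medium radiance formulation remains preferable for the spotlight, and the Section 4.3 formulation is marginally slower than the Section 4.2 one for the Eddington phase function only.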
7.2 High-Degree Angular Distributions on the CPU
In order to evaluate the potential speed-ups achievable for higher-degree angular distributions, off-line experiments were also conducted using a single CPU core. Figure 5 illustrates for instance the
results obtained with a spotlight representation of degree 14. Here
as well, the correct results were rendered in closed form at a much
lower cost than the original formulation of Pegoraro et al. [21] using
our new formulations of the solution, which yield an 11.0× speed-up.
Figure 6 also shows the results obtained for a high-frequency
anisotropic light distribution. Because the derivations are solely
based on exploiting the algebraic identities underlying the closed-form solution without relying on approximations, our framework
entirely preserves the generality and robustness of the approach
and, as expected, is able to generate exact results strictly identical to those of the original solution while dramatically reducing the
computational cost of the evaluation scheme by a factor of 14.5.
Figure 6: A lighthouse whose high-frequency light source illuminates
the surrounding mist. The results are rendered offline using the original formulation of the closed-form solution derived by Pegoraro et
al. [21] (left) while our new formulation is able to generate the exact
same reference image at a fraction of the cost (right).
A break-down of the contributions of each individual component
of the mathematical framework is provided in Table 3 which details
the performance characteristics of the 6 computational schemes for
the 2 aforementioned offline scenarios. Because our approach allows for reductions in the order of complexity rather than a constant
speed-up factor, the improvements in efficiency achieved by our
method are increasingly important as the degree of the input function grows and the size of the problem increases. Here again, the reduced antiderivative formulation presented in Section 4.2 achieves
substantial speed-ups over the original formulation [21] which are
in turn dramatically improved by the coefficients-amortization formulation from Section 4.3. Similarly, the reduced medium radiance formulation from Section 4.1 yields marginal slow-downs for
spotlights while achieving considerable performance gains over the
original formulation [21] for the light ball representation.
Table 3: Break-down of the performance characteristics of the computational schemes for high-degree angular distributions, measured
in rendering times (in seconds per frame) on the CPU. For each of the
6 individual schemes introduced in Table 1 (shown in the 6 colored
cells), the results are reported for the 3 spotlights of Figure 5 at a resolution of 1024×512 (top) and the light ball of Figure 6 at a resolution
of 512×512 (bottom) all of degree 14. The overall speed-up achieved
by the optimal scheme (highlighted) compared to the original formulation (in red) is 11.0× and 14.5× for each anisotropic distribution
respectively, while the intermediate speed-ups contributed by the individual formulations are indicated between the corresponding cells
(progressively from left to right, and from top to bottom).
Lm \ antiderivative  Distribution   Ori. [21]  Speed-up  Red. (4.2)  Speed-up  C-r (4.3)
Ori. [21]            Spotlights14   4776 SPF   1.84×     2599 SPF    5.99×     434 SPF
Ori. [21]            Light ball14   1663 SPF   1.56×     1068 SPF    5.06×     211 SPF
Speed-up (↓)         Spotlights14   0.99×                0.99×                 0.95×
Speed-up (↓)         Light ball14   1.82×                2.05×                 1.83×
Red. (4.1)           Spotlights14   4808 SPF   1.83×     2632 SPF    5.73×     459 SPF
Red. (4.1)           Light ball14   915 SPF    1.76×     520 SPF     4.52×     115 SPF
8 DISCUSSION AND FUTURE WORK
Due to the typically small storage requirements, our current implementation uses a simple regularly structured array with constant
sizes in each dimension allowing for straightforward access to the
data. However, the table of coefficients as previously presented is in
essence relatively sparse due to the actually tighter bounds that the
respective indices themselves define on subsequent indices. Investigating alternative storage layouts that explicitly exploit the
sparseness of the data might consequently be beneficial for applications
potentially necessitating much higher-degree representations as required for detailed modeling of a Mie phase function for instance.
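To put the storage requirements of the dense table in perspective, the following sketch redoes the size arithmetic for the 230640 values quoted earlier. The width of 4 bytes per value (e.g. single-precision floats) is an assumption inferred from the reported 0.88 MB figure rather than something stated explicitly, and the helper name `table_size_mib` is hypothetical.

```python
NUM_VALUES = 230640      # table size reported for up to 16-term polynomials
BYTES_PER_VALUE = 4      # ASSUMPTION: 32-bit values, inferred from 0.88 MB

def table_size_mib(num_values, bytes_per_value):
    """Size of a dense table in mebibytes (1 MiB = 2**20 bytes)."""
    return num_values * bytes_per_value / 2 ** 20

size_mib = table_size_mib(NUM_VALUES, BYTES_PER_VALUE)
print(f"dense table: {size_mib:.2f} MiB")  # prints "dense table: 0.88 MiB"
```

A sparse layout would only pay off once the dense size grows well beyond this footprint, which is the higher-degree regime the paragraph above refers to.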
9 CONCLUSION
In this paper, we have introduced novel alternative formulations of
the closed-form solution to the single-scattering air-light integral in
homogeneous media which provide a framework for effectively reducing the computational complexity (i.e. not just a simple constant
speed-up factor) of the evaluation scheme. Rather than relying on
ad-hoc approximations, the approach solely builds on excavating algebraic patterns and exploiting mathematical identities to simplify
the expressions, and as such, it entirely preserves both the generality and the accuracy of the closed-form solution.
The analytic derivations decrease the order of complexity of the
computation from O(N^6) to as low as O(N^4) or O(N^3) depending
on the configuration, where N − 1 is the degree of the input function. This in turn induces increasingly important speed-ups as the
size of the problem grows, as illustrated by practical examples of
reference images generated with significant gains in performance
ranging from about 2× to 15×. Although high-degree functions
may still not be handled interactively despite the considerable reductions in computational cost, the method does allow the efficient
generation of reference images to evaluate the quality of the various
approximations to the single-scattering model that have been proposed in the past, and represents a promising step forward towards
increasing the applicability of high-fidelity light transport simulation in participating media to both real-time and off-line rendering.
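The difference between a reduction in order of complexity and a constant speed-up factor can be illustrated with idealized cost models. In the sketch below, the constants a, b and b' are hypothetical and absorb all implementation-dependent factors, so the expressions only describe growth rates and are not meant to reproduce the measured timings.

```latex
% Idealized cost models (a, b, b' are hypothetical constants):
\[
  T_{\text{orig}}(N) \approx a\,N^{6}, \qquad
  T_{\text{red}}(N) \approx b\,N^{4} \quad\text{or}\quad b'\,N^{3}
\]
% The resulting speed-up grows polynomially with N rather than staying constant:
\[
  S(N) = \frac{T_{\text{orig}}(N)}{T_{\text{red}}(N)}
       \approx \frac{a}{b}\,N^{2} \quad\text{or}\quad \frac{a}{b'}\,N^{3}
\]
```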
ACKNOWLEDGEMENTS
This research was supported by the German Research Foundation
(DFG) via the Cluster of Excellence on Multimodal Computing and
Interaction funded within the Excellence Initiative. The authors
also wish to thank the anonymous reviewers for their valuable feedback on improving the quality of this document. Sibenik Cathedral
model courtesy of Marko Dabrovic at RNA Studios, and city, drone
and lighthouse models courtesy of the Google 3D Warehouse.
REFERENCES
[1] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. U.S. Department of Commerce, 1972.
[2] M. Billeter, E. Sintorn, and U. Assarsson. Real Time Volumetric Shadows using Polygonal Light Volumes. In High Performance Graphics,
pages 39–45, 2010.
[3] V. Biri, D. Arquès, and S. Michelin. Real Time Rendering of Atmospheric Scattering and Volumetric Shadows. Journal of WSCG,
14:65–72, 2006.
[4] J. F. Blinn. Light Reflection Functions for Simulation of Clouds and
Dusty Surfaces. SIGGRAPH Computer Graphics, 16(3):21–29, 1982.
[5] E. Cerezo, F. Perez-Cazorla, X. Pueyo, F. Seron, and F. Sillion. A Survey on Participating Media Rendering Techniques. The Visual Computer, 21(5):303–328, 2005.
[6] Y. Dobashi, T. Yamamoto, and T. Nishita. Interactive Rendering
Method for Displaying Shafts of Light. In Pacific Graphics, pages
31–37, 2000.
[7] Y. Dobashi, T. Yamamoto, and T. Nishita. Interactive Rendering of Atmospheric Scattering Effects Using Graphics Hardware. In Graphics
Hardware, pages 99–108, 2002.
[8] T. Engelhardt and C. Dachsbacher. Epipolar Sampling for Shadows
and Crepuscular Rays in Participating Media with Single Scattering.
In Symposium on Interactive 3D Graphics and Games, pages 119–
125, 2010.
[9] I. S. Gradshteyn and I. M. Ryzhik. Table of Integrals, Series, and
Products. Academic Press, 7th edition, 2007.
[10] N. Hoffman and A. J. Preetham. Real-Time Light-Atmosphere Interactions for Outdoor Scenes. In Graphics Programming Methods,
chapter 3.11, pages 337–352. 2003.
[11] T. Imagire, H. Johan, N. Tamura, and T. Nishita. Anti-Aliased and
Real-Time Rendering of Scenes with Light Scattering Effects. The Visual
Computer, 23(9):935–944, 2007.
[12] R. James. True Volumetric Shadows. In Graphics Programming Methods, chapter 3.12, pages 353–366. 2003.
[13] P. Lecocq, S. Michelin, D. Arquès, and A. Kemeny. Mathematical Approximation for Real-Time Lighting Rendering through Participating
Media. In Pacific Graphics, pages 400–401, 2000.
[14] E. Makarov. Volume Light. NVIDIA White Paper, 2008.
[15] N. L. Max. Atmospheric Illumination and Shadows. SIGGRAPH Computer Graphics,
20(4):117–124, 1986.
[16] K. Mitchell. Volumetric Light Scattering as a Post-Process. In GPU
Gems 3, chapter 13, pages 275–285. 2007.
[17] T. Nishita, Y. Miyawaki, and E. Nakamae. A Shading Model for
Atmospheric Scattering Considering Luminous Intensity Distribution
of Light Sources. SIGGRAPH Computer Graphics, 21(4):303–310,
1987.
[18] V. Pegoraro and S. G. Parker. An Analytical Solution to Single Scattering in Homogeneous Participating Media. Eurographics (Computer
Graphics Forum), 28(2):329–335, 2009.
[19] V. Pegoraro, M. Schott, and S. G. Parker. An Analytical Approach to
Single Scattering for Anisotropic Media and Light Distributions. In
Graphics Interface, pages 71–77, 2009.
[20] V. Pegoraro, M. Schott, and S. G. Parker. Reduced Dual-Formulation
for Analytical Anisotropic Single Scattering. High-Performance
Graphics (Poster Session), 2009.
[21] V. Pegoraro, M. Schott, and S. G. Parker. A Closed-Form Solution to
Single Scattering for General Phase Functions and Light Distributions.
Eurographics Symposium on Rendering (Computer Graphics Forum),
29(4):1365–1374, 2010.
[22] K. Riley, D. S. Ebert, M. Kraus, J. Tessendorf, and C. D. Hansen. Efficient Rendering of Atmospheric Phenomena. In Eurographics Symposium on Rendering, pages 375–386, 2004.
[23] B. Sun, R. Ramamoorthi, S. G. Narasimhan, and S. K. Nayar. A Practical Analytic Single Scattering Model for Real Time Rendering. SIGGRAPH (Transactions on Graphics), 24(3):1040–1049, 2005.
[24] P. J. Willis. Visual Simulation of Atmospheric Haze. Computer
Graphics Forum, 6(1):35–42, 1987.
[25] C. Wyman and S. Ramsey. Interactive Volumetric Shadows in Participating Media with Single-Scattering. In Symposium on Interactive
Ray Tracing, pages 87–92, 2008.
[26] K. Zhou, Q. Hou, M. Gong, J. Snyder, B. Guo, and H.-Y. Shum.
Fogshop: Real-Time Design and Rendering of Inhomogeneous,
Single-Scattering Media. In Pacific Graphics, pages 116–125, 2007.