Hierarchical Penumbra Casting

Samuli Laine, Timo Aila
Helsinki University of Technology
Hybrid Graphics, Ltd.

Introduction
• Rendering soft shadows
  – As usual, area light sources are sampled with a number of light samples
  – Multiple receiver points to be shaded
• The main problem is solving the visibility
  – Which light samples are visible to which receiver points

What’s Happening?
[Figure labels: light samples, light source, shadow caster, visible surface, receiver points]

On the Scale of the Problem
• With R receiver points and L light samples there are RL visibility relations to solve
  – For example, 1024×768 resolution and 256 light samples gives over 200 million relations
• Ray casting is the usual solution for solving the visibility relations
  – With T triangles, the cost of casting one shadow ray is O(log T)
  – Total cost becomes O(RL log T)

About Ray Casting
• The standard ray casting approach considers only one ray at a time
  – This inevitably leads to linear performance with respect to RL
• However, this is highly flexible
  – We need to generate only one ray at a time
• Sub-linear complexity with respect to T is achieved by placing the triangles into an acceleration hierarchy

Transposing the Algorithm
• Goal: sub-linear complexity w.r.t. RL
• Requires rearranging the rendering loop

Ray Tracing
  for each receiver point r                  (linear to R)
    for each light sample l                  (linear to L)
      find triangle that blocks ray l → r    (sub-linear to T)

Our Approach
  for each triangle T                        (linear to T)
    find all l → r rays blocked by T         (sub-linear to RL)

About Our Algorithm
• Sub-linear complexity with respect to RL is achieved by placing the receiver points and light samples into acceleration hierarchies
  – Therefore, all receiver points must be gathered before computing the shadows
• We process one triangle at a time
  – Good: no need for triangle BSP
  – Bad: linear complexity with respect to T

About Our Algorithm, part 2
• The full rendering process goes as follows:
  1. Ray-trace or rasterize the image without shadows to get the receiver points
  2. Build the acceleration structures for receiver points and light samples
  3. Process all triangles to solve the visibility relations between light samples and receiver points
  4. Perform shading

The Acceleration Structures
• Fixed three-level bounding volume hierarchy is used for the light samples
  – Assuming a polygonal light source, bounding “volumes” are actually bounding polygons
• Standard bounding volume hierarchy is used for the receiver points
  – Axis-aligned boxes as bounding volumes

Light Sample Hierarchy
• Three levels
• All nodes have a bounding volume
  – Root node: entire light source
  – Middle nodes: light sample groups
  – Leaf nodes: light samples

Storing the Visibility Information
• A bit mask with L bits is assigned for every receiver point
  – bit = 0: light sample is visible
  – bit = 1: light sample is occluded
• Initially, all bits are zero
• When a triangle is found to occlude a light sample from a receiver point, the corresponding bit is set to one

Penumbra Volumes
• All points where a triangle may block a ray from a bounding volume are inside the corresponding penumbra volume
[Figure labels: triangle, bounding volume in light hierarchy, penumbra volume]

Processing a Triangle
• First build penumbra volumes for all nodes in the light sample hierarchy
• For individual light samples (leaf nodes) these become hard shadow volumes
• Traverse down the receiver point hierarchy
• Step 1: Test intersection between main penumbra volume and bounding volume of receiver node
  [Figure labels: triangle, bounding volume of entire light source, receiver node, main penumbra volume]
• Step 2: Update the list of active light sample groups
  – At the beginning of traversal, all groups are active
  [Figure labels: triangle, bounding volumes of light sample groups, remove from active group list, receiver node]
• Step 3: Recurse into child nodes in receiver hierarchy
  – With pruned list of active light sample groups
  [Figure labels: triangle, bounding volume of entire light source, child nodes, main penumbra volume]
• Step 4: In leaf node, test receiver points vs. hard shadow volumes of light samples
  – Update the visibility relation bits
  [Figure labels: triangle, light samples in active groups, receiver points]

Summary of Recursion
• Traverse down the receiver point hierarchy
  – Maintain list of active light sample groups
    • Initially all groups are active
  – First ensure that receiver node intersects the main penumbra volume, terminate otherwise
  – Then prune the active light sample group list by intersecting receiver node vs. penumbra volumes of active light sample groups
  – In leaf node, test receiver points against hard shadow volumes of remaining light samples

Optimizations
• Umbra bits for early traversal termination
  – With receiver hierarchy rebuilding to ensure balance
• Active plane sets
• Lazy penumbra volume and hard shadow volume construction
• On-demand bit mask allocation
• Coarse blocker sorting

Extensions
• Multiple light sample sets
  – To remove banding artifacts
• Alpha matte textures
  – Often used in, e.g., vegetation textures
• Adaptive antialiasing
• Volumetric light sources

Results
• Compared against Mental Ray 3.2
• Benchmarked only the solving of the visibility relations
  – For Mental Ray, computed both with and without shadows and took the difference
• More detailed results in the paper

Grids (32K triangles, 256 light samples)
  Resolution        1280×960    2560×1920
  Peak mem usage    58M         228M
  Speedup factor    13.5        16.7

Flowers (903K triangles, 256 light samples)
  Resolution        1280×960    2560×1920
  Peak mem usage    39M         154M
  Speedup factor    3.5         7.8

Sponza (1.27M triangles, 256 light samples)
  Resolution        1280×960    2560×1920
  Peak mem usage    62M         244M
  Speedup factor    8.2         11.4

Results: Analysis
• Sub-linearity with respect to R
  – Increasing output resolution gives better relative performance
  – Due to hierarchical processing of receiver points
• Sub-linearity with respect to L
  – Using more light samples gives better relative performance (results in the paper)
  – Due to using analytic penumbra volumes that represent many light samples at once

Results: More Analysis
• Somewhat high memory usage
  – Depends on the output resolution
  – Depends on the complexity of the shadows
  – Does not depend on the number of triangles in the scene
• New problem: dependence on the spatial size of light source
  – Penumbra volumes become larger
  – Leads to lower performance

Conclusions
• Nice properties
  + Exactly the same result as with ray casting
  + No need to store all triangles at any point
  + Sub-linear dependence on output resolution and number of light samples
• Not so nice properties
  – Linear dependence on triangle count
  – Memory usage can be high
  – Dependence on the spatial size of light source

Future Work
• Process multiple triangles at a time?
• Could experiment with full light sample hierarchy, which should (in theory) have better performance

Thank You
• Questions

Funding: National Technology Agency of Finland, Bitboys, Hybrid Graphics, Remedy Entertainment, Nokia, ATI
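
Supplementary sketch: problem scale
The "over 200 million relations" figure from the slide on the scale of the problem can be checked with a few lines of Python. The function names below are hypothetical, and the sketch assumes one receiver point per pixel. The second function gives the bit-mask storage only for the case where every receiver point has a fully allocated L-bit mask, which is an upper bound given that the slides list on-demand bit mask allocation as an optimization.

```python
def visibility_relations(width: int, height: int, light_samples: int) -> int:
    """Number of receiver/light-sample pairs, assuming one receiver point per pixel."""
    return width * height * light_samples


def full_bitmask_bytes(width: int, height: int, light_samples: int) -> int:
    """Bytes needed if every receiver point stores a complete L-bit mask (upper bound)."""
    return visibility_relations(width, height, light_samples) // 8


if __name__ == "__main__":
    relations = visibility_relations(1024, 768, 256)
    print(relations)                             # 201326592
    print(full_bitmask_bytes(1024, 768, 256))    # 25165824
```

For the slide's example this gives 201,326,592 relations, and about 24 MiB of mask storage under the full-allocation assumption.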
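
Supplementary sketch: transposed loop with bit masks
The following is a minimal, brute-force illustration of the transposed loop order ("for each triangle, find all blocked rays") combined with the per-receiver bit masks described in the slides. It is not the hierarchical method: there are no penumbra volumes and no acceleration hierarchies, so the inner work is linear in RL rather than sub-linear. All names (`segment_hits_triangle`, `solve_visibility`) are hypothetical, and the Möller-Trumbore segment test is a stand-in chosen for brevity, not something the slides specify.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(p: Vec, q: Vec, tri: Sequence[Vec], eps: float = 1e-9) -> bool:
    """Moller-Trumbore test restricted to the open segment from p to q."""
    v0, v1, v2 = tri
    d = sub(q, p)
    e1, e2 = sub(v1, v0), sub(v2, v0)
    h = cross(d, e2)
    a = dot(e1, h)
    if abs(a) < eps:          # segment is parallel to the triangle plane
        return False
    f = 1.0 / a
    s = sub(p, v0)
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:
        return False
    qv = cross(s, e1)
    v = f * dot(d, qv)
    if v < 0.0 or u + v > 1.0:
        return False
    t = f * dot(e2, qv)       # parameter along the segment, 0 at p and 1 at q
    return eps < t < 1.0 - eps


def solve_visibility(receivers: Sequence[Vec],
                     lights: Sequence[Vec],
                     triangles: Sequence[Sequence[Vec]]) -> List[int]:
    """Return one integer bit mask per receiver: bit i = 1 means light sample i is occluded."""
    masks = [0] * len(receivers)            # initially all bits are zero (visible)
    for tri in triangles:                   # outer loop over triangles (transposed order)
        for r_idx, r in enumerate(receivers):
            for l_idx, l in enumerate(lights):
                if segment_hits_triangle(l, r, tri):
                    masks[r_idx] |= 1 << l_idx   # mark light sample as occluded
    return masks


if __name__ == "__main__":
    # Hypothetical scene: lights at z=2, one blocker triangle at z=1, receivers at z=0.
    tri = [(-0.5, -0.5, 1.0), (0.5, -0.5, 1.0), (0.0, 0.5, 1.0)]
    lights = [(0.0, 0.0, 2.0), (3.0, 0.0, 2.0)]
    receivers = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (-3.0, 0.0, 0.0)]
    print(solve_visibility(receivers, lights, [tri]))   # [1, 0, 2]
```

In this small scene the first receiver is shadowed from light sample 0 only, the second sees both samples, and the third is shadowed from light sample 1 only. Because each triangle only ever sets bits, triangles can be processed in any order and discarded afterwards, which matches the "no need to store all triangles at any point" property listed in the conclusions.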