PowerVR Tips and Tricks:
Maximise your Performance
September, 2013
© Imagination Technologies
www.imgtec.com
Performance Recommendations
Introduction
Common Bottlenecks
Based on past observation
Most Likely
CPU Usage
Bandwidth Usage
CPU/GPU Synchronisation
Fragment Shader Instructions
Geometry Upload
Texture Upload
Vertex Shader Instructions
Geometry Complexity
Least Likely
Warning!
Some of these rules may seem obvious to you.
We still see them broken every day.
If you know them, please bear with us!
Performance Recommendations
The Golden Rules
Understand Your Target Device
Golden Rule 1
 No two devices are the same
 Different SoCs will have different bottlenecks
 The GPU isn’t necessarily your bottleneck
 It could be the graphics driver, OS, CPU/GPU synchronisation, etc.
Don’t Waste GPU Time
Golden Rule 2
 The principle of “Good Enough”
 If the user won't notice the difference, don’t bother!
 Keep an eye on polygon count
 Use suitable texture resolutions
 Does this shader need to be so complex?
Promote Calculations up The Chain
Golden Rule 3
 Avoid unnecessary calculations
 If you can do it once per scene, do it once per scene
 If you can’t, try to do it per vertex
 There are generally fewer vertices in a scene than fragments
 Some computation can be done offline
 E.g. Lighting
 Remember, ‘Good Enough’!
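To put rough numbers on this rule, the sketch below counts how often a calculation runs per frame at each stage of the chain. The function name, vertex count, and resolution are illustrative assumptions, and overdraw is ignored, so the real per-fragment cost is usually higher.

```python
def invocations_per_frame(vertex_count, width, height):
    """Count how many times a calculation runs per frame at each stage.

    Assumes every pixel is shaded exactly once (no overdraw), which
    understates the real fragment cost.
    """
    return {
        "per_scene": 1,                  # computed once on the CPU, passed in as a uniform
        "per_vertex": vertex_count,      # computed in the vertex shader
        "per_fragment": width * height,  # computed in the fragment shader
    }


# Hypothetical scene: 50,000 vertices rendered at 1280x720.
counts = invocations_per_frame(50_000, 1280, 720)
for stage, n in counts.items():
    print(f"{stage:>12}: {n:>9,} invocations")
```

With these assumed numbers, moving a calculation from the fragment shader to the vertex shader cuts it from 921,600 runs to 50,000, and moving it to the CPU cuts it to one.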
Don’t Access an Active Render Target
Golden Rule 4
 Reading a render target back on the CPU while the GPU is still using it is very bad for performance
 If it’s not done properly, serialisation will occur: the CPU stalls until the GPU has finished rendering
Avoid Accessing Buffers
Golden Rule 5
 Similar to render targets
 Buffer data might still be in-flight
 Any in-flight processing that uses the buffer has to finish first (typically vertex work)
 Usual solution - use a circular array of buffers
 Series6 devices will do this automatically for you
 Not a free pass though!
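The circular array of buffers mentioned above could look something like this minimal sketch. `BufferRing` and its methods are hypothetical names, and a `bytearray` stands in for a real GPU buffer object, so this shows only the rotation logic, not any graphics API.

```python
class BufferRing:
    """Cycle through N buffers so the CPU never writes to the one used last frame."""

    def __init__(self, count, size):
        if count < 2:
            raise ValueError("a ring needs at least two buffers")
        # bytearray stands in for a GPU buffer object in this sketch.
        self._buffers = [bytearray(size) for _ in range(count)]
        self._next = 0

    def acquire(self):
        """Return (index, buffer) for this frame's writes and advance the ring."""
        index = self._next
        self._next = (self._next + 1) % len(self._buffers)
        return index, self._buffers[index]


ring = BufferRing(count=3, size=64)
used = []
for frame in range(5):
    index, buf = ring.acquire()
    buf[0] = frame  # the CPU writes this frame's data
    used.append(index)
print(used)  # [0, 1, 2, 0, 1]
```

The ring only helps if it is deep enough that a buffer is no longer in flight by the time it comes round again. This sketch does not check that; real code would probably need a fence or similar before reusing a buffer.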
Use Vertex Buffers and Indexed Geometry
Golden Rule 6
 Vertex buffers benefit from driver level optimisations
 Index your geometry
 Less duplicate data sent to the hardware
 Sort geometry
 Sort vertex & index buffers for better vertex cache efficiency
 PVRGeoPOD will do this for you!
 Use static buffers where possible
 Dynamic data is different
 Series5: Use vertex pointers (AKA client side arrays)
 Series6: Use buffers and mapping functions
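As a rough illustration of the "index your geometry" point, the sketch below collapses duplicate vertices into a unique vertex list plus an index list. `index_geometry` is a hypothetical helper, not part of any PowerVR tool, and it does not attempt the cache-friendly sorting that PVRGeoPOD performs.

```python
def index_geometry(vertices):
    """Collapse duplicate vertices into a unique vertex list plus an index list."""
    unique = []
    lookup = {}   # vertex -> position in `unique`
    indices = []
    for v in vertices:
        if v not in lookup:
            lookup[v] = len(unique)
            unique.append(v)
        indices.append(lookup[v])
    return unique, indices


# Two triangles forming a quad, written as 6 unindexed (x, y) vertices.
quad = [(0, 0), (1, 0), (1, 1),
        (0, 0), (1, 1), (0, 1)]
unique, indices = index_geometry(quad)
print(unique)   # [(0, 0), (1, 0), (1, 1), (0, 1)]
print(indices)  # [0, 1, 2, 0, 2, 3]
```

The quad goes from six vertices to four plus six small indices. The saving grows with real vertex formats, where each vertex also carries normals, texture coordinates, and so on.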
Batch Your Draw Calls
Golden Rule 7
 Group static objects, and draw once
 When objects are static relative to each other
 E.g. seats in a train
 Sort objects by render state
 Emphasis on texture and program state changes
 Try using texture atlases, or texture arrays when available
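To show why sorting by render state pays off, here is a small sketch that counts program and texture switches before and after sorting. The `Draw` record and the scene contents are made up for illustration; a real renderer would track more state than this.

```python
from collections import namedtuple

# Hypothetical draw-call record: shader program and texture are the costly state.
Draw = namedtuple("Draw", ["name", "program", "texture"])


def count_state_changes(draws):
    """Count program/texture switches when issuing draws in the given order."""
    changes = 0
    program = texture = None
    for d in draws:
        if d.program != program:
            changes += 1
            program = d.program
        if d.texture != texture:
            changes += 1
            texture = d.texture
    return changes


draws = [
    Draw("crate_a", "lit", "wood"),
    Draw("sign", "unlit", "atlas"),
    Draw("crate_b", "lit", "wood"),
    Draw("label", "unlit", "atlas"),
]
sorted_draws = sorted(draws, key=lambda d: (d.program, d.texture))

print(count_state_changes(draws))         # 8
print(count_state_changes(sorted_draws))  # 4
```

Sorting halves the state changes in this toy scene. Packing both textures into one atlas would remove the texture switches altogether.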
Compress Your Textures
Golden Rule 8
 The lower the bitrate, the less bandwidth is consumed
 Use PVRTC – down to 2 bits per pixel!
 PVRTexTool will handle this!
 Don’t confuse this with file compression
 E.g. PNG or JPEG
 These are decompressed before going to the hardware
 PVRTC is read directly from the compressed form
 It stays in memory at 2bpp or 4bpp
 Remember “Good Enough”
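The memory saving is simple arithmetic, sketched below for a single mip level. The 1024x1024 size and the uncompressed 32 bits per pixel baseline are assumptions chosen for illustration.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of a single mip level at the given bitrate."""
    return width * height * bits_per_pixel // 8


# Illustrative 1024x1024 texture, base level only (no mipmaps).
for label, bpp in [("RGBA8888", 32), ("PVRTC 4bpp", 4), ("PVRTC 2bpp", 2)]:
    kib = texture_bytes(1024, 1024, bpp) / 1024
    print(f"{label:>10}: {kib:6.0f} KiB")
```

That works out to 4096 KiB uncompressed, 512 KiB at 4bpp, and 256 KiB at 2bpp. A PNG or JPEG of the same image would still occupy the full 4096 KiB once decompressed for the hardware.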
Avoid Alpha Test/Discard
Golden Rule 9
 Alpha test negates the advantages of Hidden Surface Removal (HSR) or ‘Early-Z’
 Fragment visibility isn’t known until fragment shader is run
 Prefer blending, and render in the order: Opaque, Alpha Tested, Blended
 Makes best use of HSR
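The recommended render order can be sketched as a simple sort of the draw list. The category names and the `order_for_hsr` helper are illustrative assumptions, not part of any SDK.

```python
# Lower number = drawn earlier. Category names here are illustrative.
PASS_ORDER = {"opaque": 0, "alpha_tested": 1, "blended": 2}


def order_for_hsr(draws):
    """Sort (name, category) pairs so opaque draws come first, blended last.

    sorted() is stable, so draws keep their relative order within a category.
    """
    return sorted(draws, key=lambda d: PASS_ORDER[d[1]])


scene = [
    ("glass", "blended"),
    ("fence", "alpha_tested"),
    ("wall", "opaque"),
    ("smoke", "blended"),
    ("floor", "opaque"),
]
print([name for name, _ in order_for_hsr(scene)])
# ['wall', 'floor', 'fence', 'glass', 'smoke']
```

Because the sort is stable, blended objects submitted back to front stay back to front, which blending usually needs in order to look correct.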
Avoid Framebuffer Transfers
Golden Rule 10
 High bandwidth cost for framebuffer transfers
 Can be avoided on tiling architectures!
 Slightly tricky concept - applies to any tiling architecture.
 Instead of one big framebuffer, small, re-used tile buffers
 Copied to/from main memory for display purposes
 Tell the GPU what you don't need!
 Clear at the start of a frame
 Discard/Invalidate at the end of a frame
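As a back-of-the-envelope model of what clearing and discarding can save, the sketch below adds up the bytes copied between tile memory and main memory for one frame. It assumes 4 bytes per pixel for both colour and depth/stencil and a 1280x720 target; real hardware and drivers may behave differently, so treat the figures as illustrative only.

```python
def attachment_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one full-screen attachment."""
    return width * height * bytes_per_pixel


def frame_traffic(width, height, load_colour, store_colour, load_depth, store_depth):
    """Bytes moved between tile memory and main memory for one frame.

    Assumes 4 bytes per pixel for colour (RGBA8) and 4 for depth/stencil.
    Each flag says whether that attachment is loaded or stored this frame.
    """
    colour = attachment_bytes(width, height, 4)
    depth = attachment_bytes(width, height, 4)
    return (colour * (int(load_colour) + int(store_colour))
            + depth * (int(load_depth) + int(store_depth)))


# Illustrative 1280x720 target.
# Naive: nothing cleared or discarded, so everything is loaded and stored.
naive = frame_traffic(1280, 720, True, True, True, True)
# Tuned: clear at frame start (no loads), discard depth at frame end (no depth store).
tuned = frame_traffic(1280, 720, False, True, False, False)
print(naive // 1024, "KiB vs", tuned // 1024, "KiB per frame")
```

Under these assumptions the traffic drops from 14400 KiB to 3600 KiB per frame. In OpenGL ES the clear is `glClear`, and the discard is `glInvalidateFramebuffer` in ES 3.0 or `glDiscardFramebufferEXT` from the `EXT_discard_framebuffer` extension on ES 2.0.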
Profiling & Debugging
The Right Tools
The Right Tools For The Job
 PowerVR Graphics SDK
 Utilities, documentation & example source
 PVRTrace
 OpenGL ES API capture and analysis tool
 PVRTune
 Real-time PowerVR GPU performance analyser
 PVRShaderEditor
 As-you-type shader profiling
Profiling & Debugging
PVRTrace
PVRTrace
 OpenGL ES API tracer
 Graphical Interface for analysis
 Recording libraries
 Features
 Intercept and record OpenGL ES calls
 Replay PVRTrace captures
 Static call analysis
 Inspect render state
 At a glance debugging
PVRTrace
Static Call Analysis
 Highlights GL Errors
 Highlights possible performance problems
 Shows you where the problem is
 At a glance debugging
PVRTrace
Shader Analysis
 Charts the total cost of shaders
 Estimated cycle count
 How many instances are executed
 What are the most expensive shaders?
 Where should you focus any optimisations?
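The idea of a shader's total cost can be sketched as estimated cycles multiplied by the number of instances executed. The numbers and shader names below are made up, and this is only a guess at the general principle, not a description of how PVRTrace computes its chart.

```python
def rank_shaders(stats):
    """Sort shaders by total cost = estimated cycles x instances executed."""
    totals = {name: cycles * instances for name, (cycles, instances) in stats.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers: (estimated cycles per instance, instances executed).
stats = {
    "skybox_frag": (4, 900_000),
    "water_frag": (60, 200_000),
    "ui_frag": (2, 150_000),
}
for name, total in rank_shaders(stats):
    print(f"{name:>12}: {total:>12,} cycles")
```

Here the water shader dominates even though the skybox shader runs far more often, so that is where optimisation effort would go first.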
PVRTrace
Inspect the render state
 Deciphers your GL calls
 What state GL is in at any point
 Highlights state changes
 Easily see what changed without digging
PVRTrace
Statistics Graphing
 Visualise render statistics
 Draw calls per frame
 Polygons submitted
 Etc.
 At a glance information
 Which threads are active?
 When do expensive functions get called?
PVRTrace
Image Analysis
 Four visualisation modes
 Playback
 Wireframe
 Depth Complexity
 Pixel Complexity
 Step-through draw calls
 See the results of one, many or all draw calls
Profiling & Debugging
PVRTune
PVRTune
 PowerVR GPU performance analyser
 Graphical Interface for analysis
 Server application on device
 Features
 Real-time performance data
 Runs parallel to your application
 Easily identify bottlenecks
PVRTune
Identifying Your Bottleneck
 Quickly identify idle time and workloads
 Intuitive timing blocks for Vertex and Fragment work
 Many useful counters and timing data
 Working hard or hardly working?
 See what's being stressed…
 …or sitting idle
PVRTune
Modify the OpenGL ES render state – no app changes needed!
 Globally disable or enable state
 Tweak your render in real time
 How is performance affected when I…?
 Quickly prototype changes
PVRTune
OpenGL ES timing data & counters
 Overlay data about various function calls
 See what's being dispatched and when
 And how long it takes to return (CPU)
Profiling & Debugging
PVRHub
PVRHub
 Device side configuration tool
 Android GUI Application
 Linux Scripts
 Features
 Ease-of-use for PVRTrace/PVRTune
 Install PVRTrace libraries
 Configure what PVRTrace captures
Profiling & Debugging
PVRScope
PVRScope
 Performance analysis library
 Static library
 Features
 Retrieve hardware counters in your application
 Augment PVRTune with custom data
Profiling & Debugging
PVRShaderEditor
PVRShaderEditor
 Shader editing and profiling
 Graphical Interface
 Features
 Syntax highlighting
 As-you-type profile information
 Integrates all features of Profiling Compilers
 Disassembly viewer with NDA compiler
Questions?