HIGH PERFORMANCE VIDEO ENCODING USING NVIDIA GPUS

Abhijit Patait
Sr. Manager, GPU Multimedia SW

AGENDA

- Overview
- GPU Video Encoding
- NVIDIA Video Encoding Capabilities
  - Kepler vs Maxwell GPU capabilities
  - Roadmap
- Software API
- Performance & Quality

WHY GPU VIDEO ENCODING?

BENEFITS OF ENCODING ON GPU

- Low power
  - Fixed function hardware
  - Reduced memory transfers
- Low latency
- High performance
- Higher density
- Scalability
- Ease of Programming
  - Linux, Windows, C/C++, Application portability

NVIDIA GPU VIDEO ENCODING CAPABILITIES

NVIDIA GPU ENCODING CAPABILITIES

Feature | Benefits
H.264 base, main, high profiles | Wide range of use-cases
High performance (Up to 16x HD) | “Blazing-speed” encoding
YUV 4:2:0 and 4:4:4 support | High quality encoding without chroma subsampling
QP maps | Customizable quality, region of interest encoding
MVC | Full resolution stereo encode
Up to 4096 × 4096 in HW | High resolution encode
API - NV Encode SDK & GRID SDK | Flexible, Win/Linux, DirectX/CUDA
Independent of CUDA | Use CUDA and encode simultaneously

VIDEO ENCODING: KEPLER VS. MAXWELL

Kepler (GK104, GK107, GK106, GK110, GK208) | Maxwell (GM107)
Planar 4:4:4 | Standard 4:4:4 and H.264 lossless encoding
~240 fps 2-pass encoding @ 720p | ~500 fps 2-pass encoding @ 720p
GRID K340/K520, K1/K2, Quadro, Tesla K10/K20 | Current and future Maxwell GPU-boards
GeForce: 2 full-speed encode sessions/GPU | GeForce: 2 full-speed encode sessions/GPU
NV Encode SDK 1.0, 2.0, 3.0 (Now) | NV Encode SDK 4.0+ (May 2014)
GRID SDK 1.x, 2.2, 2.3 (Now) | GRID SDK 3.0+ (June 2014)

NVIDIA VIDEO ENCODING ROADMAP

- Performance improvements
- Quality improvements
  - 4:4:4 & lossless encoding
  - Rate control enhancements
  - Adaptive quantization
  - ROI, ME-only mode
- New video standards

NVENC SOFTWARE APIS

USING NVENC

Direct Encode (NVENC SDK)
- No capture
- Transcoding
- Archiving
- Video editing
- CUDA pre-process + encoding
- Granular encoder settings
- D3D, CUDA interop

Capture + Encode (GRID SDK)
- Capture + encode
- Optimized for low-latency apps
- Capture + CUDA preprocess + encoding
- Encoder settings optimized for streaming
- D3D, CUDA interop

DIRECT ENCODE (NVENC SDK)

[Block diagram. Labels: Client application; Initialize, Configure, Encode; Encoded bitstream; NVENC API; Configure HW; CUDA Driver; NVENC Driver; DirectX Driver; HW Encode; NVENC firmware + hardware]

CAPTURE AND ENCODE (GRID SDK)

[Block diagram. Labels: Client application; Encoded Bitstream; DX/OGL Present; NvFBC/NvIFR; Capture; NVENC Driver; YUV; DirectX/OGL Driver; Encode; NVENC Hardware; GPU 3D Engine]

NVENC SDK

- Available on NVIDIA developer zone
  - https://developer.nvidia.com/nvidia-video-codec-sdk
  - Current release 3.0
  - Release 4.0 in May 2014 with Maxwell support
- Interface header, documentation, sample application
  - .dll/.so included in the driver
- Unified API for Windows and Linux
- Works on x86/x64
- Various APIs, presets, rate control modes for
  - Transcoding
  - Video conferencing
  - GTC Session S4654

NVENC SDK (CONTD.)
Advantages
- Flexibility
  - Dynamic resolution/bitrate change
  - CABAC vs CAVLC; low-level encoder settings, B-frames, sync vs async, custom QP
  - Linux, Windows, DirectX, CUDA, OGL (via CUDA)
  - Also works on GeForce hardware (2 sessions/GPU)
- Error concealment
  - Reference picture invalidation
  - Intra-refresh
- Quality
  - Two-pass modes for higher quality
  - Various presets with quality/performance trade-off
  - 4:4:4 & lossless encoding (Maxwell only)

GRID SDK ENCODE

- Available on NVIDIA developer zone
  - https://developer.nvidia.com/grid-app-game-streaming
  - Current release: 2.2
- Interface header, documentation, sample apps
  - .dll/.so included in the driver
- Windows and Linux
- Works on x86/x64
- Various presets and APIs for
  - Remote graphics (Cloud gaming, remote desktop, capture & stream)
- Optimized for low latency

GRID SDK (CONTD.)

Advantages
- Simplicity
  - Very simple API; single function call for capture + H.264 encode
- Low-latency, high performance
  - Optimized API
- Error concealment
  - Reference picture invalidation
  - Intra-refresh
- Quality
  - Two-pass modes for higher quality
  - 4:4:4 & lossless encoding (Maxwell only)

PERFORMANCE AND QUALITY

PERFORMANCE: 720P

NVENC Performance at 720p, Low-Latency HP preset
[Bar chart; axis: 720p Performance (fps)]

Rate control mode | Kepler (GRID) | Maxwell
CBR_IFRAME_2PASS | 231 fps | 504 fps
2_PASS_FRAMESIZE_CAP | 232 fps | 503 fps
2_PASS_QUALITY | 232 fps | 505 fps

Performance measured on GRID K520 with GRID SDK NVENC performance benchmarking application

PERFORMANCE: 1080P

NVENC Performance at 1080p, Low-Latency HP preset
[Bar chart; axis: 1080p Performance (fps)]

Rate control mode | Kepler (GRID) | Maxwell
CBR_IFRAME_2PASS | 118 fps | 239 fps
2_PASS_FRAMESIZE_CAP | 118 fps | 240 fps
2_PASS_QUALITY | 119 fps | 238 fps

Performance measured on GRID K520 with GRID SDK NVENC performance benchmarking application

ENCODING QUALITY VS X264: ASSUMPTIONS

- Infinite GOP IPPP…
- VBV buffer = bitrate/framerate
- x264
  - Zero latency
  - CRF = 24
  - Preset = faster
- NVENC
  - Preset = LOW_LATENCY_HQ
  - RC = 2-pass-quality

NVENC/X264 QUALITY COMPARISON

Titan Fall 720p, 5 Mbps, Low-latency HQ
[Chart: PSNR Y (dB) and SSIM Y. Series: PSNR NVENC, PSNR x264, SSIM NVENC, SSIM x264]

NVENC/X264 QUALITY COMPARISON

Bunny 1080p, 12 Mbps, Low-latency HQ
[Chart: PSNR Y (dB) and SSIM Y. Series: PSNR NVENC, PSNR x264, SSIM NVENC, SSIM x264]

QUALITY COMPARISON: PSNR

PSNR Comparison - x264 vs NVENC (PSNR Y)

Clip | PSNR NVENC | PSNR x264 | PSNR Difference
Bunny 1080p | 47.24 dB | 43.71 dB | 3.52 dB
NFS Rivals 720p | 34.05 dB | 33.18 dB | 0.87 dB
NFS Rivals 1080p | 35.51 dB | 34.39 dB | 1.12 dB
Titan Fall 720p | 30.58 dB | 29.78 dB | 0.80 dB
Titan Fall 1080p | 28.13 dB | 30.63 dB | -2.50 dB
WoT - 3 1280 × 768 | 34.15 dB | 33.41 dB | 0.74 dB
WoT - 12 1280 × 768 | 35.60 dB | 34.72 dB | 0.87 dB

QUALITY COMPARISON: SSIM

SSIM Comparison - x264 vs NVENC (SSIM Y)

Clip | SSIM NVENC | SSIM x264 | SSIM Difference
Bunny 1080p | 0.9874 | 0.9808 | 0.01
NFS Rivals 720p | 0.9217 | 0.9103 | 0.01
NFS Rivals 1080p | 0.9388 | 0.9269 | 0.01
Titan Fall 720p | 0.8350 | 0.8073 | 0.03
Titan Fall 1080p | 0.8309 | 0.8567 | -0.03
WoT - 3 1280 × 768 | 0.9101 | 0.8930 | 0.02
WoT - 12 1280 × 768 | 0.9169 | 0.9027 | 0.01

QUESTIONS?
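
As a cross-check on the PSNR comparison above, the "PSNR Difference" row can be recomputed as NVENC minus x264 from the listed per-clip values. The sketch below is illustrative only and not part of the original deck; the names `PSNR_Y_DB` and `psnr_deltas` are made up for this example. Because the inputs are the rounded figures shown on the slide, a recomputed difference may deviate from the listed one by 0.01 dB.

```python
# Luma PSNR in dB per clip, copied from the PSNR comparison slide:
# (NVENC, x264, difference as listed on the slide).
PSNR_Y_DB = {
    "Bunny 1080p":       (47.24, 43.71, 3.52),
    "NFS Rivals 720p":   (34.05, 33.18, 0.87),
    "NFS Rivals 1080p":  (35.51, 34.39, 1.12),
    "Titan Fall 720p":   (30.58, 29.78, 0.80),
    "Titan Fall 1080p":  (28.13, 30.63, -2.50),
    "WoT - 3 1280x768":  (34.15, 33.41, 0.74),
    "WoT - 12 1280x768": (35.60, 34.72, 0.87),
}


def psnr_deltas(table):
    """Return {clip: NVENC PSNR minus x264 PSNR}, rounded to 0.01 dB."""
    return {
        clip: round(nvenc - x264, 2)
        for clip, (nvenc, x264, _listed) in table.items()
    }


# Print the recomputed difference next to the value listed on the slide.
for clip, delta in psnr_deltas(PSNR_Y_DB).items():
    listed = PSNR_Y_DB[clip][2]
    print(f"{clip:18s} recomputed {delta:+.2f} dB, slide lists {listed:+.2f} dB")
```

On these figures NVENC is ahead of x264 on six of the seven clips; Titan Fall 1080p is the exception, where x264 leads by 2.50 dB.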