Status – Week 279
Victor Moya

Rasterization
  Setup triangles (calculate slope values).
  Fill triangle: interpolate parameters.
  Parameters: R, G, B, z, r, s, t, q.

Pixel Planes
  Calculate 3 edge functions: if all the edge functions are positive at a point (x, y), the point is inside the triangle.
  For an edge through the point (X, Y) with direction (dX, dY):
    E(x, y) = (x - X) dY - (y - Y) dX
  E(x, y) > 0 if (x, y) is on the “right” side.
  E(x, y) = 0 if (x, y) is exactly on the line.
  E(x, y) < 0 if (x, y) is on the “left” side.

[Figure: Edge Functions]

Classification (1)
  A polygon defined by N vertices: (Xi, Yi), 0 < i <= N, with (X0, Y0) = (XN, YN).
  The incremental classification of the points around a polygon can be calculated as follows.
  Initial values:
    dXi = Xi - X(i-1)
    dYi = Yi - Y(i-1)
    Ei(Xs, Ys) = (Xs - Xi) dYi - (Ys - Yi) dXi    for 0 < i <= N

Classification (2)
  Incremental computation for a unit step along the X and Y axes:
    Ei(x + 1, y) = Ei(x, y) + dYi
    Ei(x - 1, y) = Ei(x, y) - dYi
    Ei(x, y + 1) = Ei(x, y) - dXi
    Ei(x, y - 1) = Ei(x, y) + dXi
  The fragment is inside the triangle if Ei >= 0 for all i, 0 < i <= N.

[Figure: Classification]
[Figure: Traversing the Polygon]
[Figure: Clipping]

Parallel Rasterization
  E(x + L, y) = E(x, y) + L dY
  Allows a group of interpolators, each responsible for a pixel within a block of contiguous pixels, to simultaneously compute the edge function of an adjacent block in a single cycle.

Olano and Greer
  Triangle Scan Conversion using 2D Homogeneous Coordinates.
  Based on the Pixel Planes and Pineda approach (edge functions), but using homogeneous coordinates.
  Avoids the need for clipping.
  Adds a hither edge function for user clipping.
  Perspective-correct interpolation.

Interpolation function
  A parameter varies linearly across a triangle in 3D: u = aX + bY + cZ
  The 3D position (X, Y, Z) projects to 2D, using 2DH coords (x = X, y = Y, w = Z).
  The equation in 2DH space: u = ax + by + cw
  2D perspective-correct function (division by w), where X = x/w and Y = y/w now denote screen-space coordinates:
    u/w = a x/w + b y/w + c = aX + bY + c
  u/w is a linear function in screen space (X, Y).

Interpolation function
  If each vertex has a value for u, we can solve for [a b c] using this equation:
    [u0 u1 u2] = [a b c] M,  so  [a b c] = [u0 u1 u2] M^-1
  where M is the 3x3 matrix whose columns are the 2DH vertices (xi, yi, wi).

Scan conversion
  Edge function parameters: [1 0 0], [0 1 0], [0 0 1].
  1/w interpolation parameter: [1 1 1].
  Zero-area and back-facing triangles: the inverse of the 3x3 matrix M only exists if the determinant of M isn’t 0. The determinant is a function of the area of the triangle.

Arbitrary clip planes
  To add arbitrary clip planes (user clip planes) we need to add new clip edge functions, one per clip plane.

Algorithm
  To summarize the algorithm:
  setup:
    three edge functions = M^-1 = inverse of 2D homogeneous vertex matrix
    for each clip edge
      clip edge function = dot product test * M^-1
    interpolation function for 1/w = sum of rows of M^-1
    for each parameter
      interpolation function = parameter vector * M^-1
  pixel processing:
    interpolate linear edge and parameter functions
    where all edge functions are positive
      w = 1/(1/w)
      for each parameter
        perspective-correct parameter = parameter * w

Cost
  Setup: calculate the interpolation coefficients and slopes.
    1 matrix inversion (1 division, multiple multiplications/additions).
    1 matrix-vector multiplication for each parameter. This includes the edge and clip edge functions, the 1/w value and the other parameters (r, g, b, z, s, t, r) (3x3 matrix/vector multiplication: 9 Mul + 6 Add).
    Calculate the X and Y slopes (derivatives) for each parameter and the initial value at the first pixel (2 Mul + 2 Add per parameter).

Cost (2)
  Per pixel:
    Interpolate parameters: 1 addition per parameter.
    Determine if the 3 edge functions are positive (3 sign tests).
    Determine if the clip edge functions are positive (n sign tests).
  Per pixel inside the triangle:
    w = 1/(1/w) (1 division?)
    For each parameter, perspective-correct parameter value: u = (u/w) * w (1 multiplication for each parameter).

Rasterization/Fragments
  Calculate the final color value of the fragment:
    Texture read.
    Color sum.
    Fog.

[Figure: OpenGL Rasterization]

Per fragment (tests)
  Determine the visibility of the fragment:
    Ownership test.
    Scissor test.
    Alpha test.
    Stencil test.
    Depth buffer test.
  Final pixel color:
    Blending.
    Dithering.
    Logic operation.

[Figure: OpenGL per fragment]
[Figure: OpenGL Multitexture]

Z-Buffer
  Visibility test.
  1 read from the Z-buffer (24 bits).
  If the test fails, the fragment is discarded.
  If not, 1 write to the Z-buffer (24 bits).
  Early Z test (avoids useless work).
  Hierarchical Z-buffer: reduces bandwidth.
  Z-buffer compression: reduces bandwidth and memory usage.
  Fast Z clear.
  Pixel shaders that change pixel depth (Z) disable the early Z test.

[Figure: Hierarchical Z, Z Compression and Fast Z-Clear]

Textures
  Original: additional color (material) information per pixel. Used to compensate for the lack of geometry information.
  Current: color, normals or any other kind of information.
  Different formats (access) supported by hardware (1D, 2D, 3D, cubemap).
  Dependent reads are supported (use information from one texture as the address to access another texture).
  Minification, magnification.
  MIP mapping (multum in parvo): multiple levels of detail for a single texture.
  Filtering: bilinear (4 accesses to the same mipmap), trilinear (8 accesses to two mipmaps), anisotropic (up to 128 accesses, 16x trilinear).

Register combiners
  Multitexture: multiple textures can be read per cycle (multiple texture units per pipe, up to 4 in Matrox Parhelia).
  Also multiple textures per pass (loop mode, up to 16 in DX9 hardware).
  The output of those textures is combined (*, +, ...) with the interpolated pixel color.
  First implementation of pixel shaders (not really instructions for a processor, but a configuration for the hardware).
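
To make the idea of "a configuration for the hardware" concrete, here is a small Python sketch of multitexture combining: the interpolated fragment color is combined with each texel in turn using a fixed operation chosen per texture. It is a deliberately simplified assumption, not a model of any real combiner unit; the names combine and multitexture and the operation labels "modulate" and "add" are hypothetical.

```python
def clamp01(v):
    """Clamp a color component to the [0, 1] range."""
    return max(0.0, min(1.0, v))


def combine(color, texel, op):
    """Combine two RGBA tuples component-wise with one fixed operation."""
    if op == "modulate":  # the '*' case
        return tuple(clamp01(c * t) for c, t in zip(color, texel))
    if op == "add":       # the '+' case
        return tuple(clamp01(c + t) for c, t in zip(color, texel))
    raise ValueError("unknown combine operation: " + op)


def multitexture(fragment_color, texels, ops):
    """Apply one combine stage per texture, in order.
    'ops' plays the role of the hardware configuration: it selects what
    each stage does, but it is not a program with its own control flow."""
    result = fragment_color
    for texel, op in zip(texels, ops):
        result = combine(result, texel, op)
    return result


# Two texture stages: modulate by a base texture, then add a second texture.
final = multitexture(
    (1.0, 0.5, 0.25, 1.0),
    [(0.5, 0.5, 0.5, 1.0), (0.5, 0.5, 0.5, 0.0)],
    ["modulate", "add"],
)
print(final)
```

The point of the sketch is the split between data (colors and texels) and a short fixed list of stage settings, which is what distinguishes this stage from the programmable pixel shaders described later.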

[Figure: GeForce256 Register Combiners. Diagram labels: register set (fragment color, specular color, fog color/factor, texture 0, texture 1, spare 0), texture fetching, general combiner 0 and general combiner 1 (4 RGB inputs, 4 alpha inputs, 3 RGB outputs, 3 alpha outputs each), final combiner (6 RGB inputs, 1 alpha input).]

[Figures: GeForce 3/4 Register Combiners]

Texture Effects
  There is a large number of new graphics effects that can be achieved with those extended texture functions:
    Cubemap (lighting, shadows).
    Bump mapping (per-pixel lighting/shading).
    Others?

Pixel Shaders
  DX9 pixel shaders are true processors.
  Based on vertex shaders, but without branching.
  Replace (or complement) the register combiner stage.
  Most vertex shader instructions are present in the pixel shader (except branches).
  Condition codes, swizzle, negate, absolute value, mask, conditional mask (NV30).
  Additional instructions (NV30):
    Texture read: TEX, TXP, TXD.
    Partial derivatives: DDX, DDY.
    Pack/Unpack: PK2H, PK2US, PK4B, PK4UB, PK4UBG, UP2H, UP2US, UP4B, UP4UB, UP4UBG.
    Fragment conditional kill: KIL.
    Extra math: LRP (linear interpolation), X2D (2D coordinate transform), RFL (reflection), POW (exponentiation).

[Figure: R300 Pixel Shader]

Pixel Shader
  Inputs: 1 position (x, y, z, 1/w), 2 colors (4-component RGBA vectors), 8 texture coordinates, 1 fog coordinate.
  Outputs: fragment color (RGBA), optionally a new fragment depth. In NV30/R300 the output can also go to 4 RGBA textures.
  Temporaries (NV30): 32 32-bit registers (64 16-bit registers).
  Constants (NV30): unlimited? (maybe memory?). Accessed by ‘name’ (label). Also literal constants (embedded).
  R300: 12 temporary registers, 32 constants.
  16 samplers and 8 texture coordinates (DX9).

Pixel Shader
  R300: 64 ALU instructions, 32 texture instructions, 4 levels of dependent read. Up to 96 instructions (?).
  R300 ALU instructions: ADD, MOV, MUL, MAD, DP3, DP4, FRAC, RCP, RSQ, EXP, LOG, CMP.
  R300 texture instructions: TEXLD, TEXLDP, TEXLDBIAS, TEXKILL.
  NV30: up to 1024 instructions.

Others
  Fog.
  Scissor and ownership test.
  Stencil test.
  Alpha test.
  Blending.
  Antialiasing.
  Etc.