Status – Week 283

Status – Week 279
Victor Moya
Rasterization
Setup triangles (calculate slope values).
Fill triangle: interpolate parameters.
Parameters: R, G, B, z, r, s, t, q.

Pixel Planes

Calculate 3 edge functions: if all the edge functions are positive at a point (x, y), the point is inside the triangle.
E(x, y) = (x – X)dY – (y – Y)dX
E(x, y) > 0 if (x, y) is to the “right” side.
E(x, y) = 0 if (x, y) is exactly on the line.
E(x, y) < 0 if (x, y) is to the “left” side.
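The edge function above can be sketched in a few lines. The names `edge` and `inside_triangle` are illustrative, and the inside test assumes the vertices are ordered so that the interior lies on the "right" side of every edge (clockwise with the y axis pointing up); points exactly on an edge are counted as inside here.

```python
def edge(x, y, X, Y, dX, dY):
    """E(x, y) = (x - X) dY - (y - Y) dX for the line through (X, Y)
    with direction (dX, dY)."""
    return (x - X) * dY - (y - Y) * dX


def inside_triangle(p, v0, v1, v2):
    """True if p is inside or on the triangle (assumed winding: interior
    on the "right" side of each edge)."""
    verts = [v0, v1, v2]
    for i in range(3):
        (X, Y), (nx, ny) = verts[i], verts[(i + 1) % 3]
        if edge(p[0], p[1], X, Y, nx - X, ny - Y) < 0:
            return False
    return True
```

With the line through (0, 0) in direction (1, 0), the point (0, -1) gives a positive value and (0, 1) a negative one.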
Edge Functions
Classification (1)
A polygon defined by N vertices:
(xi, yi)
0 < i <= N
(x0, y0) = (xN, yN)
The incremental classification of the points around a polygon can
be calculated as:
Initial values:
dXi = Xi – X(i-1)
dYi = Yi – Y(i-1)
Ei(Xs, Ys) = (Xs – Xi) dYi – (Ys – Yi) dXi
for 0 < i <= N
Classification(2)
Incremental computation for a unit step in X and Y axis:
Ei(x + 1, y) = Ei(x, y) + dYi
Ei(x - 1, y) = Ei(x, y) - dYi
Ei(x, y + 1) = Ei(x, y) - dXi
Ei(x, y - 1) = Ei(x, y) + dXi
Fragment inside of the triangle if:
Ei >= 0 for all i : 0 < i <= N
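A minimal sketch of the incremental classification, assuming a simple left-to-right, top-to-bottom scan over a small pixel grid starting at (Xs, Ys) = (0, 0). The function name `rasterize` and the grid traversal order are illustrative choices, not something the slides prescribe.

```python
def rasterize(verts, width, height):
    """Return the set of integer pixels (x, y) where every Ei >= 0."""
    n = len(verts)
    # dXi = Xi - X(i-1), dYi = Yi - Y(i-1); index -1 wraps to the last vertex.
    d = [(verts[i][0] - verts[i - 1][0], verts[i][1] - verts[i - 1][1])
         for i in range(n)]
    # Initial values Ei(Xs, Ys) at the assumed start point (0, 0).
    row = [(0 - verts[i][0]) * d[i][1] - (0 - verts[i][1]) * d[i][0]
           for i in range(n)]
    covered = set()
    for y in range(height):
        e = list(row)
        for x in range(width):
            if all(v >= 0 for v in e):
                covered.add((x, y))
            e = [e[i] + d[i][1] for i in range(n)]  # unit step in x: add dYi
        row = [row[i] - d[i][0] for i in range(n)]  # unit step in y: subtract dXi
    return covered
```

After the initial values, each pixel costs one addition per edge and no multiplications.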
Classification
Traversing the Polygon
Clipping
Parallel Rasterization
E(x + L, y) = E(x, y) + L * dY
Allows a group of interpolators, each responsible for a pixel within a block of contiguous pixels, to simultaneously compute the edge function of an adjacent block in a single cycle.
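The block step can be illustrated as follows. This is a software sketch only: `first_block` and `next_block` are hypothetical names, and the list stands in for L hardware interpolators working side by side.

```python
def edge(x, y, X, Y, dX, dY):
    """E(x, y) = (x - X) dY - (y - Y) dX."""
    return (x - X) * dY - (y - Y) * dX


def first_block(x0, y, L, X, Y, dX, dY):
    """Edge values for the L contiguous pixels x0 .. x0 + L - 1 on row y."""
    return [edge(x0 + k, y, X, Y, dX, dY) for k in range(L)]


def next_block(values, L, dY):
    """Advance every interpolator by L pixels in x: E(x + L, y) = E(x, y) + L * dY."""
    return [e + L * dY for e in values]
```

Stepping a block this way gives the same values as evaluating the edge function directly at the pixels of the adjacent block.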
Olano and Greer
Triangle Scan Conversion using 2D Homogeneous Coordinates
Based on the Pixel Planes and Pineda approaches (edge functions), but using homogeneous coordinates.
Avoids the need for clipping.
Adds a hither edge function for user clipping.
Perspective correct interpolation.
Interpolation function
A parameter varies linearly across a triangle in 3D:
u = aX + bY + cZ
The 3D position (X, Y, Z) projects to 2D using 2DH coordinates (x = X, y = Y, w = Z). The equation in 2DH space:
u = ax + by + cw
2D perspective-correct function (division by w), where (X, Y) = (x/w, y/w) now denotes the screen-space position:
u/w = a x/w + b y/w + c = aX + bY + c
u/w is a linear function in screen space (X, Y).
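Because u/w and 1/w are both linear in screen space, a perspective-correct value can be recovered by interpolating the two linearly and dividing. A minimal sketch along a screen-space segment between two vertices; the function name `perspective_lerp` and the parameter `t` (screen-space fraction along the segment) are illustrative.

```python
def perspective_lerp(u0, w0, u1, w1, t):
    """Perspective-correct u at screen-space fraction t between two vertices."""
    u_over_w = (1 - t) * (u0 / w0) + t * (u1 / w1)      # linear in screen space
    one_over_w = (1 - t) * (1 / w0) + t * (1 / w1)      # also linear
    return u_over_w / one_over_w
```

For endpoints (u, w) = (0, 1) and (1, 3), the screen-space midpoint yields u = 0.25 rather than 0.5, since the nearer vertex covers more of the screen.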
Interpolation function

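If each vertex has a value for u, we can solve for [a b c] using this equation:
[u0 u1 u2] = [a b c] M, so [a b c] = [u0 u1 u2] M^-1
where the columns of M are the 2DH vertices (xi, yi, wi).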
Scan conversion
Edge function parameters: [1 0 0], [0 1 0], [0 0 1].
1/w interpolation parameter: [1 1 1].
Zero-area and back-facing triangles: the inverse of the 3x3 matrix M only exists if the determinant of M is not 0. The determinant is a function of the area of the triangle.

Arbitrary clip planes

To add arbitrary clip planes (user clip
planes) we need to add new clip edge
functions:
Algorithm
To summarize the algorithm:
setup:
    three edge functions = M^-1 = inverse of 2D homogeneous vertex matrix
    for each clip edge
        clip edge function = dot product test * M^-1
    interpolation function for 1/w = sum of rows of M^-1
    for each parameter
        interpolation function = parameter vector * M^-1
pixel processing:
    interpolate linear edge and parameter functions
    where all edge functions are positive
        w = 1/(1/w)
        for each parameter
            perspective-correct parameter = parameter * w
Cost

Setup:
Calculate the interpolation coefficients and slopes.
1 matrix inversion (1 division, multiple multiplications/additions).
1 matrix-vector multiplication for each parameter. This includes the edge and clip edge functions, the 1/w value and the other parameters (r, g, b, z, s, t, r) (3x3 matrix/vector multiplication: 9 Mul + 6 Add).
Calculate the X and Y slopes (derivatives) for each parameter and the initial value at the first pixel (2 Mul + 2 Add per parameter).
Cost (2)

Per pixel:
Interpolate parameters: 1 addition per parameter.
Determine if the 3 edge functions are positive (3 sign tests).
Determine if the clip edge functions are positive (n sign tests).
Per pixel inside the triangle:
w = 1/(1/w) (1 division?)
For each parameter, perspective-correct parameter value: u = (u/w) * w (1 multiplication for each parameter).
Rasterization/Fragments

Calculate the final color value of the fragment:
Texture read.
Color sum.
Fog.

OpenGL Rasterization
Per fragment (tests)

Determine the visibility of the fragment:
Ownership test.
Scissor test.
Alpha test.
Stencil test.
Depth buffer test.
Final pixel color:
Blending.
Dithering.
Logic operation.
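A reduced sketch of this per-fragment sequence, covering only the scissor, alpha and depth tests followed by blending. Everything specific here is an assumption made for illustration: the alpha function is "greater than", the depth function is "less than", blending is source-alpha over destination, and the ownership test, stencil test, dithering and logic operation are left out. The name `process_fragment` and the dictionary layout are hypothetical.

```python
def process_fragment(frag, fb, scissor, alpha_ref):
    """frag: dict with x, y, z, rgba. fb: dict with 'depth' and 'color' 2D lists.
    Returns the name of the test that rejected the fragment, or 'written'."""
    x, y = frag["x"], frag["y"]
    x0, y0, x1, y1 = scissor
    if not (x0 <= x < x1 and y0 <= y < y1):
        return "scissor"
    r, g, b, a = frag["rgba"]
    if not a > alpha_ref:                  # assumed alpha function: greater
        return "alpha"
    if not frag["z"] < fb["depth"][y][x]:  # assumed depth function: less
        return "depth"
    fb["depth"][y][x] = frag["z"]
    dr, dg, db = fb["color"][y][x]
    # Assumed blend: source alpha, one minus source alpha.
    fb["color"][y][x] = (a * r + (1 - a) * dr,
                         a * g + (1 - a) * dg,
                         a * b + (1 - a) * db)
    return "written"
```

The tests run in the order listed on the slide, so a fragment rejected early never touches the depth or color buffers.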
OpenGL per fragment
OpenGL Multitexture
Z-Buffer

Visibility test.
1 read from the Z-buffer (24 bits).
If the test fails, the fragment is discarded.
If not, 1 write to the Z-buffer (24 bits).
Early Z test (avoids useless work).
Hierarchical Z-buffer: reduces bandwidth.
Z-buffer compression: reduces bandwidth and memory usage.
Fast Z clear.
Pixel shaders that change pixel depth (Z) disable the early Z test.
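The benefit of the early Z test can be shown by counting how many fragments reach the shading step. This is a sketch assuming a "less than" depth comparison; `draw` and `shade` are hypothetical names, and `shade` stands in for all texture and color work done per fragment.

```python
def draw(fragments, zbuffer, colorbuffer, shade, early_z=True):
    """fragments: iterable of (x, y, z). Returns the number of shade() calls."""
    shaded = 0
    for x, y, z in fragments:
        if early_z and z >= zbuffer[y][x]:
            continue                      # rejected before any shading work
        color = shade(x, y)               # expensive per-fragment work
        shaded += 1
        if z < zbuffer[y][x]:             # 1 read; 1 write only on success
            zbuffer[y][x] = z
            colorbuffer[y][x] = color
    return shaded
```

For three fragments on the same pixel with depths 0.5, 0.8 and 0.2 against an initial depth of 1.0, the early test shades two fragments instead of three, and the final buffers are identical in both cases.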
Hierarchical Z, Z Compression and Fast Z-Clear
Textures

Original: additional color (material) information per pixel, used to compensate for the lack of geometry information.
Current: color, normals or any other kind of information. Different formats (access) supported by hardware (1D, 2D, 3D, cubemap).
Dependent reads are supported (using information from one texture as the address to access another texture).
Minification, magnification.
MIP mapping (multum in parvo): multiple levels of detail for a single texture.
Filtering: bilinear (4 accesses to the same mipmap), trilinear (8 accesses to two mipmaps), anisotropic (up to 128 accesses, 16x trilinear).
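The access counts for bilinear and trilinear filtering follow directly from how the filters are built. A simplified sketch on single-channel textures stored as lists of rows: it assumes clamp-to-edge addressing, texel centers at integer coordinates, and a plain division by 2 per mipmap level (ignoring half-texel offsets). The names `bilinear` and `trilinear` are illustrative.

```python
import math


def bilinear(texture, u, v):
    """Blend the 4 texels around (u, v), given in texel units."""
    h, w = len(texture), len(texture[0])
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    fx, fy = u - x0, v - y0

    def texel(x, y):  # assumed addressing mode: clamp to edge
        return texture[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    top = (1 - fx) * texel(x0, y0) + fx * texel(x0 + 1, y0)
    bottom = (1 - fx) * texel(x0, y0 + 1) + fx * texel(x0 + 1, y0 + 1)
    return (1 - fy) * top + fy * bottom


def trilinear(mipmaps, u, v, lod):
    """Blend two bilinear samples (8 texels) from adjacent mipmap levels.
    mipmaps[k] is level k; (u, v) are in level-0 texel units."""
    lo = min(max(int(math.floor(lod)), 0), len(mipmaps) - 1)
    hi = min(lo + 1, len(mipmaps) - 1)
    f = min(max(lod - lo, 0.0), 1.0)
    a = bilinear(mipmaps[lo], u / 2 ** lo, v / 2 ** lo)
    b = bilinear(mipmaps[hi], u / 2 ** hi, v / 2 ** hi)
    return (1 - f) * a + f * b
```

Anisotropic filtering would repeat the trilinear sample several times along the axis of anisotropy, which is where the figure of 16 x 8 = 128 accesses comes from.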
Register combiners
Multitexture: multiple textures can be read per cycle (multiple texture units per pipe, up to 4 in Matrox Parhelia). Also multiple textures per pass (loop mode, up to 16 in DX9 hardware).
The output of those textures is combined (*, +, ...) with the interpolated pixel color.
First implementation of pixel shaders (not really instructions for a processor, but a configuration for the hardware).
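The idea of a configuration rather than a program can be sketched as a fixed list of (operation, texture unit) pairs applied to the interpolated color. This is a toy model, not the actual register combiner interface: the operation names, the clamping of sums to 1.0 and the function `combine` are assumptions made for illustration.

```python
# Hypothetical operation table; real hardware exposes a different, richer set.
OPS = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: min(a + b, 1.0),  # assumed clamp to [0, 1]
}


def combine(color, textures, config):
    """color, textures[i]: RGB tuples in [0, 1]; config: list of (op, texture unit)."""
    result = color
    for op, unit in config:
        result = tuple(OPS[op](r, t) for r, t in zip(result, textures[unit]))
    return result
```

A configuration such as `[("mul", 0), ("add", 1)]` modulates the color by texture 0 and then adds texture 1, with no branching or instruction fetch involved.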

GeForce256 Register Combiners
[Figure: GeForce256 register combiners. Labels: register set (fragment color, specular color, fog color/factor, texture 0, texture 1, spare 0), texture fetching, general combiners 0 and 1 (4 RGB inputs, 4 alpha inputs, 3 RGB outputs, 3 alpha outputs each), final combiner (6 RGB inputs, 1 alpha input).]
GeForce 3/4 Register Combiners
Texture Effects

There is a large number of new graphics effects that can be achieved with these extended texture functions:
Cubemaps (lighting, shadows).
Bump mapping (per-pixel lighting/shading).
Others?

Pixel Shaders

DX9 pixel shaders are true processors. They are based on vertex shaders but without branching, and replace (or complement) the register combiner stage.
Most vertex shader instructions are present in the pixel shader (except branches). Condition codes, swizzle, negate, absolute value, mask, conditional mask (NV30).
Additional instructions (NV30):
Texture read: TEX, TXP, TXD.
Partial derivatives: DDX, DDY.
Pack/Unpack: PK2H, PK2US, PK4B, PK4UB, PK4UBG, UP2H, UP2US, UP4B, UP4UB, UP4UBG.
Fragment conditional kill: KIL.
Extra math: LRP (linear interpolation), X2D (2D coordinate transform), RFL (reflection), POW (exponentiation).
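Two of the extra math instructions can be mimicked in a few lines to show what they compute. The semantics below are assumed rather than taken from the slides: LRP as the per-component blend a * b + (1 - a) * c, and RFL as the reflection of a direction vector about an axis. The Python names `lrp` and `rfl` are illustrative only.

```python
def lrp(a, b, c):
    """LRP-style linear interpolation (assumed): a * b + (1 - a) * c per component."""
    return tuple(ai * bi + (1 - ai) * ci for ai, bi, ci in zip(a, b, c))


def rfl(axis, direction):
    """RFL-style reflection (assumed): 2 * (axis . dir) / (axis . axis) * axis - dir."""
    def dot(p, q):
        return sum(pi * qi for pi, qi in zip(p, q))

    s = 2 * dot(axis, direction) / dot(axis, axis)
    return tuple(s * n - d for n, d in zip(axis, direction))
```

Reflecting the direction (1, 0, 1) about the axis (0, 0, 1) gives (-1, 0, 1); the axis does not need to be normalized because of the division by its squared length.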
R300 Pixel Shader
Pixel Shader

Inputs: 1 position (x, y, z, 1/w), 2 colors (4-component RGBA vectors), 8 texture coordinates, 1 fog coordinate.
Outputs: fragment color (RGBA), optionally a new fragment depth. NV30/R300 can also output to 4 RGBA textures.
Temporaries (NV30): 32 32-bit registers (64 16-bit registers).
Constants (NV30): unlimited? (maybe memory?). Accessed by ‘name’ (label). Also literal constants (embedded).
R300: 12 temporary registers, 32 constants.
16 samplers and 8 texture coordinates (DX9).
Pixel Shader
R300: 64 ALU instructions, 32 texture instructions, 4 levels of dependent reads. Up to 96 instructions (?).
R300:
ALU instructions: ADD, MOV, MUL, MAD, DP3, DP4, FRAC, RCP, RSQ, EXP, LOG, CMP.
Texture: TEXLD, TEXLDP, TEXLDBIAS, TEXKILL.
NV30: up to 1024 instructions.
Others
Fog.
Scissor and ownership test.
Stencil test.
Alpha test.
Blending.
Antialiasing.
Etc.
