7.1 Introduction
occlusion, and even global illumination approximations. Renderers have reached
the limits of current-generation home console processing power and bandwidth.
However, expectations are still rising. Therefore, we should focus more on the
overlooked subject of computation and bandwidth compression.
Most pixel-intensive computations, such as shadows, motion blur, depth of
field, and global illumination, exhibit high spatial and temporal coherency. With
ever-increasing resolution requirements, it becomes attractive to utilize those
similarities between pixels [Nehab et al. 2007]. This concept is not new, as it is
the basis for motion picture compression.
If we take a direct stream from our rendering engine and compress it to a level
perceptually comparable with the original, we can achieve a compression ratio
of at least 10:1. This means that our rendering engine is calculating huge
amounts of perceptually redundant data. We would like to build upon that observation.
Video compressors work in two stages. First, the previous frames are analyzed,
resulting in a motion vector field that is spatially compressed. The previous
frame is then morphed into the next one using the motion vectors. Differences
between the generated frame and the actual one are computed and encoded, again
with compression. Because the differences are generally small and movement is
highly stable over time, compression ratios tend to be high. Only keyframes (i.e.,
the first frame after a camera cut) require full information.
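The two stages above can be sketched in a few lines. This is a minimal toy model, not an actual codec: frames are short lists of pixel values, and the motion vectors are assumed to be given (in a real encoder they come from the analysis stage).

```python
# Minimal sketch of motion-compensated prediction on toy 1D "frames".
# Each frame is a list of pixel values; each motion vector gives, per pixel,
# the offset into the previous frame that best predicts it.

def predict(prev_frame, motion_vectors):
    """Morph the previous frame into a prediction of the next one."""
    n = len(prev_frame)
    return [prev_frame[max(0, min(n - 1, i + mv))]
            for i, mv in enumerate(motion_vectors)]

def residual(frame, prediction):
    """Per-pixel differences between the actual frame and the prediction;
    these are what gets encoded (and they compress well when small)."""
    return [a - b for a, b in zip(frame, prediction)]

# A scene panning one pixel to the right: the motion vectors point one pixel
# back into the previous frame, so the prediction matches almost exactly.
prev_frame = [10, 20, 30, 40, 50]
next_frame = [10, 10, 20, 30, 40]   # previous frame shifted right by one
motion     = [0, -1, -1, -1, -1]

prediction = predict(prev_frame, motion)
diff = residual(next_frame, prediction)
print(diff)  # all zeros -> only the (trivial) residual needs encoding
```

Because the residual here is entirely zeros, this frame costs almost nothing to encode; a keyframe, by contrast, would have to carry `next_frame` in full.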
We can use the same concept in computer-generated graphics. It seems attractive
since we don't need the analysis stage: the motion vector field is easily
available. However, computation dependent on the final shaded pixels is not
feasible on current rasterization hardware. Current pixel-processing pipelines
work on a per-triangle basis, which makes it difficult to compute per-pixel
differences or even to decide whether a pixel's value has changed since the last
frame (as opposed to ray tracing, where this approach is used extensively because
of the per-pixel nature of the rendering). We would therefore like to state the
problem in a different way.
The performance-to-quality ratio of most rendering stages is controlled by the
number of samples used per shaded pixel. Ideally, we would like to reuse as
much data as possible from neighboring pixels in time and space, reducing the
sampling rate required for an optimal solution. Knowing the general behavior of
a stage, we can easily adopt the compression concept: using a motion vector
field, we can fetch samples over time, and due to the low-frequency behavior of
many effects, we can exploit spatial coherency for geometry-aware upsampling.
However, there are several pitfalls to this approach due to the interactive nature
of most applications, particularly video games.
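The temporal half of this idea can be sketched as follows. The sketch is hypothetical and heavily simplified: pixels are 1D indices, `shade_full` stands in for an expensive per-pixel stage, and a depth comparison is used as a stand-in rejection test for disocclusions (one of the pitfalls mentioned above).

```python
# Hedged sketch of temporal sample reuse for one rendering stage.
# `shade_full`, the history buffer layout, and the depth-based rejection
# test are all illustrative placeholders, not a concrete engine API.

def shade_full(x):
    """Expensive full-rate evaluation (placeholder computation)."""
    return x * x

def reuse_or_shade(x, motion, history, depth, prev_depth, eps=0.5):
    """Fetch last frame's result via the motion vector when we are looking
    at the same surface as before; otherwise fall back to full shading."""
    src = x + motion  # where this pixel was in the previous frame
    if src in history and abs(depth[x] - prev_depth[src]) < eps:
        return history[src]   # temporal reuse: no new samples needed
    return shade_full(x)      # disocclusion or cut: recompute

# Cached results and depths from the previous frame for a few pixels.
history    = {0: 0, 1: 1, 2: 4}
prev_depth = {0: 1.0, 1: 1.0, 2: 1.0}
depth      = {0: 1.0, 1: 1.0, 2: 1.0, 3: 5.0}

# Pixels moved one step right; pixel 3 is newly disoccluded (depth mismatch),
# so only it pays the full shading cost.
out = [reuse_or_shade(x, -1, history, depth, prev_depth) for x in [1, 2, 3]]
print(out)
```

The design point this illustrates is that the expensive path runs only where the history is invalid; everywhere else the sampling rate drops to a cheap history fetch, which is exactly the saving the compression analogy promises.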