4. Screen-Space Classification for Efficient Deferred Shading
the same time as these SPU downsample jobs to maximize parallelization
between the two.
We spread the work across four SPUs. Each SPU job takes 64 × 64 pixels of
classification data (one main memory frame buffer tile), ORs each 4 × 4 pixel
area together to create a 16 × 16 block of classification IDs, and streams them
back to main memory. Figure 4.5 shows how output IDs are arranged in main
memory. We take advantage of this block layout to speed up index buffer
generation by coalescing neighboring tiles with the same ID, as explained in
Section 4.10. Using 16 × 16 tile blocks also allows us to send the results back
to main memory in a single DMA call. Once this SPU work and the depth-related
classification work have both finished, a GPU callback triggers SPU jobs to
combine both sets of classification results and perform the index buffer
generation and draw call patching.
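The downsample step described above can be sketched as follows. This is a minimal illustration, not the SPU job itself: it assumes one byte of classification bits per pixel, and the names `PixelTile`, `IdBlock`, and `DownsampleTile` are hypothetical. DMA transfers to and from main memory are not modeled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sizes taken from the text: a 64 x 64 pixel frame buffer tile is reduced,
// in 4 x 4 pixel areas, to a 16 x 16 block of classification IDs.
constexpr std::size_t kTilePixels = 64;
constexpr std::size_t kAreaPixels = 4;
constexpr std::size_t kBlockTiles = kTilePixels / kAreaPixels;  // 16

// Assumption: one byte of classification bits per pixel.
using PixelTile = std::array<std::uint8_t, kTilePixels * kTilePixels>;
using IdBlock = std::array<std::uint8_t, kBlockTiles * kBlockTiles>;

// Hypothetical helper: ORs the classification bits of every 4 x 4 pixel
// area, so a tile ID has a bit set if any pixel in that area has it set.
IdBlock DownsampleTile(const PixelTile& pixels)
{
    IdBlock ids{};  // zero-initialized
    for (std::size_t y = 0; y < kTilePixels; ++y) {
        for (std::size_t x = 0; x < kTilePixels; ++x) {
            const std::size_t tile =
                (y / kAreaPixels) * kBlockTiles + (x / kAreaPixels);
            ids[tile] |= pixels[y * kTilePixels + x];
        }
    }
    return ids;
}
```

Because the 256 output IDs are contiguous, a block like this could be written back with one transfer, which is the property the text relies on.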
The first part of tile rendering is to fill the command buffer with a series of
shader activates interleaved with enough padding for the draw calls to be inserted
later on, once we know their starting indices and counts. This is done on the CPU
during the render submit phase.
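The reserve-then-patch idea can be sketched as below. Every opcode, word count, and name here (`kOpNop`, `kOpActivateShader`, `kOpDrawIndexed`, `CommandBuffer`, `BuildTileCommands`, `PatchDrawCall`) is invented for illustration and does not correspond to real PlayStation 3 GPU commands; the sketch only shows padding being laid down first and overwritten once the draw parameters are known.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Invented opcodes for illustration only; not real GPU commands.
constexpr std::uint32_t kOpNop = 0;
constexpr std::uint32_t kOpActivateShader = 1;
constexpr std::uint32_t kOpDrawIndexed = 2;
constexpr std::size_t kDrawCallWords = 3;  // opcode, start index, index count

struct CommandBuffer {
    std::vector<std::uint32_t> words;
    std::vector<std::size_t> drawSlots;  // word offset of each reserved slot
};

// Submit phase: one shader activate per shader, each followed by padding
// large enough to hold a draw call.
CommandBuffer BuildTileCommands(std::uint32_t shaderCount)
{
    CommandBuffer cb;
    for (std::uint32_t shader = 0; shader < shaderCount; ++shader) {
        cb.words.push_back(kOpActivateShader);
        cb.words.push_back(shader);
        cb.drawSlots.push_back(cb.words.size());
        cb.words.insert(cb.words.end(), kDrawCallWords, kOpNop);
    }
    return cb;
}

// Later: overwrite the padding once the start index and count are known.
// A shader with no tiles keeps its no-op padding.
void PatchDrawCall(CommandBuffer& cb, std::uint32_t shader,
                   std::uint32_t startIndex, std::uint32_t indexCount)
{
    assert(shader < cb.drawSlots.size());
    if (indexCount == 0) {
        return;
    }
    const std::size_t slot = cb.drawSlots[shader];
    cb.words[slot] = kOpDrawIndexed;
    cb.words[slot + 1] = startIndex;
    cb.words[slot + 2] = indexCount;
}
```

Reserving fixed-size slots keeps the buffer length constant, so the later patch never has to move any commands.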
Index buffer generation and tile rendering are spread across four SPUs, where
each SPU runs a single job on a quarter of the screen. The first thing we do is
combine the depth-related classification with the pixel classification. Remember
that we couldn’t do this earlier because the depth-related classification is
rendered on the GPU at the same time as the pixel classification downsample jobs
are running on the SPUs. Once we have the final 7-bit IDs, we can create the
final draw calls. Listings 4.5 and 4.6 show how we calculate starting indices and
counts for each shader, and we use these results to patch the command buffer with
each draw call.
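Figure 4.5. PlayStation 3 tile classification IDs are arranged in blocks of 16 × 16 tiles,
giving us 20 × 12 blocks in total. The numbers show the memory offsets, not the
classification IDs.
[Figure: Block 1 holds offsets 0 to 255 in rows of 16 (0, 1, ..., 15 in the first row, 16 starting the second row, 240, ..., 255 in the last row). Block 2 follows with offsets 256 to 511 (256, 257, ..., 271 in the first row, 272 starting the second row, 496, ..., 511 in the last row).]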
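The block layout in Figure 4.5 implies a simple mapping from a tile coordinate to its memory offset, sketched below. The block size and block counts come from the figure caption; the helper name `TileIdOffset` is hypothetical, and the sketch assumes that blocks are stored left to right and then top to bottom, which the figure shows only for the first two blocks.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kBlockDim = 16;       // tiles per block side
constexpr std::size_t kBlocksPerRow = 20;   // blocks across the screen
constexpr std::size_t kBlockRows = 12;      // blocks down the screen
constexpr std::size_t kTilesPerBlock = kBlockDim * kBlockDim;  // 256

// Hypothetical helper: offset of the ID for tile (tileX, tileY), counted in
// IDs from the start of the classification buffer.
// Assumption: blocks are ordered row by row across the screen.
std::size_t TileIdOffset(std::size_t tileX, std::size_t tileY)
{
    assert(tileX < kBlockDim * kBlocksPerRow);
    assert(tileY < kBlockDim * kBlockRows);
    const std::size_t block =
        (tileY / kBlockDim) * kBlocksPerRow + (tileX / kBlockDim);
    const std::size_t inBlock =
        (tileY % kBlockDim) * kBlockDim + (tileX % kBlockDim);
    return block * kTilesPerBlock + inBlock;
}
```

For example, tile (16, 0) is the first tile of Block 2 and maps to offset 256, matching the figure, while tile (0, 15) starts the last row of Block 1 at offset 240.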