// "out" is a reserved word in HLSL, so the blended result uses a different name.
float4 result = freshData;
result.a = ActiveFrame.w;
return (result);
}
Listing 7.2. Simplified reprojection cache.
Bilateral Filtering
Bilateral filtering is conceptually similar to bilateral upsampling. We perform
Gaussian filtering with weights influenced by a geometric similarity function
[Tomasi and Manduchi 1998]. We can treat it as an edge-aware smoothing filter
or a high-order reconstruction filter utilizing spatial coherence. Bilateral filtering
proves to be extremely efficient for content-aware data smoothing. Moreover,
with only insignificant artifacts, a bilateral filter can be separated into two directions, leading to O(n) running time. We use it for any kind of slowly-varying
data, such as ambient occlusion or shadows, that needs to be aware of scene ge-
ometry. Moreover, we use it to compensate for undersampled pixels. When a
pixel lacks samples, lacks history data, or has missed the cache, it is reconstruct-
ed from spatial coherency data. That solution leads to more plausible results
compared to relying on temporal data only. Listing 7.3 shows a separable, depth-
aware bilateral filter that uses hardware linear filtering.
float Bilateral3D5x5(sampler2D inSampler, float2 texelSize,
float2 UV, float2 Dir)
{
const float centerWeight = 0.402619947;
const float4 tapOffsets = float4(-3.5, -1.5, 1.5, 3.5);
const float4 tapWeights = float4(0.054488685, 0.244201342,
0.244201342, 0.054488685);
const float E = 1.0;                          // avoids division by zero
const float diffAmp = IN_BilateralFilterAmp;  // depth-difference sensitivity
float2 color;       // x = filtered data, y = center depth
float4 diffIp;      // depth-aware weights for the four taps
float4 pTaps[2];    // tap samples: .x/.z = data, .y/.w = depth
float2 offSize = Dir * texelSize;
pTaps[0] = UV.xyxy + tapOffsets.xxyy * offSize.xyxy;
pTaps[1] = UV.xyxy + tapOffsets.zzww * offSize.xyxy;
color = tex2D(inSampler, UV.xy).ra;
// r – contains data to be filtered
// a – geometry depth
pTaps[0].xy = tex2D(inSampler, pTaps[0].xy).ra;
pTaps[0].zw = tex2D(inSampler, pTaps[0].zw).ra;
pTaps[1].xy = tex2D(inSampler, pTaps[1].xy).ra;
pTaps[1].zw = tex2D(inSampler, pTaps[1].zw).ra;
float4 centralD = color.y;
diffIp = (1.0 / (E + diffAmp * abs(centralD - float4(pTaps[0].y,
pTaps[0].w, pTaps[1].y, pTaps[1].w)))) * tapWeights;
float Wp = 1.0 / (dot(diffIp, 1) + centerWeight);
color.r *= centerWeight;
color.r = Wp * (dot(diffIp, float4(pTaps[0].x, pTaps[0].z,
pTaps[1].x, pTaps[1].z)) + color.r);
return (color.r);
}
Listing 7.3. Directional bilateral filter working with depth data.
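Because the filter is directional, it is meant to be run twice, once per axis, with the first pass writing to an intermediate target that the second pass reads. The sketch below is only a plausible usage example; the helper names and pass setup are assumptions rather than the engine's actual interface.

float4 BilateralHorizontal(sampler2D inSampler, float2 texelSize, float2 UV)
{
    // First pass: filter along x and carry depth through in the alpha
    // channel so that the second pass stays depth-aware.
    float filtered = Bilateral3D5x5(inSampler, texelSize, UV, float2(1.0, 0.0));
    return float4(filtered, 0.0, 0.0, tex2D(inSampler, UV).a);
}

float4 BilateralVertical(sampler2D intermediate, float2 texelSize, float2 UV)
{
    // Second pass: reads the intermediate target written by the first pass.
    float filtered = Bilateral3D5x5(intermediate, texelSize, UV, float2(0.0, 1.0));
    return float4(filtered, 0.0, 0.0, tex2D(intermediate, UV).a);
}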
Spatiotemporal Coherency
We would like to combine the described techniques to take advantage of the spa-
tiotemporal coherency in the data. Our default framework works in several steps:
1. Depending on the data, caching is performed at lower resolution.
2. We operate with the history buffer (HB) and the current buffer (CB).
3. The CB is computed with a small set of current samples.
4. Samples from the HB are accumulated in the CB by means of reprojection
caching.
5. A per-pixel convergence factor is saved for further processing.
6. The CB is bilaterally filtered with a higher smoothing rate for pixels with a
lower convergence rate to compensate for smaller numbers of samples or
cache misses.
7. The CB is bilaterally upsampled to the original resolution for further use.
8. The CB is swapped with the HB.
The buffer format and processing steps differ among specific applications; a minimal sketch of how steps 4-6 can be tied together follows.
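The sketch below is illustrative only. It assumes a cache layout with the accumulated value in .r, geometry depth in .a (as in Listing 7.3), and the per-pixel convergence factor in .b; any real application would adapt the layout and blending to its own data.

float ResolveWithConvergence(sampler2D CacheBuffer, float2 texelSize,
                             float2 UV, float2 Dir)
{
    // Assumed layout: .r = accumulated data, .b = convergence, .a = depth.
    float4 cached = tex2D(CacheBuffer, UV);
    float convergence = cached.b;

    // Spatial reconstruction from the neighborhood (Listing 7.3).
    float filtered = Bilateral3D5x5(CacheBuffer, texelSize, UV, Dir);

    // Step 6: pixels with a low convergence rate fall back to the spatially
    // reconstructed value; converged pixels keep their accumulated history.
    return lerp(filtered, cached.r, convergence);
}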
7.3 Applications
Our engine is composed of several complex pixel-processing stages that include
screen-space ambient occlusion, screen-space soft shadows, subsurface scattering
for skin shading, volumetric effects, and a post-processing pipeline with depth of
field and motion blur. We use the spatiotemporal framework to accelerate most
of those stages in order to get the engine running at production-quality speeds on
current-generation consoles.
Screen-Space Ambient Occlusion
Ambient occlusion (AO) is computed by integrating the visibility function over a hemisphere $H$ with respect to the projected solid angle, as follows:

$$A(\mathbf{p}) = \frac{1}{\pi} \int_{H} V(\mathbf{p}, \omega)\, (\mathbf{N} \cdot \omega)\, d\omega,$$

where $\mathbf{N}$ is the surface normal and $V(\mathbf{p}, \omega)$ is the visibility function at $\mathbf{p}$ (such that $V(\mathbf{p}, \omega) = 0$ when the point is occluded in the direction $\omega$, and $V(\mathbf{p}, \omega) = 1$ otherwise). It can be efficiently computed in screen space by multiple occlusion checks that sample the
depth buffer around the point being shaded. However, it is extremely taxing on
the GPU due to the high sample count and large kernels that thrash the texture
cache. On current-generation consoles, it seems impractical to use more than
eight samples. In our case, we could not even afford that many because, at the
time, we had only two milliseconds left in our frame time budget.
After applying the spatiotemporal framework, we could get away with only
four samples per frame, and we achieved even higher quality than before due to
amortization over time. We computed the SSAO at half resolution and used bi-
lateral upsampling during the final frame combination pass. For each frame, we
changed the SSAO kernel sampling pattern, and we took care to generate a uni-
formly distributed pattern in order to minimize frame-to-frame inconsistencies.
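As one possible illustration of such a per-frame pattern variation (the helper below and its constants are hypothetical, not the kernel we shipped), a small uniformly distributed tap set can be rotated by a frame-dependent angle so that the four samples taken each frame amortize into a denser distribution over time.

float2 RotatedKernelTap(int tapIndex, int frameIndex)
{
    // Four base directions, uniformly distributed over the unit circle.
    static const float2 baseTaps[4] = {
        float2( 1.0,  0.0), float2( 0.0,  1.0),
        float2(-1.0,  0.0), float2( 0.0, -1.0)
    };

    // 16 distinct rotations before the pattern repeats (an assumed period).
    float angle = (frameIndex % 16) * (3.14159265 / 32.0);
    float s, c;
    sincos(angle, s, c);

    // Rotate the selected base tap by the per-frame angle.
    float2 tap = baseTaps[tapIndex];
    return float2(tap.x * c - tap.y * s, tap.x * s + tap.y * c);
}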
Due to memory constraints on consoles, we decided to rely only on depth information, leaving the surface normal vectors available only for SSAO computation. Furthermore, since we used only camera-based motion blur, we lacked per-pixel motion vectors, so an additional pass for motion field computation was out of the question. During caching, we resorted to camera reprojection only. Our cache-miss detection algorithm compensated for that by calculating a running convergence based on the distance between a history sample and the predicted valid position. That policy tended to give good results, especially considering the additional processing steps involved. After reprojection, ambient occlusion data was bilaterally filtered, taking convergence into consideration when available (PC only). Pixels with high temporal confidence retained high-frequency details, while others were reconstructed spatially depending on the convergence factor. It is worth noticing that we were switching history buffers after bilateral filtering. Therefore, we were filtering over time, which enables us to use small kernels without significant quality loss. The complete solution required only one millisecond of GPU time and enabled us to use SSAO in real time on the Xbox 360. Figure 7.7 shows final results compared to the default algorithm.

Figure 7.7. The left column shows SSAO without using spatial coherency. The right column shows our final Xbox 360 implementation.
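The distance-based cache-miss handling described above can be sketched as follows; the threshold, step constant, and buffer layout are assumptions made for illustration rather than the engine's exact values.

float UpdateConvergence(float historyDepth, float currentDepth,
                        float previousConvergence)
{
    // Relative depth difference between the cached sample and the position
    // predicted by camera reprojection.
    float diff = abs(historyDepth - currentDepth) / max(currentDepth, 1e-4);

    // IN_CacheMissThreshold and IN_ConvergenceStep are assumed tuning
    // constants: large differences count as a cache miss and restart
    // accumulation, otherwise the running convergence grows toward 1.
    float miss = step(IN_CacheMissThreshold, diff);
    return lerp(saturate(previousConvergence + IN_ConvergenceStep), 0.0, miss);
}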
Soft Shadows

Our shadowing solution works in a deferred manner. We use the spatiotemporal framework for sun shadows only since those are computationally expensive and visible all the time. First, we draw sun shadows to an offscreen low-resolution buffer. While shadow testing against a cascaded shadow map, we use a custom percentage-closer filter. For each frame, we use a different sample from a well-distributed sample set in order to leverage temporal coherence [Scherzer et al. 2007]. Reprojection caching accumulates the samples over time in a manner similar to our SSAO solution. Then the shadow buffer is bilaterally filtered in screen space and bilaterally upsampled for the final composition pass. Figures 7.8 and 7.9 show our final results for the Xbox 360 implementation.

Figure 7.8. Leveraging the spatiotemporal coherency of shadows (bottom) enables a soft, filtered look, free of undersampling artifacts, without raising the shadow map resolution of the original scene (top).