112    7. A Spatial and Temporal Coherence Framework for Real Time Graphics

Figure 7.9. The spatiotemporal framework efficiently handles shadow acne and other
flickering artifacts (right) that appear in the original scene (left).

Shadows and Ambient Occlusion Combined

Since our shadowing pipeline is similar to the one used during screen-space ambient
occlusion, we integrate both solutions in our most efficient implementation running
on consoles. Our history buffer is half the resolution of the back buffer, and it is
stored in RG16F format. The green channel stores the minimum depth of the four
underlying pixels in the Z-buffer. The red channel contains both the shadowing and
ambient occlusion information. The integer part of the 16-bit floating-point value is
used for occlusion because it requires more variety, and the fractional part holds the
shadowing factor. Functions for packing and unpacking these values are shown in
Listing 7.4. Every step of the spatiotemporal framework runs in parallel on both the
shadowing and ambient occlusion values using the packing and unpacking functions.
The last step of the framework is bilateral upsampling combined with the main
deferred shading pass. Figure 7.10 shows an overview of the pipeline. The
performance gained by using our technique on the Xbox 360 is shown in Table 7.1.
#define PACK_RANGE 31.0
#define MIN_FLT 0.01

float PackOccShadow(float Occ, float Shadow)
{
    return (floor(saturate(Occ) * PACK_RANGE) +
            clamp(Shadow, MIN_FLT, 1.0 - MIN_FLT));
}

float2 UnpackOccShadow(float OccShadow)
{
    return float2(floor(OccShadow) / PACK_RANGE, frac(OccShadow));
}
Listing 7.4. Code for shadow and occlusion data packing and unpacking.
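As a quick sanity check of the packing scheme (not part of the original text), the listing's HLSL functions can be mirrored in plain Python, with `saturate` and `frac` emulated via `math`. The round trip confirms that occlusion survives as one of 32 discrete levels in the integer part while the clamped shadow factor keeps the fractional precision:

```python
import math

PACK_RANGE = 31.0
MIN_FLT = 0.01

def saturate(x):
    # HLSL saturate(): clamp to [0, 1].
    return min(max(x, 0.0), 1.0)

def pack_occ_shadow(occ, shadow):
    # Integer part: occlusion quantized to 0..31 levels;
    # fractional part: shadow factor clamped away from 0 and 1.
    return math.floor(saturate(occ) * PACK_RANGE) + \
           min(max(shadow, MIN_FLT), 1.0 - MIN_FLT)

def unpack_occ_shadow(packed):
    # floor() recovers the occlusion level, the fraction the shadow factor.
    return math.floor(packed) / PACK_RANGE, packed - math.floor(packed)

occ, shadow = unpack_occ_shadow(pack_occ_shadow(0.5, 0.25))
# occ is quantized to 1/31 steps; shadow round-trips exactly in this case.
```

Note that a real RG16F target adds half-precision rounding on top of this quantization, which the Python sketch does not model.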
Figure 7.10. Schematic diagram of our spatiotemporal framework used with SSAO and
shadows. [Diagram: the shadow map and depth buffer feed shadow-buffer generation
and a min-depth rewrite; the depth buffer, volumetric noise, and normal buffer feed
SSAO generation; the shadow and AO terms are packed (shadow + AO, min depth)
and passed through reprojection caching against the history buffer, separable bilateral
filtering, and bilateral upsampling during the main deferred shading pass.]
Stage                  ST Framework                Reference
Shadows                0.7 ms                      3.9 ms
SSAO generation        1.1 ms                      3.8 ms
Reprojection caching   0.35 ms                     --
Bilateral filtering    0.42 ms (0.2 ms per pass)   --
Bilateral upsampling   0.7 ms                      0.7 ms
Total                  3.27 ms                     8.4 ms

Table 7.1. Performance comparison of various stages and a reference solution in which
shadowing is performed at full resolution with 2x2 jittered PCF, and SSAO uses 12 taps
and upsampling. The spatiotemporal (ST) framework is 2.5 times faster than the
reference solution and still yields better image quality.
Postprocessing
Several postprocessing effects, such as depth of field and motion blur, tend to
have high spatial and temporal coherency. Both can be expressed as a multisam-
pling problem in time and space and are, therefore, perfectly suited for our
framework. Moreover, the mixed frequency nature of both effects tends to hide
any possible artifacts. During our tests, we were able to perform production-
ready postprocessing twice as fast as with a normal non-cached approach.
Additionally, blurring is an excellent candidate for use with the spatiotem-
poral framework. Normally, when dealing with extremely large blur kernels, hi-
erarchical downsampling with filtering must be used in order to reach reasonable
performance with enough stability in high-frequency detail. Using importance
sampling for downsampling and blurring with the spatiotemporal framework, we
are able to perform high-quality Gaussian blur, using radii reaching 128 pixels in
a 720p frame, with no significant performance penalty (less than 0.2 ms on the
Xbox 360). The final quality is shown in Figure 7.11.
First, we sample nine points with linear filtering and importance sampling in
a single downscaling pass to 1/64 of the screen size. Stability is sustained by the
reprojection caching, with different subsets of samples used during each frame.
The resulting image is blurred, cached, and upsampled. Bilateral filtering is used
when needed by the application (e.g., for depth-of-field simulation where geome-
try awareness is required).
Figure 7.11. The bottom image shows the result of applying a large-kernel (128-pixel)
Gaussian blur used for volumetric water effects to the scene shown in the top image.
This process is efficient and stable on the Xbox 360 using the spatiotemporal
framework.

7.4 Future Work

There are several interesting concepts that use the spatiotemporal coherency, and we
performed experiments that produced surprisingly good results. However, due to
project deadlines, additional memory requirements, and lack of testing, those
concepts were not implemented in the final iteration of the engine. We would like to
present our findings here and improve upon them in the future.

Antialiasing

The spatiotemporal framework is also easily extended to full-scene antialiasing
(FSAA) at a reasonable performance and memory cost [Yang et al. 2009]. With
deferred renderers, we normally have to render the G-buffer and perform lighting
computation at a higher resolution. In general, FSAA buffers tend to be twice as big
as the original frame buffer in both the horizontal and vertical directions. When
enough processing power and memory are available, higher-resolution antialiasing
schemes are preferred.
The last stage of antialiasing is the downsampling process, which generates
stable, artifact-free, edge-smoothed images. Each pixel of the final frame buffer
is an average of its subsamples in the FSAA buffer. Therefore, we can easily re-
construct the valid value by looking back in time for subsamples. In our experi-
ment, we wanted to achieve 4X FSAA. We rendered each frame with a subpixel
offset, which can be achieved by manipulating the projection matrix. We as-
sumed that four consecutive frames hold the different subsamples that would
normally be available in 4X FSAA, and we used reprojection to integrate those
subsamples over time. When a sample was not valid due to disocclusion, we re-
jected it. When misses occurred, we could also perform bilateral filtering with
valid samples to leverage spatial coherency.
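For a static scene, the scheme above amounts to a running average of four jittered renderings. A minimal NumPy sketch (with a hypothetical `render` function standing in for a frame drawn with a subpixel projection offset, an illustrative 2x2 offset pattern, and no rejection or bilateral fallback) shows that the temporally integrated history matches direct 4X supersampling:

```python
import numpy as np

# Four subpixel offsets of a 2x2 grid pattern (an illustrative choice;
# the text only says the projection matrix is jittered each frame).
OFFSETS = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]

def render(offset, size=32):
    # Hypothetical renderer: samples a smooth analytic "scene" at one
    # subpixel position per pixel, mimicking a jittered projection.
    ox, oy = offset
    y, x = np.mgrid[0:size, 0:size]
    return np.sin(0.3 * (x + ox)) * np.cos(0.2 * (y + oy))

# Integrate four consecutive jittered frames with a running mean. For a
# static scene (no disocclusion, identity reprojection) the history
# buffer converges to the true 4X supersampled result.
history = None
for n, o in enumerate(OFFSETS, start=1):
    frame = render(o)
    history = frame if history is None else history + (frame - history) / n

reference = np.mean([render(o) for o in OFFSETS], axis=0)  # direct 4X FSAA
```

Under camera or object motion the reprojection is no longer the identity, which is exactly where the rejection and bilateral-filtering fallbacks described above come in.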
Our solution proved to be efficient and effective, giving results comparable
to 4X FSAA for near-static scenes and giving results of varying quality during
high-frequency motion. However, pixels in motion were subject to motion blur,
which effectively masked any artifacts produced by our antialiasing solution. In
general, the method definitely proved to be better than 2X FSAA and slightly
worse than 4X FSAA since some high-frequency detail was lost due to repeated
resampling. Furthermore, the computational cost was insignificant compared to
standard FSAA, not to mention that it has lower memory requirements (only one
additional full-resolution buffer for caching). We would like to improve upon
resampling schemes to avoid additional blurring.
High-Quality Spatiotemporal Reconstruction
We would like to present another concept to which the spatiotemporal framework
can be applied. It is similar to the one used in antialiasing. Suppose we want to
draw a full-resolution frame. During each frame, we draw a 1/n-resolution buffer,
called the refresh buffer, with a different pixel offset. We change the pattern for
each frame in order to cover the full frame of information in n frames. The final
image is computed from the refresh buffer and a high-resolution history buffer.
When the pixel being processed is not available in the history or refresh buffer,
we resort to bilateral upsampling from coarse samples. See Figure 7.12 for an
overview of the algorithm. This solution speeds up frame computation by a factor
of n, producing a properly resampled high-resolution image, with the worst-case
per-pixel resolution being 1/n of the original. Resolution loss would be mostly
visible near screen boundaries and near fast-moving objects. However, those arti-
facts may be easily masked by additional processing, like motion blur. We found
that setting n = 4 generally leads to an acceptable solution in terms of quality and
performance. However, a strict rejection and bilateral upsampling policy must be