2
G
P
G
L
Ma
r
Tait
u
22.1Int
Thi
s
gen
e
tech
n
that
inde
of e
a
Fi
gu
sphe
r
p
arti
c
2
2
P
GPUC
l
L
SL,Op
e
r
co Fratarc
a
u
s Software
It
roductio
n
s
chapter pro
v
e
ric program
m
n
ologies are
simulates a
p
rs, and plane
s
a
ch different
t
u
re 22.1. A pi
e
r
e at interacti
v
c
les.
l
othSi
m
e
nCL,a
a
ngeli
It
alia
n
v
ides a com
p
m
ing on the
G
used for im
p
p
iece of clot
h
s
(see Figure
t
echnology i
n
e
ce of cloth fa
l
v
e rates. The c
m
ulati
o
ndCU
D
p
arison stud
y
G
PU, namel
y
p
lementing a
n
h
colliding w
i
22.1). We as
s
n
terms of us
a
l
ls under the i
n
loth is compo
s
o
nUsin
g
D
A
y
between th
r
y
, GLSL, CU
n
interactive
i
th simple pr
i
s
ess the adva
n
a
bility and pe
r
n
fluence of gr
a
s
ed of 780,00
0
g
r
ee popular
p
U
DA, and Op
e
physically-
ba
i
mitives like
s
n
tages and t
h
r
formance.
avity while c
o
0
springs con
n
p
latforms for
e
nCL. These
a
sed method
s
pheres, cyl-
h
e drawbacks
o
lliding with a
n
ecting 65,000
365
366 22.GPGPUClothSimulationUsing GLSL,OpenCL,andCUDA
Figure 22.2. A
4
4 grid of particle vertices and the springs for one of the particles.
22.2NumericalAlgorithm
This section provides a brief overview of the theory behind the algorithm used in
computing the cloth simulation. A straightforward way to implement elastic net-
works of particles is by using a mass-spring system. Given a set of evenly spaced
particles on a grid, each particle is connected to its neighbors through simulated
springs, as depicted in Figure 22.2. Each spring applies to the connected particles
a force
spring
F
:
spring
kb

0
F
ll x
,
where l represents the current length of the spring (i.e., its magnitude is the dis-
tance between the connected particles),
0
l
represents the rest length of the spring
at the beginning of the simulation, k is the stiffness constant,
x
is the velocity of
the particle, and b is the damping constant. This equation means that a spring
always applies a force that brings the distance between the connected particles
back to its initial rest length. The more the current distance diverges from the rest
length, then the larger is the applied force. This force is damped proportionally to
the current velocity of the particles by the last term in the equation. The blue
springs in Figure 22.2 simulate the stretch stress of the cloth, while the longer red
ones simulate the shear and bend stresses.
For each particle, the numerical algorithm that computes its dynamics is
schematically illustrated in Figure 22.3. For each step of the dynamic simulation,
(i, j)
(i, j + 1)
(i + 1, j 1)
(i 2, j 2)
22.2NumericalAlgorithm 367
Figure 22.3. Numerical algorithm for computing the cloth simulation.
the spring forces and other external forces (e.g., gravity) are applied to the parti-
cles, and then their dynamics are computed according to the Verlet method [Mül-
ler 2008] applied to each particle in the system through the following steps:
1.
 

ΔΔtttttxxx
.
2.
  
,tttmxFxx

.
3.
2
Δ 2 ΔΔtt t tt tt xxxx

.
Here,
t
F
is the current total force applied to the particle, m is the particle mass,
tx

is its acceleration,
tx
is the velocity,
tx
is the current position, and
Δ
t
is
the time step of the simulation (i.e., how much time the simulation is advanced
for each iteration of the algorithm).
The Verlet method is very popular in real-time applications because it is
simple and fourth-order accurate, meaning that the error for the position compu-
tation is

4
ΔOt
. This makes the Verlet method two orders of magnitude more
precise than the explicit Euler method, and at the same time, it avoids the compu-
Initial state
Compute forces F(t):
springs and gravity
Compute acceleration
Compute new state
Update state
Handle collisions
 
ttmxF

 
00
,ttxx

Δ , Δtt ttxx
  
, Δ , Δt t tt tt xx x x

368 22.GPGPUClothSimulationUsing GLSL,OpenCL,andCUDA
tational cost involved in the Runge-Kutta fourth-order method. In the Verlet
scheme, however, velocity is only first-order accurate; in this case, this is not
really important because velocity is considered only for damping the springs.
22.3CollisionHandling
Generally, collision handling is composed of two phases, collision detection and
collision response. The outcome of collision detection is the set of particles that
are currently colliding with some other primitive. Collision response defines how
these collisions are solved to bring the colliding particles to a legal state (i.e., not
inside a collision primitive). One of the key advantages of the Verlet integration
scheme is the easiness of handling collision response. The position at the next
time step depends only on the current position and the position at the previous
step. The velocity is then estimated by subtracting one from the other. Thus, to
solve a collision, it is sufficient to modify the current position of the colliding
particle to bring it into a legal state, for example, by moving it perpendicularly
out toward the collision surface. The change to the velocity is then handled au-
tomatically by considering this new position. This approach is fast and stable,
even though it remains valid only when the particles do not penetrate too far.
In our cloth simulation, as the state of the particle is being updated, if the
collision test is positive, the particle is displaced into a valid state. For example,
let’s consider a stationary sphere placed into the scene. In this simple case, a col-
lision between the sphere and a particle happens when the following condition is
satisfied:
Δ 0tt r
xc
,
where
c and r are the center and the radius of the sphere, respectively. If a colli-
sion occurs, then it is handled by moving the particle into a valid state by moving
its position just above the surface of the sphere. In particular, the particle should
be displaced along the normal of the surface at the impact point. The position of
the particle is updated according to the formula

Δ
Δ
tt
tt
xc
d
xc
,
Δtt r
xcd
,
where

Δtt
x
is the updated position after the collision. If the particle does not
penetrate too far,
d can be considered as an acceptable approximation of the
normal to the surface at the impact point.
22.4CPUImplementation 369
22.4CPUImplementation
We first describe the implementation of the algorithm for the CPU as a reference
for the implementations on the GPU described in the following sections.
During the design of an algorithm for the GPU, it is critical to minimize the
amount of data that travels on the main memory bus. The time spent on the bus is
actually one of the primary bottlenecks that strongly penalize performance [Nvid-
ia 2010]. The transfer bandwidth of a standard PCI-express bus is 2 to 8 GB per
second. The internal bus bandwidth of a modern GPU is approximately 100 to
150 GB per second. It is very important, therefore, to minimize the amount of
data that travels on the bus and keep the data on the GPU as much as possible.
In the case of cloth simulation, only the current and the previous positions of
the particles are needed on the GPU. The algorithm computes directly on GPU
the rest distance of the springs and which particles are connected by the springs.
The state of each particle is represented by the following attributes:
1. The current position (four floating-point values).
2. The previous position (four floating-point values).
3. The current normal vector (four floating-point values).
Even though the normal vector is computed during the simulation, it is used
only for rendering purposes and does not affect the simulation dynamics. Here,
the normal vector of a particle is defined to be the average of the normal vectors
of the triangulated faces to which the particle belongs. A different array is created
for storing the current positions, previous positions, and normal vectors. As ex-
plained in later sections of this chapter, for the GPU implementation, these at-
tributes are loaded as textures or buffers into video memory. Each array stores
the attributes for all the particles. The size of each array is equal to the size of an
attribute (four floating-point values) multiplied by the number of particles. For
example, the position of the i-th particle
i
p
is stored in the positions array and
accessed as follows:
i
p
os
vec3(in_pos[i * 4], in_pos[i * 4 + 1], in_pos[i * 4 + 2],
in_pos[i * 4 + 3])
The cloth is built as a grid of
nn
particles, where n is the number of parti-
cles composing one side of the grid. Regardless of the value of n, the horizontal
and the vertical spatial dimensions of the grid are always normalized to the range
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset