22. GPGPU Cloth Simulation Using GLSL, OpenCL, and CUDA (1/3)


Marco Fratarcangeli
Taitus Software Italia

22.1 Introduction

This chapter provides a comparison study between three popular platforms for generic programming on the GPU, namely, GLSL, CUDA, and OpenCL. These technologies are used for implementing an interactive physically-based method that simulates a piece of cloth colliding with simple primitives like spheres, cylinders, and planes (see Figure 22.1). We assess the advantages and the drawbacks of each different technology in terms of usability and performance.

Figure 22.1. A piece of cloth falls under the influence of gravity while colliding with a sphere at interactive rates. The cloth is composed of 780,000 springs connecting 65,000 particles.


Figure 22.2. A 4 × 4 grid of particle vertices and the springs for one of the particles.

22.2NumericalAlgorithm

This section provides a brief overview of the theory behind the algorithm used in computing the cloth simulation. A straightforward way to implement elastic networks of particles is by using a mass-spring system. Given a set of evenly spaced particles on a grid, each particle is connected to its neighbors through simulated springs, as depicted in Figure 22.2. Each spring applies to the connected particles a force $\mathbf{F}_{\text{spring}}$:

$$\mathbf{F}_{\text{spring}} = -k\,(\mathbf{l} - \mathbf{l}_0) - b\,\dot{\mathbf{x}},$$

where $\mathbf{l}$ represents the current length of the spring (i.e., its magnitude is the distance between the connected particles), $\mathbf{l}_0$ represents the rest length of the spring at the beginning of the simulation, $k$ is the stiffness constant, $\dot{\mathbf{x}}$ is the velocity of the particle, and $b$ is the damping constant. This equation means that a spring always applies a force that brings the distance between the connected particles back to its initial rest length. The more the current distance diverges from the rest length, the larger the applied force. This force is damped proportionally to the current velocity of the particles by the last term in the equation. The blue springs in Figure 22.2 simulate the stretch stress of the cloth, while the longer red ones simulate the shear and bend stresses.
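The spring force described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `spring_force` and its parameters are hypothetical, the spring vector is taken to point from the neighbor to the particle, and the rest vector is assumed to be that same direction scaled to the rest length.

```python
import math


def spring_force(x_a, x_b, v_a, rest_length, k, b):
    """Force applied to particle a by the spring joining particles a and b.

    Assumed convention (not stated in the text): l = x_a - x_b, and l_0 is l
    rescaled to the rest length, so -k * (l - l_0) restores the rest distance.
    """
    l = [pa - pb for pa, pb in zip(x_a, x_b)]
    length = math.sqrt(sum(c * c for c in l))
    if length == 0.0:
        # Direction is undefined when both particles coincide: no elastic term.
        elastic = [0.0] * len(l)
    else:
        scale = rest_length / length  # l_0 = l * scale
        elastic = [-k * (c - c * scale) for c in l]
    # The last term damps the force proportionally to the particle velocity.
    return [e - b * v for e, v in zip(elastic, v_a)]
```

A stretched spring pulls the particle toward its neighbor, a compressed one pushes it away, and a moving particle receives an additional force opposing its velocity.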

For each particle, the numerical algorithm that computes its dynamics is schematically illustrated in Figure 22.3. For each step of the dynamic simulation, the spring forces and other external forces (e.g., gravity) are applied to the particles, and then their dynamics are computed according to the Verlet method [Müller 2008] applied to each particle in the system through the following steps:

$$\dot{\mathbf{x}}(t) = \frac{\mathbf{x}(t) - \mathbf{x}(t - \Delta t)}{\Delta t},$$

$$\ddot{\mathbf{x}}(t) = \frac{\mathbf{F}(\mathbf{x}(t), \dot{\mathbf{x}}(t))}{m},$$

$$\mathbf{x}(t + \Delta t) = 2\,\mathbf{x}(t) - \mathbf{x}(t - \Delta t) + \ddot{\mathbf{x}}(t)\,\Delta t^2.$$

Here, $\mathbf{F}(\mathbf{x}(t), \dot{\mathbf{x}}(t))$ is the current total force applied to the particle, $m$ is the particle mass, $\ddot{\mathbf{x}}(t)$ is its acceleration, $\dot{\mathbf{x}}(t)$ is the velocity, $\mathbf{x}(t)$ is the current position, and $\Delta t$ is the time step of the simulation (i.e., how much time the simulation is advanced for each iteration of the algorithm).

Figure 22.3. Numerical algorithm for computing the cloth simulation.
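The three Verlet steps can be written as a single per-particle update. The following is a minimal sketch, not the chapter's implementation: `verlet_step` and `force_fn` are hypothetical names, and `force_fn` is assumed to return the total force for a given position and velocity.

```python
def verlet_step(x, x_prev, force_fn, mass, dt):
    """Advance one particle by one time step with the Verlet scheme.

    x and x_prev are the current and previous positions; force_fn(x, v) is an
    assumed callback returning the total force (springs, gravity, ...).
    Returns the pair (next position, new previous position).
    """
    # Step 1: velocity estimated from the two stored positions.
    velocity = [(c - p) / dt for c, p in zip(x, x_prev)]
    # Step 2: acceleration from the total force.
    force = force_fn(x, velocity)
    accel = [f / mass for f in force]
    # Step 3: next position from current position, previous position, acceleration.
    x_next = [2.0 * c - p + a * dt * dt for c, p, a in zip(x, x_prev, accel)]
    # The current position becomes the previous one for the next iteration.
    return x_next, list(x)
```

Only two positions per particle are carried between iterations, which is why no explicit velocity needs to be stored.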

The Verlet method is very popular in real-time applications because it is simple and fourth-order accurate, meaning that the error of the position computation at each step is $O(\Delta t^4)$. This makes the Verlet method two orders more accurate than the explicit Euler method, whose per-step position error is $O(\Delta t^2)$, and at the same time, it avoids the computational cost involved in the Runge-Kutta fourth-order method. In the Verlet scheme, however, velocity is only first-order accurate; in this case, this is not really important because velocity is considered only for damping the springs.

The flowchart in Figure 22.3 contains the following boxes: initial state $\mathbf{x}(t), \dot{\mathbf{x}}(t)$; compute forces $\mathbf{F}(t)$ (springs and gravity); compute acceleration $\ddot{\mathbf{x}}(t) = \mathbf{F}(t)/m$; compute new state $\mathbf{x}(t + \Delta t), \dot{\mathbf{x}}(t + \Delta t)$; update state $\mathbf{x}(t), \mathbf{x}(t - \Delta t) \leftarrow \mathbf{x}(t + \Delta t), \mathbf{x}(t)$; handle collisions.
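The difference in per-step accuracy can be checked numerically. The sketch below is an illustration on an assumed test problem, the oscillator $\ddot{x} = -x$ with exact solution $x(t) = \sin t$, and all function names are hypothetical. Halving the time step should shrink the one-step Verlet error by roughly $2^4 = 16$ and the one-step explicit Euler error by roughly $2^2 = 4$.

```python
import math


def local_errors(dt, t=1.0):
    """One-step position errors of Verlet and explicit Euler for x'' = -x.

    Both schemes start from exact data taken from x(t) = sin(t), so the
    returned values are local (single-step) errors only.
    """
    exact_next = math.sin(t + dt)
    x, x_prev = math.sin(t), math.sin(t - dt)
    verlet_next = 2.0 * x - x_prev + (-x) * dt * dt
    euler_next = x + math.cos(t) * dt
    return abs(verlet_next - exact_next), abs(euler_next - exact_next)


def error_ratios(dt):
    """Ratio of the errors at dt and dt / 2 for (Verlet, Euler)."""
    verlet_a, euler_a = local_errors(dt)
    verlet_b, euler_b = local_errors(dt / 2.0)
    return verlet_a / verlet_b, euler_a / euler_b
```

Calling `error_ratios(0.1)` gives a pair close to (16, 4). This measures the error of a single step only; it says nothing about how errors accumulate over a long simulation.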

22.3CollisionHandling

Generally, collision handling is composed of two phases, collision detection and collision response. The outcome of collision detection is the set of particles that are currently colliding with some other primitive. Collision response defines how these collisions are solved to bring the colliding particles to a legal state (i.e., not inside a collision primitive). One of the key advantages of the Verlet integration scheme is the ease of handling collision response. The position at the next time step depends only on the current position and the position at the previous step. The velocity is then estimated by subtracting one from the other. Thus, to solve a collision, it is sufficient to modify the current position of the colliding particle to bring it into a legal state, for example, by moving it perpendicularly out toward the collision surface. The change to the velocity is then handled automatically by considering this new position. This approach is fast and stable, even though it remains valid only when the particles do not penetrate too far.

In our cloth simulation, as the state of the particle is being updated, if the collision test is positive, the particle is displaced into a valid state. For example, let’s consider a stationary sphere placed into the scene. In this simple case, a collision between the sphere and a particle happens when the following condition is satisfied:

$$\lVert \mathbf{x}(t + \Delta t) - \mathbf{c} \rVert - r < 0,$$

where $\mathbf{c}$ and $r$ are the center and the radius of the sphere, respectively. If a collision occurs, then it is handled by moving the particle to a valid position just above the surface of the sphere. In particular, the particle should be displaced along the normal of the surface at the impact point. The position of the particle is updated according to the formula

$$\mathbf{d} = \mathbf{x}(t + \Delta t) - \mathbf{c}, \qquad \mathbf{x}'(t + \Delta t) = \mathbf{c} + \frac{\mathbf{d}}{\lVert \mathbf{d} \rVert}\, r,$$

where $\mathbf{x}'(t + \Delta t)$ is the updated position after the collision. If the particle does not penetrate too far, $\mathbf{d}$ can be considered as an acceptable approximation of the normal to the surface at the impact point.
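The sphere test and the position correction fit in one small function. This is a sketch under assumptions: the name `resolve_sphere_collision` is hypothetical, and a particle lying exactly at the sphere center is left untouched because the direction of $\mathbf{d}$ is undefined there.

```python
import math


def resolve_sphere_collision(x_next, center, radius):
    """Return the corrected position of a particle tested against a sphere.

    x_next is the freshly integrated position x(t + dt). If it lies inside the
    sphere, it is projected along d = x_next - center onto the surface.
    """
    d = [p - c for p, c in zip(x_next, center)]
    dist = math.sqrt(sum(c * c for c in d))
    # No collision, or degenerate case (assumed policy: leave unchanged).
    if dist - radius >= 0.0 or dist == 0.0:
        return list(x_next)
    scale = radius / dist  # rescales d to the sphere radius
    return [c + dc * scale for c, dc in zip(center, d)]
```

Because the Verlet scheme derives velocity from the stored positions, overwriting the new position with the corrected one is the whole collision response; no separate velocity update is needed.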

22.4CPUImplementation 369

22.4CPUImplementation

We first describe the implementation of the algorithm for the CPU as a reference for the implementations on the GPU described in the following sections.

During the design of an algorithm for the GPU, it is critical to minimize the amount of data that travels on the main memory bus. The time spent on the bus is one of the primary bottlenecks that strongly penalize performance [Nvidia 2010]. The transfer bandwidth of a standard PCI Express bus is 2 to 8 GB per second. The internal bus bandwidth of a modern GPU is approximately 100 to 150 GB per second. It is very important, therefore, to minimize the amount of data that travels on the bus and keep the data on the GPU as much as possible.
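A quick back-of-the-envelope calculation shows the size of that gap. The helper below is purely illustrative: the name `transfer_time_ms` is hypothetical, 1 GB is assumed to be 10^9 bytes, and the bandwidth figures are the nominal ones quoted above rather than measurements.

```python
def transfer_time_ms(num_bytes, bandwidth_gb_per_s):
    """Idealized time in milliseconds to move num_bytes at a given bandwidth.

    Assumes 1 GB = 1e9 bytes and ignores latency and driver overhead.
    """
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3


def slowdown(slow_gb_per_s, fast_gb_per_s):
    """How many times longer a transfer takes on the slower bus."""
    return fast_gb_per_s / slow_gb_per_s
```

For example, `transfer_time_ms(1_000_000, 2.0)` is about 0.5 ms for a hypothetical 1 MB payload on the slowest quoted bus, while the same payload at 100 GB per second takes about 0.01 ms, a factor of `slowdown(2.0, 100.0)`, that is, 50.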

In the case of cloth simulation, only the current and the previous positions of the particles are needed on the GPU. The algorithm computes directly on the GPU the rest distance of the springs and which particles are connected by the springs. The state of each particle is represented by the following attributes:

1. The current position (four floating-point values).

2. The previous position (four floating-point values).

3. The current normal vector (four floating-point values).

Even though the normal vector is computed during the simulation, it is used only for rendering purposes and does not affect the simulation dynamics. Here, the normal vector of a particle is defined to be the average of the normal vectors of the triangulated faces to which the particle belongs. A separate array is created for each of the three attributes: current positions, previous positions, and normal vectors. As explained in later sections of this chapter, for the GPU implementation, these attributes are loaded as textures or buffers into video memory. Each array stores the attributes for all the particles. The size of each array is equal to the size of an attribute (four floating-point values) multiplied by the number of particles. For example, the position of the i-th particle is stored in the positions array and accessed as follows:



vec4(in_pos[i * 4], in_pos[i * 4 + 1], in_pos[i * 4 + 2],
     in_pos[i * 4 + 3])
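The same flat layout can be mimicked on the CPU with plain lists. The sketch below only illustrates the indexing scheme; the helper names are hypothetical and are not taken from the chapter's source code.

```python
FLOATS_PER_ATTRIBUTE = 4  # each attribute is four floating-point values


def make_attribute_array(num_particles):
    """Allocate one flat array holding a four-float attribute per particle."""
    return [0.0] * (num_particles * FLOATS_PER_ATTRIBUTE)


def get_attribute(array, i):
    """Read the four floats of the i-th particle."""
    base = i * FLOATS_PER_ATTRIBUTE
    return tuple(array[base:base + FLOATS_PER_ATTRIBUTE])


def set_attribute(array, i, value):
    """Write the four floats of the i-th particle."""
    if len(value) != FLOATS_PER_ATTRIBUTE:
        raise ValueError("an attribute has exactly four components")
    base = i * FLOATS_PER_ATTRIBUTE
    array[base:base + FLOATS_PER_ATTRIBUTE] = list(value)
```

For instance, after `positions = make_attribute_array(3)` and `set_attribute(positions, 1, (1.0, 2.0, 3.0, 1.0))`, the call `get_attribute(positions, 1)` returns the same four values, and the array length stays at 12 floats.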

The cloth is built as a grid of $n \times n$ particles, where $n$ is the number of particles composing one side of the grid. Regardless of the value of $n$, the horizontal and the vertical spatial dimensions of the grid are always normalized to the range
