G

Gateways, Ethernet, F-79

Gather-Scatter

definition, 309

GPU comparisons, 329

multimedia instruction compiler support, A-31

sparse matrices, G-13 to G-14

vector architectures, 279–280

GCD See Greatest common divisor (GCD) test

GDDR See Graphics double data rate (GDDR)

GDRAM See Graphics dynamic random-access memory (GDRAM)

GE 645, L-9

General-Purpose Computing on GPUs (GPGPU), L-51 to L-52

General-purpose electronic computers, historical background, L-2 to L-4

General-purpose registers (GPRs)

advantages/disadvantages, A-6

IA-64, H-38

Intel 80x86, K-48

ISA classification, A-3 to A-5

MIPS data transfers, A-34

MIPS operations, A-36

MIPS64, A-34

VMIPS, 265

GENI See Global Environment for Network Innovation (GENI)

Geometric means, example calculations, 43–44

GFS See Google File System (GFS)

Gibson mix, L-6

Giga Thread Engine, definition, 292, 314

Global address space, segmented virtual memory, B-52

Global code scheduling

example, H-16

parallelism, H-15 to H-23

superblock scheduling, H-21 to H-23, H-22

trace scheduling, H-19 to H-21, H-20

Global common subexpression elimination, compiler structure, A-26

Global data area, and compiler technology, A-27

Global Environment for Network Innovation (GENI), F-98

Global load/store, definition, 309

Global Memory

definition, 292, 314

GPU programming, 290

locks via coherence, 390

Global miss rate

definition, B-31

multilevel caches, B-33

Global optimizations

compilers, A-26, A-29

optimization types, A-28

Global Positioning System, CDMA, E-25

Global predictors

Intel Core i7, 166

tournament predictors, 164–166

Global scheduling, ILP, VLIW processor, 194

Global system for mobile communication (GSM), cell phones, E-25

Goldschmidt’s division algorithm, J-29, J-61

Goldstine, Herman, L-2 to L-3

Google

Bigtable, 438, 441

cloud computing, 455

cluster history, L-62

containers, L-74

MapReduce, 437, 458–459, 459

server CPUs, 440

server power-performance benchmarks, 439–441

WSCs, 432, 449

containers, 464–465, 465

cooling and power, 465–468

monitoring and repairing, 469–470

PUE, 468

servers, 467, 468–469

Google App Engine, L-74

Google Clusters

memory dependability, 104

power consumption, F-85

Google File System (GFS)

MapReduce, 438

WSC storage, 442–443

Google Goggles

PMDs, 6

user experience, 4

Google search

shared-memory workloads, 369

workload demands, 439

Gordon Bell Prize, L-57

GPGPU (General-Purpose Computing on GPUs), L-51 to L-52

GPRs See General-purpose registers (GPRs)

GPU (Graphics Processing Unit)

banked and graphics memory, 322–323

computing history, L-52

definition, 9

DLP

basic considerations, 288

basic PTX thread instructions, 299

conditional branching, 300–303

coprocessor relationship, 330–331

definitions, 309

Fermi GPU architecture innovations, 305–308

Fermi GTX 480 floorplan, 295

GPUs vs. vector architectures, 308–312, 310

mapping examples, 293

Multimedia SIMD comparison, 312

multithreaded SIMD Processor block diagram, 294

NVIDIA computational structures, 291–297

NVIDIA/CUDA and AMD terminology, 313–315

NVIDIA GPU ISA, 298–300

NVIDIA GPU Memory structures, 304, 304–305

programming, 288–291

SIMD thread scheduling, 297

terminology, 292

fine-grained multithreading, 224

future features, 332

gather/scatter operations, 280

historical background, L-50

loop-level parallelism, 150

vs. MIMD with Multimedia SIMD, 324–330

mobile client/server features, 324, 324

power/DLP issues, 322

raw/relative performance, 328

Roofline model, 326

scalable, L-50 to L-51

strided access-TLB interactions, 323

thread count and memory performance, 332

TLP, 346

vector kernel implementation, 334–336

vs. vector processor operation, 276

GPU Memory

caches, 306

CUDA program, 289

definition, 292, 309, 314

future architectures, 333

GPU programming, 288

NVIDIA, 304, 304–305

splitting from main memory, 330

Gradual underflow, J-15, J-36

Grain size

MIMD, 10

TLP, 346

Grant phase, arbitration, F-49

Graph coloring, register allocation, A-26 to A-27

Graphics double data rate (GDDR)

characteristics, 102

Fermi GTX 480 GPU, 295, 324

Graphics dynamic random-access memory (GDRAM)

bandwidth issues, 322–323

characteristics, 102

Graphics-intensive benchmarks, desktop performance, 38

Graphics pipelines, historical background, L-51

Graphics Processing Unit See GPU (Graphics Processing Unit)

Graphics synchronous dynamic random-access memory (GSDRAM), characteristics, 102

Graphics Synthesizer, Sony PlayStation 2, E-16, E-16 to E-17

Greater than condition code, PowerPC, K-10 to K-11

Greatest common divisor (GCD) test, loop-level parallelism dependences, 319, H-7

Grid

arithmetic intensity, 286

CUDA parallelism, 290

definition, 292, 309, 313

and GPU, 291

GPU Memory structures, 304

GPU terms, 308

mapping example, 293

NVIDIA GPU computational structures, 291

SIMD Processors, 295

Thread Blocks, 295

Grid computing, L-73 to L-74

Grid topology

characteristics, F-36

direct networks, F-37

GSDRAM See Graphics synchronous dynamic random-access memory (GSDRAM)

GSM See Global system for mobile communication (GSM)

Guest definition, 108

Guest domains, Xen VM, 111

Table of Contents for Computer Architecture: A Quantitative Approach
