V
Valid bit
address translation,
B-46
block identification,
B-7
paged virtual memory,
B-56
segmented virtual memory,
B-52
symmetric shared-memory multiprocessors,
366
Value prediction
hardware-based speculation,
192
Variable length encoding
control flow instruction branches,
A-18
Variables
loop-level parallelism,
316
procedure invocation options,
A-19
random, distribution, D-26 to D-34
TLP programmer’s viewpoint,
394
Vector architectures
computer development, L-44 to L-49
DLP
basic considerations,
264
vector load/store unit bandwidth,
276–277
GPU conditional branching,
303
memory systems, G-9 to G-11
multimedia instruction compiler support,
A-31
vs. Multimedia SIMD Extensions,
282
peak performance vs. start-up overhead,
331
start-up latency and dead time,
G-8
strided access-TLB interactions,
323
vector-register characteristics,
G-3
Vector Functional Unit
vector execution time,
269
vector sequence chimes,
270
Vector Instruction
instruction-level parallelism,
150
Multimedia SIMD Extensions,
282
Thread of Vector Instructions,
292
vector execution time,
269
vector processor example,
268
Vectorizable Loop
Livermore Fortran kernel performance,
331
NVIDIA GPU computational structures,
291
Vectorized code
multimedia compiler support,
A-31
vector architecture programming,
280–282
vector execution time,
271
Vectorizing compilers
effectiveness, G-14 to G-15
FORTRAN test kernels,
G-15
sparse matrices, G-12 to G-13
Vector Lane Registers, definition,
292
Vector-length register (VLR)
Vector-mask control, characteristics,
275–276
Vector Processor
compiler vectorization,
281
DSP media extensions, E-10
historical background, G-26
loop-level parallelism,
150
multiprocessor architecture,
346
NVIDIA GPU computational structures,
291
peak performance focus,
331
performance, G-2 to G-7
start-up and multiple lanes, G-7 to G-9
performance comparison,
58
performance enhancement
DAXPY on VMIPS, G-19 to G-21
sparse matrices, G-12 to G-14
Sony PlayStation 2 Emotion Engine, E-17 to E-18
vector/GPU comparison,
308
vector kernel implementation,
334–336
VMIPS on Linpack, G-17 to G-19
Vector Registers
multimedia compiler support,
A-31
Multimedia SIMD Extensions,
282
performance/bandwidth trade-offs,
332
Very-large-scale integration (VLSI)
early computer arithmetic, J-63
interconnection network topology, F-29
Very Long Instruction Word (VLIW)
compiler scheduling, L-31
loop-level parallelism,
315
multithreading history, L-34
TI 320C6x DSP, E-8 to E-10
Video games, multimedia support, K-17
Virtual address
address translation,
B-46
AMD64 paged virtual memory,
B-55
GPU conditional branching,
303
mapping to physical,
B-45
memory hierarchy basics,
77–78
miss rate vs. cache size,
B-37
page table-based mapping,
B-45
Virtual channels (VCs), F-47
switch microarchitecture pipelining, F-61
system area network history, F-101
Virtual cut-through switching, F-51
Virtual functions, control flow instructions,
A-18
Virtualizable architecture
system call performance,
141
Virtual Machines support,
109
Virtualizable GPUs, future technology,
333
Virtual machine monitor (VMM)
Virtual Machines ISA support,
109–110
Virtual Machines (VMs)
cloud computing costs,
471
and virtual memory and I/O,
110–111
Virtual memory
fast address translation,
B-46
Multimedia SIMD Extensions,
284
Pentium vs. Opteron protection,
B-57
strided access-TLB interactions,
323
Virtual methods, control flow instructions,
A-18
Virtual output queues (VOQs), switch microarchitecture, F-60
VME rack
Internet Archive Cluster, D-37
VMIPS
double-precision FP operations,
266
enhanced, DAXPY performance, G-19 to G-21
gather/scatter operations,
280
Multimedia SIMD Extensions,
282
peak performance on DAXPY, G-17
performance on Linpack, G-17 to G-19
vector-length registers,
274
vector load/store unit bandwidth,
276
vector performance measures, G-16
Voltage regulator controller (VRC), Intel SCCC, F-70
Voltage regulator modules (VRMs), WSC server energy efficiency,
462
Volume-cost relationship, components,
27–28
Von Neumann, John, L-2 to L-6
Von Neumann computer, L-3