W
Wafers
integrated circuit cost trends, 28–32
Waiting line, definition, D-24
Wait time, shared-media networks, F-23
Wallace tree
historical background, J-63
Wall-clock time
scientific applications on parallel processors, I-33
Warehouse-scale computers (WSCs)
cluster history, L-72 to L-73
computer cluster forerunners, 435–436
facility capital costs, 472
network as bottleneck, 461
physical infrastructure and costs, 446–450
programming models and workloads, 436–441
query response-time curve, 482
SPECPower benchmarks, 463
Warp, L-31
terminology comparison, 314
Warp Scheduler
Multithreaded SIMD Processor, 294
Wavelength division multiplexing (WDM), WAN history, F-98
Way prediction, cache optimization, 81–82
Weak ordering, relaxed consistency models, 395
Weak scaling, Amdahl’s law and parallel computers, 406–407
Web index search, shared-memory workloads, 369
Web servers
benchmarking, D-20 to D-21
dependability benchmarks, D-21
ILP for realizable processors, 218
performance benchmarks, 40
Weighted arithmetic mean time, D-27
Weitek 3364
arithmetic functions, J-58 to J-61
West-first routing, F-47 to F-48
Wide area networks (WANs)
cross-company interoperability, F-64
effective bandwidth, F-18
historical overview, F-97 to F-99
interconnection network domain relationship, F-4
latency and effective bandwidth, F-26 to F-28
packet latency, F-13, F-14 to F-16
Window
processor performance calculations, 218
scoreboarding definition, C-78
Windowing, congestion management, F-65
Window size
ILP for realizable processors, 216–217
Wireless networks
and cell phones, E-21 to E-22
Within instruction exceptions
instruction set complications, C-50
stopping/restarting execution, C-46
Word count, definition, B-53
Word displacement addressing, VAX, K-67
Words
aligned/misaligned addresses, A-8
AMD Opteron data cache, B-15
MIPS data transfers, A-34
MIPS unaligned reads, K-26
Working set effect, definition, I-24
Workloads
Java and PARSEC without SMT, 403–404
RAID performance prediction, D-57 to D-59
WSC goals/requirements, 433
WSC resource allocation case study, 478–479
Wormhole switching, F-51, F-88
performance issues, F-92 to F-93
system area network history, F-101
Worst-case execution time (WCET), definition, E-4
Write after read (WAR)
dynamic scheduling with Tomasulo’s algorithm, 170–171
hazards and forwarding, C-55
ILP limitation studies, 220
multiple-issue processors, L-28
register renaming vs. ROB, 208
Write after write (WAW)
dynamic scheduling with Tomasulo’s algorithm, 170–171
execution sequences, C-80
ILP limitation studies, 220
microarchitectural techniques case study, 253
multiple-issue processors, L-28
register renaming vs. ROB, 208
Write allocate
AMD Opteron data cache, B-12
example calculation, B-12
Write-back cache
coherence maintenance, 381
directory-based cache coherence, 383, 386
memory hierarchy basics, 75
Write-back cycle (WB)
basic MIPS pipeline, C-36
data hazard stall minimization, C-17
execution sequences, C-80
MIPS pipeline control, C-39
pipeline branch issues, C-40
simple MIPS implementation, C-33
simple RISC implementation, C-6
Write broadcast protocol, definition, 356
Write buffer
AMD Opteron data cache, B-14
memory hierarchy basics, 75
write merging example, 88
Write hit
directory-based coherence, 424
single-chip multicore multiprocessor, 414
Write invalidate protocol
directory-based cache coherence protocol example, 382–383
Write merging
miss penalty reduction, 87
Write miss
example calculation, B-12
memory hierarchy basics, 76–77
memory stall clock cycles, B-4
snooping cache coherence, 365
write speed calculations, 393
Write result stage
hardware-based speculation, 192
status table examples, C-77
Write serialization
multiprocessor cache coherency, 353
Write stall, definition, B-11
Write-through cache
average memory access time, B-16
memory hierarchy basics, 74–75
Write update protocol, definition, 356