E

Early restart, miss penalty reduction, 86

Earth Simulator, L-46, L-48, L-63

EBS See Elastic Block Storage (EBS)

EC2 See Amazon Elastic Compute Cloud (EC2)

ECC See Error-Correcting Code (ECC)

Eckert, J. Presper, L-2 to L-3, L-5, L-19

Eckert-Mauchly Computer Corporation, L-4 to L-5, L-56

ECL minicomputer, L-19

Economies of scale

WSC vs. datacenter costs, 455–456

WSCs, 434

EDSAC (Electronic Delay Storage Automatic Calculator), L-3

EDVAC (Electronic Discrete Variable Automatic Computer), L-2 to L-3

EEMBC See Electronic Design News Embedded Microprocessor Benchmark Consortium (EEMBC)

EEPROM (Electronically Erasable Programmable Read-Only Memory)

compiler-code size considerations, A-44

Flash Memory, 102–104

memory hierarchy design, 72

Effective address

ALU, C-7, C-33

data dependences, 152

definition, A-9

execution/effective address cycle, C-6, C-31 to C-32, C-63

hardware-based speculation, 186, 190, 192

load interlocks, C-39

load-store, 174, 176, C-4

RISC instruction set, C-4 to C-5

simple MIPS implementation, C-31 to C-32

simple RISC implementation, C-6

TLB, B-49

Tomasulo’s algorithm, 173, 178, 182

Effective bandwidth

definition, F-13

example calculations, F-18

vs. interconnected nodes, F-28

interconnection networks

multi-device networks, F-25 to F-29

two-device networks, F-12 to F-20

vs. packet size, F-19

Efficiency factor, F-52

Eight-way set associativity

ARM Cortex-A8, 114

cache optimization, B-29

conflict misses, B-23

data cache misses, B-10

Elapsed time, execution time, 36

Elastic Block Storage (EBS), MapReduce cost calculations, 458–460, 459

Electronically Erasable Programmable Read-Only Memory See EEPROM (Electronically Erasable Programmable Read-Only Memory)

Electronic Delay Storage Automatic Calculator (EDSAC), L-3

Electronic Design News Embedded Microprocessor Benchmark Consortium (EEMBC)

benchmark classes, E-12

ISA code size, A-44

kernel suites, E-12

performance benchmarks, 38

power consumption and efficiency metrics, E-13

Electronic Discrete Variable Automatic Computer (EDVAC), L-2 to L-3

Electronic Numerical Integrator and Calculator (ENIAC), L-2 to L-3, L-5 to L-6, L-77

Element group, definition, 272

Embedded multiprocessors, characteristics, E-14 to E-15

Embedded systems

benchmarks

basic considerations, E-12

power consumption and efficiency, E-13

cell phone case study

Nokia circuit board, E-24

overview, E-20

phone block diagram, E-23

phone characteristics, E-22 to E-24

radio receiver, E-23

standards and evolution, E-25

wireless networks, E-21 to E-22

characteristics, 8–9, E-4

as computer class, 5

digital signal processors

definition, E-3

desktop multimedia support, E-11

examples and characteristics, E-6

media extensions, E-10 to E-11

overview, E-5 to E-7

TI TMS320C6x, E-8 to E-10

TI TMS320C6x instruction packet, E-10

TI TMS320C55, E-6 to E-7, E-7 to E-8

TI TMS320C64x, E-9

EEMBC benchmark suite, E-12

overview, E-2

performance, E-13 to E-14

real-time processing, E-3 to E-5

RISC systems

addressing modes, K-6

addressing modes and instruction formats, K-5 to K-6

arithmetic/logical instructions, K-24

conditional branches, K-17

constant extension, K-9

control instructions, K-16

conventions, K-16

data transfer instructions, K-14, K-23

DSP extensions, K-19

examples, K-3, K-4

instruction formats, K-8

multiply-accumulate, K-20

Sanyo digital camera SOC, E-20

Sanyo VPC-SX500 digital camera case study, E-19

Sony PlayStation 2 block diagram, E-16

Sony PlayStation 2 Emotion Engine case study, E-15 to E-18

Sony PlayStation 2 Emotion Engine organization, E-18

EMC, L-80

Emotion Engine

organization modes, E-18

Sony PlayStation 2 case study, E-15 to E-18

empowerTel Networks, MXP processor, E-14

Encoding

control flow instructions, A-18

erasure encoding, 439

instruction set, A-21 to A-24, A-22

Intel 80x86 instructions, K-55, K-58

ISAs, 14, A-5 to A-6

MIPS ISA, A-33

MIPS pipeline, C-36

opcode, A-13

VAX instructions, K-68 to K-70, K-69

VLIW model, 195–196

Encore Multimax, L-59

End-to-end flow control

congestion management, F-65

vs. network-only features, F-94 to F-95

Energy efficiency See also Power consumption

Climate Savers Computing Initiative, 462

embedded benchmarks, E-13

hardware fallacies, 56

ILP exploitation, 201

Intel Core i7, 401–405

ISA, 241–243

microprocessor, 23–26

PMDs, 6

processor performance equation, 52

servers, 25

and speculation, 211–212

system trends, 21–23

WSC, measurement, 450–452

WSC goals/requirements, 433

WSC infrastructure, 447–449

WSC servers, 462–464

Energy proportionality, WSC servers, 462

Engineering Research Associates (ERA), L-4 to L-5

ENIAC (Electronic Numerical Integrator and Calculator), L-2 to L-3, L-5 to L-6, L-77

Enigma coding machine, L-4

Entry time, transactions, D-16, D-17

Environmental faults, storage systems, D-11

EPIC approach

historical background, L-32

IA-64, H-33

VLIW processors, 194, 196

Equal condition code, PowerPC, K-10 to K-11

ERA See Engineering Research Associates (ERA)

Erasure encoding, WSCs, 439

Error-Correcting Code (ECC)

disk storage, D-11

fault detection pitfalls, 58

Fermi GPU architecture, 307

hardware dependability, D-15

memory dependability, 104

RAID 2, D-6

and WSCs, 473–474

Error handling, interconnection networks, F-12

Errors, definition, D-10 to D-11

Escape resource set, F-47

ETA processor, vector processor history, G-26 to G-27

Ethernet

and bandwidth, F-78

commercial interconnection networks, F-63

cross-company interoperability, F-64

interconnection networks, F-89

as LAN, F-77 to F-79

LAN history, F-99

LANs, F-4

packet format, F-75

shared-media networks, F-23

shared- vs. switched-media networks, F-22

storage area network history, F-102

switch vs. NIC, F-86

system area networks, F-100

total time statistics, F-90

WAN history, F-98

Ethernet switches

architecture considerations, 16

Dell servers, 53

Google WSC, 464–465, 469

historical performance milestones, 20

WSCs, 441–444

European Center for Particle Research (CERN), F-98

Even/odd array

example, J-52

integer multiplication, J-52

EVEN-ODD scheme, development, D-10

EX See Execution address cycle (EX)

Example calculations

average memory access time, B-16 to B-17

barrier synchronization, I-15

block size and average memory access time, B-26 to B-28

branch predictors, 164

branch schemes, C-25 to C-26

branch-target buffer branch penalty, 205–206

bundles, H-35 to H-36

cache behavior impact, B-18, B-21

cache hits, B-5

cache misses, 83–84, 93–95

cache organization impact, B-19 to B-20

carry-lookahead adder, J-39

chime approximation, G-2

compiler-based speculation, H-29 to H-31

conditional instructions, H-23 to H-24

CPI and FP, 50–51

credit-based control flow, F-10 to F-11

crossbar switch interconnections, F-31 to F-32

data dependences, H-3 to H-4

DAXPY on VMIPS, G-18 to G-20

dependence analysis, H-7 to H-8

deterministic vs. adaptive routing, F-52 to F-55

dies, 29

die yield, 31

dimension-order routing, F-47 to F-48

disk subsystem failure rates, 48

fault tolerance, F-68

fetch-and-increment barrier, I-20 to I-21

FFT, I-27 to I-29

fixed-point arithmetic, E-5 to E-6

floating-point addition, J-24 to J-25

floating-point square root, 47–48

GCD test, 319, H-7

geometric means, 43–44

hardware-based speculation, 200–201

inclusion, 397

information tables, 176–177

integer multiplication, J-9

interconnecting node costs, F-35

interconnection network latency and effective bandwidth, F-26 to F-28

I/O system utilization, D-26

L1 cache speed, 80

large-scale multiprocessor locks, I-20

large-scale multiprocessor synchronization, I-12 to I-13

loop-carried dependences, 316, H-4 to H-5

loop-level parallelism, 317

loop-level parallelism dependences, 320

loop unrolling, 158–160

MapReduce cost on EC2, 458–460

memory banks, 276

microprocessor dynamic energy/power, 23

MIPS/VMIPS for DAXPY loop, 267–268

miss penalty, B-33 to B-34

miss rates, B-6, B-31 to B-32

miss rates and cache sizes, B-29 to B-30

miss support, 85

M/M/1 model, D-33

MTTF, 34–35

multimedia instruction compiler support, A-31 to A-32

multiplication algorithm, J-19

network effective bandwidth, F-18

network topologies, F-41 to F-43

Ocean application, I-11 to I-12

packet latency, F-14 to F-15

parallel processing, 349–350, I-33 to I-34

pipeline execution rate, C-10 to C-11

pipeline structural hazards, C-14 to C-15

power-performance benchmarks, 439–440

predicated instructions, H-25

processor performance comparison, 218–219

queue I/O requests, D-29

queue waiting time, D-28 to D-29

queuing, D-31

radix-4 SRT division, J-56

redundant power supply reliability, 35

ROB commit, 187

ROB instructions, 189

scoreboarding, C-77

sequential consistency, 393

server costs, 454–455

server power, 463

signed-digit numbers, J-53

signed numbers, J-7

SIMD multimedia instructions, 284–285

single-precision numbers, J-15, J-17

software pipelining, H-13 to H-14

speedup, 47

status tables, 178

strides, 279

TB-80 cluster MTTF, D-41

TB-80 IOPS, D-39 to D-40

torus topology interconnections, F-36 to F-38

true sharing misses and false sharing, 366–367

VAX instructions, K-67

vector memory systems, G-9

vector performance, G-8

vector vs. scalar operation, G-19

vector sequence chimes, 270

VLIW processors, 195

VMIPS vector operation, G-6 to G-7

way selection, 82

write buffer and read misses, B-35 to B-36

write vs. no-write allocate, B-12

WSC memory latency, 445

WSC running service availability, 434–435

WSC server data transfer, 446

Exceptions

ALU instructions, C-4

architecture-specific examples, C-44

categories, C-46

control dependence, 154–155

floating-point arithmetic, J-34 to J-35

hardware-based speculation, 190

imprecise, 169–170, 188

long latency pipelines, C-55

MIPS, C-48, C-48 to C-49

out-of-order completion, 169–170

precise, C-47, C-58 to C-60

preservation via hardware support, H-28 to H-32

return address buffer, 207

ROB instructions, 190

speculative execution, 222

stopping/restarting, C-46 to C-47

types and requirements, C-43 to C-46

Execute step

instruction steps, 174

Itanium 2, H-42

ROB instruction, 186

TI 320C55 DSP, E-7

Execution address cycle (EX)

basic MIPS pipeline, C-36

data hazards requiring stalls, C-21

data hazard stall minimization, C-17

exception stopping/restarting, C-46 to C-47

hazards and forwarding, C-56 to C-57

MIPS FP operations, basic considerations, C-51 to C-53

MIPS pipeline, C-52

MIPS pipeline control, C-36 to C-39

MIPS R4000, C-63 to C-64, C-64

MIPS scoreboarding, C-72, C-74, C-77

out-of-order execution, C-71

pipeline branch issues, C-40, C-42

RISC classic pipeline, C-10

simple MIPS implementation, C-31 to C-32

simple RISC implementation, C-6

Execution time

Amdahl’s law, 46–47, 406

application/OS misses, B-59

cache performance, B-3 to B-4, B-16

calculation, 36

commercial workloads, 369–370, 370

energy efficiency, 211

integrated circuits, 22

loop unrolling, 160

multilevel caches, B-32 to B-34

multiprocessor performance, 405–406

multiprogrammed parallel “make” workload, 375

multithreading, 232

performance equations, B-22

pipelining performance, C-3, C-10 to C-11

PMDs, 6

principle of locality, 45

processor comparisons, 243

processor performance equation, 49, 51

reduction, B-19

second-level cache size, B-34

SPEC benchmarks, 42–44, 43, 56

and stall time, B-21

vector length, G-7

vector mask registers, 276

vector operations, 268–271

Expand-down field, B-53

Explicit operands, ISA classifications, A-3 to A-4

Explicit parallelism, IA-64, H-34 to H-35

Explicit unit stride, GPUs vs. vector architectures, 310

Exponential back-off

large-scale multiprocessor synchronization, I-17

spin lock, I-17

Exponential distribution, definition, D-27

Extended accumulator

flawed architectures, A-44

ISA classification, A-3

Table of Contents for Computer Architecture: A Quantitative Approach
