Index

Symbols/Numbers

* wildcard (SELECT statement), 142–144, 234

2PC (two-phase commit), 138–139

37signals, 10

3PC (three-phase commit), 138

80-20 rule, 10

300 Multiple Choices status code, 77

301 Moved Permanently status code, 77

302 Found status code, 77

303 See Other status code, 77

304 Not Modified status code, 77

305 Use Proxy status code, 78

306 (Unused) status code, 78

307 Temporary Redirect status code, 78

A

ACID properties, 26, 54, 129–130

actions, identifying, 126

aggregating log files, 66–67

“Ajax: A New Approach to Web Applications” (Garrett), 96

Ajax (Asynchronous JavaScript and XML), 95–100, 228

AKF Partners’ D-I-D (Design-Implement-Deploy) approach, 218

deployment, 8–9

design, 7

explained, 6–7

implementation, 8

AKF Scale Cube

explained, 24

illustrated, 24–25

message buses, 185, 188

RFM (recency, frequency, and monetization) analysis, 198

X axis splits, 25–29, 220

Y axis splits, 29–31, 221

Z axis splits, 32–34, 222

Apache

Hadoop, 59

log files, 66

mod_alias module, 79

mod_expires module, 93

mod_rewrite module, 80

OJB, 108

application caches, 103–107, 229

applications, monitoring, 204–208, 241

archiving, 196–200, 240

The Art of Scalability, 23

asterisk (*) wildcard, 142–144, 234

asynchronous communication, 179–180

advantages of, 180

and fault isolation swimlanes, 152–154

asynchronous completion, 76

message buses

overcrowding, 188–190, 239

scaling, 183, 186, 238

Asynchronous JavaScript and XML (Ajax), 95–100, 228

atomicity, 26

automatic markdowns, 164

avoiding overengineering, 2–5, 218

B

backbones, 88

BASE (Basically Available, Soft State, and Eventually Consistent) architecture, 83

BigTable, 56

Boyce-Codd normal form, 132

browsers, maintaining session data in, 171–173, 237

business intelligence, removing from transaction processing, 201–204, 240

business operations, learning from, 116

C

cache misses, 102

Cache-Control headers, 92–93, 98

caching, 87–88

Ajax calls, 95–100, 228

application caches, 103–107, 229

cache misses, 102

CDNs (content delivery networks), 88–90, 227

distributed cache, 173–176, 237

Expires headers, 91–95, 228

Last-Modified headers, 98

object caches, 107–111, 229–230

page caches, 100–103, 228

Cagan, Marty, 10

Cassandra, 56

CDNs (content delivery networks), 88–90, 227

Ceph, 55

checking work, avoiding, 72–76, 226

circuits

in parallel, 160

in series, 158–159

clauses, FOR UPDATE, 140–142, 234

cloning, 25–29, 220

cloud computing, 48–50, 224

clusters, 148–149, 195

Codd, Edgar F., 26, 54

commands, header(), 93

commodity systems, 39–42, 223

communication. See asynchronous communication

competence, 208–210, 242

competitive differentiation, 75

complexity

avoiding, 2–5, 218

reducing, 9–10, 219

design, 11

implementation, 12

scope, 10–11

config file markdowns, 164

consistency, 26

Constraint Satisfaction Problems (CSP), 82

constraints, temporal, 81–84, 227

content delivery networks (CDNs), 88–90, 227

Cost-Value Data Dilemma, 61

CouchDB, 57

Craigslist, 170

CSP (Constraint Satisfaction Problems), 82

customers, learning from, 115–116

D

D-I-D (Design-Implement-Deploy) approach, 218

deployment, 8–9

design, 7

explained, 6–7

implementation, 8

data centers, scaling out, 42–47, 223

data definition language (DDL), 131

databases

ACID properties, 129–130

alternatives to, 55–61

cloning and replication, 25–29, 220

clustering, 195

entities, 131

locks, 134–137, 233

markdowns, 165

multiphase commits, 137–139, 233

normal forms, 132

normalization, 131

optimizers, 136

relationships, 130–133, 232

SELECT statement

FOR UPDATE clause, 140–142, 234

* wildcard, 142–144, 234

when to use, 54–61, 224

dbquery function, 109

DDL (data definition language), 131

decision flowchart for implementing state, 168

deployment, 8–9

design

D-I-D (Design-Implement-Deploy) approach, 7

designing for fault tolerance

series, 158–162, 235

SPOFs (single points of failure), 155–157, 235

swimlanes (fault isolation), 148–154, 234

Wire On/Wire Off frameworks, 162–163, 166, 236

rollback, 120–123, 231

scaling out, 36–39, 222

simplifying, 11

Design-Implement-Deploy (D-I-D) approach, 6–9

directives, 92

disabling services, 163, 166

distributed cache, 173–176, 237

distributing work

cloning and replication, 25–29, 220

explained, 23–24

separating functionality or services, 29–31, 221

splitting similar data sets across storage and application systems, 32–34, 222

DNS lookups, reducing number of, 12–14, 219

document stores, 57

duplicated work, avoiding, 71

avoiding checking work, 72–76, 226

avoiding redirects, 76–81, 226

relaxing temporal constraints, 81–84, 227

duplication of services/databases, 25–29, 220

durability, 26

E

eBay, 170

edge servers, 88

enabling services, 163, 166

enterprise service buses. See message buses

entities, 131

ERDs (entity relationship diagrams), 131

errors in log files, 68

ETag headers, 102–103

Expires headers, 91–95, 98, 228

ExpiresActive module, 93

explicit locks, 135

extensible record stores, 56

extent locks, 135

F

failures, learning from

designing for rollback, 120–123, 231

importance of, 113–116, 230

postmortem process, 123–127, 232

QA (quality assurance), 117–120, 231

SPOFs (single points of failure), 155–157, 235

fault isolation (swimlanes), 26, 148–154, 234

fault tolerance

series, 158–162, 235

SPOFs (single points of failure), 155–157, 235

swimlanes (fault isolation), 148–154, 234

Wire On/Wire Off frameworks, 162–166, 236

fifth normal form, 132

file markdowns, 165

file systems, 55

files, log files, 66–68, 225

aggregating, 66–67

errors in, 68

monitoring, 67

Firesheep, 172

firewalls, 62–65, 225

first normal form, 132

flexibility, 57–58

focus groups, 115

FOR UPDATE clause (SELECT statement), 140–142, 234

foreign keys, 26

fourth normal form, 132

frequency, 198

functionality, separating, 29–31, 221

functions

dbquery, 109

setcookie, 172

G

Garrett, Jesse James, 96

GFS (Google File System), 55

Google

BigTable, 56

GFS (Google File System), 55

MapReduce, 59

H

Hadoop, 59

header() command, 93

headers

Cache-Control, 92–93, 98

ETag, 102–103

Expires, 91–95, 98, 228

Last-Modified, 98

High Reliability Organizations, 124

homogenous networks, 19–20, 220

horizontal scale, 25–29, 220. See also scaling out

HTML meta tags, 91

HTTP (Hypertext Transfer Protocol), 77

headers, 91

Cache-Control, 92–93, 98

ETag, 102–103

Expires, 91–95, 98, 228

Last-Modified, 98

keep-alives, 93

status codes, 77–78

I

implementation

D-I-D (Design-Implement-Deploy) approach, 8

simplifying, 12

implicit locks, 134

International Obfuscated C Code Contest, 5

isolating faults, 26, 148–154, 234

issue identification (postmortems), 125

J-K

java.util.logging, 66

JIT (Just In Time) Scalability, D-I-D approach, 218

deployment, 8–9

design, 7

explained, 6–7

implementation, 8

keep-alives, 93

key-value stores, 56

L

LaPorte, Todd, 124

Last-Modified headers, 98

learning from mistakes

designing for rollback, 120–123, 231

importance of, 113–116, 230

postmortem process, 123–127, 232

QA (quality assurance), 117–120, 231

legal requirements, 75

locks (database), 134–137, 233

log files, 66–68, 225

aggregating, 66–67

errors in, 68

monitoring, 67

Log4j logs, 66

lookups (DNS), reducing number of, 12–14, 219

M

MapReduce, 59

Mark Up/Mark Down functionality, 163, 166

Maslow’s hammer, 53

Maslow, Abraham, 53

master-slave relationship, 28

max-age directive, 92

mean time to failure (MTTF), 73

Memcached, 56, 108

memory caching. See caching

message buses

overcrowding, 188–190, 239

scaling, 183, 186, 238

meta tags, 91

minimum viable product, 10

mistakes, learning from

designing for rollback, 120–123, 231

importance of, 113–116, 230

postmortem process, 123–127, 232

QA (quality assurance), 117–120, 231

mod_alias module, 79

mod_expires module, 93

mod_rewrite module, 80

MogileFS, 55

monetization, 198

monitoring, 67, 204–208, 241

Moore’s Law, 39

Moore, Gordon, 39

MTTF (mean time to failure), 73

multiphase commits, 137–139, 233

multiple live sites, 47–48

multiplicity effect, 161

N

NCache, 108

networks

CDNs (content delivery networks), 88–90, 227

homogenous networks, 19–20, 220

no-cache directive, 92

nodes, 88

Normal Accident Theory, 124

normal forms, 132

normalization, 131

NoSQL, 56

O

object caches, 107–111, 229–230

objects

object caches, 107–111, 229

reducing number of, 16–19, 220

XMLHttpRequest, 96

OJB, 108

OLTP (On Line Transactional Processing), 26, 54

on-demand enabling/disabling of services, 163, 166

optimizers, 136

overcrowding message buses, 188–190, 239

overengineering, avoiding, 2–5, 218

P

page caches, 100–103, 228

page locks, 135

Pareto Principle, 10

Perrow, Charles, 124

PNUTS, 57

pods, 32–34, 148–149

pools, 148–149

postmortem process, 123–127, 232

PRG (Post/Redirect/Get), 77

private directive, 92

public directive, 92

purging storage, 196–200, 240

Q-R

QA (quality assurance), 117–120, 231

RDBMSs (Relational Database Management Systems), 26

alternatives to, 55–61

when to use, 54–61, 224

recency, frequency, and monetization (RFM) analysis, 197–200

redirects, avoiding, 76–81, 226

reducing

complexity, 9–10, 219

design, 11

implementation, 12

scope, 10–11

DNS lookups, 12–14, 219

objects, 16–19, 220

regulatory requirements, 75

Ries, Eric, 10

Relational Database Management Systems. See RDBMSs

“A Relational Model of Data for Large Shared Data Banks” (Codd), 26, 54

relationships, 57–58, 130–133, 232

relaxing temporal constraints, 81–84, 227

replication of services/databases, 25–29, 220

reverse proxy cache, 101, 103

reverse proxy servers, 101, 103

RFM (recency, frequency, and monetization) analysis, 197–200

risk management

firewalls, 62–65, 225

risk-benefit model, 213–218

rolling back code, 120–123, 231

row locks, 135

runtime variables, 165

S

SaaS (Software as a Service) solution, 13

scaling out, 222

cloud computing, 48–50, 224

commodity systems, 39–42, 223

data centers, 42–47, 223

defined, 36

design, 36–39, 222

multiple live sites, 47–48

scaling up, 36

scope

scope creep, 3

simplifying, 10–11

second normal form, 132

Secure Socket Layer (SSL), 173

security

firewalls, 62–65, 225

sidejacking, 172

SSL (Secure Socket Layer), 173

SELECT statement

* wildcard, 142–144, 234

FOR UPDATE clause, 140–142, 234

separating functionality or services, 29–31, 221

series, 158–162, 235

servers

edge servers, 88

page caches, 100–103, 228

services

cloning and replication, 25–29, 220

enabling/disabling on demand, 163, 166

scale through, 32–34, 222

separating, 29–31, 221

session data, maintaining in browser, 171–173, 237

setcookie function, 172

shards, 32–34, 148–149

sidejacking, 172

simple solutions, 9–10, 219

design, 11

implementation, 12

importance of, 2–5, 218

scope, 10–11

SimpleDB, 57

single points of failure (SPOFs), 155–157, 235

singleton antipattern, 155

singletons, 155

sixth normal form, 132

social construction, 115

social contagion, 114

Software as a Service (SaaS) solution, 13

solutions

importance of simple solutions, 2–5, 218

overengineering, 2–5, 218

simplifying, 9–10, 219

design, 11

implementation, 12

scope, 10–11

spinning up, 49

splits

of message bus, 183–188, 238

of similar data sets across storage and application systems, 32–34, 222

X axis splits (AKF Scale Cube), 25–29, 220

Y axis splits (AKF Scale Cube), 29–31, 221

Z axis splits (AKF Scale Cube), 32–34, 222

SPOFs (single points of failure), 155–157, 235

SSL (Secure Socket Layer), 173

stand-in services, 164

state, 167–168

decision flowchart for implementing state, 168

distributed cache, 173–176, 237

session data, maintaining in browser, 171–173, 237

statelessness, 168–171, 236

statelessness, 43, 168–171, 236

statements, SELECT

* wildcard, 142–144, 234

FOR UPDATE clause, 140–142, 234

status codes (HTTP), 77–78

storage

archiving, 196–200, 240

databases. See databases

document stores, 57

extensible record stores, 56

file systems, 55

Hadoop, 59

key-value stores, 56

MapReduce, 59

NoSQL, 56

purging, 196–200, 240

RFM (recency, frequency, and monetization) analysis, 197–200

scalability versus flexibility, 57–58

swimlanes (fault isolation), 148–154, 234

synchronous markdown commands, 164

SystemErr logs, 66

SystemOut logs, 66

T

table locks, 135

tags, meta tags, 91

TCSP (Temporal Constraint Satisfaction Problem), 82

temporal constraints, relaxing, 81–84, 227

third normal form, 26, 132

third-party scaling products, 193, 195–196, 239

Three Mile Island nuclear accident, 124

three-phase commit (3PC), 138

timelines, 125

Tokyo Tyrant, 56

Tomcat log files, 66

traffic redirection, avoiding, 76–81, 226

transactions

multiphase commits, 137–139, 233

removing business intelligence from transaction processing, 201–204, 240

two-phase commit (2PC), 138–139

U-V

usefulness, 2

vendor scaling products, 193–196, 239

viral growth, 114

virtualization, 41, 154

Voldemort, 56

W

webpagetest.org, 94

WebSphere log files, 66

wildcards, * (asterisk), 142–144, 234

Wire On/Wire Off frameworks, 162–163, 166, 236

work distribution

cloning and replication, 25–29, 220

explained, 23–24

separating functionality or services, 29–31, 221

splitting similar data sets across storage and application systems, 32–34, 222

X-Y-Z

X axis splits (AKF Scale Cube), 25–29, 220

XMLHttpRequest object, 96

Y axis splits (AKF Scale Cube), 29–31, 221

Z axis splits (AKF Scale Cube), 32–34, 222
