* wildcard (SELECT statement), 142–144, 234
2PC (two-phase commit), 138–139
37signals, 10
3PC (three-phase commit), 138
80-20 rule, 10
300 Multiple Choices status code, 77
301 Moved Permanently status code, 77
302 Found status code, 77
303 See Other status code, 77
304 Not Modified status code, 77
305 Use Proxy status code, 78
306 (Unused) status code, 78
307 Temporary Redirect status code, 78
ACID properties, 26, 54, 129–130
actions, identifying, 126
aggregating log files, 66–67
“Ajax: A New Approach to Web Applications” (Garrett), 96
Ajax (Asynchronous JavaScript and XML), 95–100, 228
AKF Partners’ D-I-D (Design-Implement-Deploy) approach, 218
deployment, 8–9
design, 7
explained, 6–7
implementation, 8
AKF Scale Cube
explained, 24
illustrated, 24–25
RFM (recency, frequency, and monetization) analysis, 198
Apache
Hadoop, 59
log files, 66
mod_alias module, 79
mod_expires module, 93
mod_rewrite module, 80
OJB, 108
application caches, 103–107, 229
applications, monitoring, 204–208, 241
The Art of Scalability, 23
asterisk (*) wildcard, 142–144, 234
asynchronous communication, 179–180
advantages of, 180
and fault isolation swimlanes, 152–154
asynchronous completion, 76
message buses
Asynchronous JavaScript and XML (Ajax), 95–100, 228
atomicity, 26
automatic markdowns, 164
avoiding overengineering, 2–5, 218
backbones, 88
BASE (Basically Available, Soft State, and Eventually Consistent) architecture, 83
BigTable, 56
Boyce-Codd normal form, 132
browsers, maintaining session data in, 171–173, 237
business intelligence, removing from transaction processing, 201–204, 240
business operations, learning from, 116
cache misses, 102
Cache-Control headers, 92–93, 98
caching, 87–88
application caches, 103–107, 229
cache misses, 102
CDNs (content delivery networks), 88–90, 227
distributed cache, 173–176, 237
Last-Modified headers, 98
object caches, 107–111, 229–230
Cagan, Marty, 10
Cassandra, 56
CDNs (content delivery networks), 88–90, 227
Ceph, 55
checking work, avoiding, 72–76, 226
circuits
in parallel, 160
in series, 158–159
clauses, FOR UPDATE, 140–142, 234
commands, header(), 93
communication. See asynchronous communication
competitive differentiation, 75
complexity
design, 11
implementation, 12
scope, 10–11
config file markdowns, 164
consistency, 26
Constraint Satisfaction Problems (CSP), 82
constraints, temporal, 81–84, 227
content delivery networks (CDNs), 88–90, 227
Cost-Value Data Dilemma, 61
CouchDB, 57
Craigslist, 170
CSP (Constraint Satisfaction Problems), 82
customers, learning from, 115–116
D-I-D (Design-Implement-Deploy) approach, 218
deployment, 8–9
design, 7
explained, 6–7
implementation, 8
data centers, scaling out, 42–47, 223
data definition language (DDL), 131
databases
ACID properties, 129–130
alternatives to, 55–61
cloning and replication, 25–29, 220
clustering, 195
entities, 131
markdowns, 165
multiphase commits, 137–139, 233
normal forms, 132
normalization, 131
optimizers, 136
SELECT statement
FOR UPDATE clause, 140–142, 234
dbquery function, 109
DDL (data definition language), 131
decision flowchart for implementing state, 168
deployment, 8–9
design
D-I-D (Design-Implement-Deploy) approach, 7
designing for fault tolerance
SPOFs (single points of failure), 155–157, 235
swimlanes (fault isolation), 148–154, 234
Wire On/Wire Off frameworks, 162–163, 166, 236
simplifying, 11
Design-Implement-Deploy (D-I-D) approach, 6–9
directives, 92
distributed cache, 173–176, 237
distributing work
cloning and replication, 25–29, 220
explained, 23–24
separating functionality or services, 29–31, 221
splitting similar data sets across storage and application systems, 32–34, 222
DNS lookups, reducing number of, 12–14, 219
document stores, 57
duplicated work, avoiding, 71
avoiding checking work, 72–76, 226
avoiding redirects, 76–81, 226
relaxing temporal constraints, 81–84, 227
duplication of services/databases, 25–29, 220
durability, 26
eBay, 170
edge servers, 88
enterprise service buses. See message buses
entities, 131
ERDs (entity relationship diagrams), 131
errors in log files, 68
ETag headers, 102–103
Expires headers, 91–95, 98, 228
ExpiresActive directive, 93
explicit locks, 135
extensible record stores, 56
extent locks, 135
failures, learning from
designing for rollback, 120–123, 231
postmortem process, 123–127, 232
QA (quality assurance), 117–120, 231
SPOFs (single points of failure), 155–157, 235
fault isolation (swimlanes), 26, 148–154, 234
fault tolerance
SPOFs (single points of failure), 155–157, 235
swimlanes (fault isolation), 148–154, 234
Wire On/Wire Off frameworks, 162–166, 236
fifth normal form, 132
file markdowns, 165
file systems, 55
files, log
aggregating, 66–67
errors in, 68
monitoring, 67
Firesheep, 172
first normal form, 132
flexibility, 57–58
focus groups, 115
FOR UPDATE clause (SELECT statement), 140–142, 234
foreign keys, 26
fourth normal form, 132
frequency, 198
functionality, separating, 29–31, 221
functions
dbquery, 109
setcookie, 172
Garrett, Jesse James, 96
GFS (Google File System), 55
Google
BigTable, 56
GFS (Google File System), 55
MapReduce, 59
Hadoop, 59
header() command, 93
headers
ETag, 102–103
Last-Modified, 98
High Reliability Organizations, 124
homogenous networks, 19–20, 220
horizontal scale, 25–29, 220. See also scaling out
HTML meta tags, 91
HTTP (Hypertext Transfer Protocol), 77
headers, 91
ETag, 102–103
Last-Modified, 98
keep-alives, 93
status codes, 77–78
implementation
D-I-D (Design-Implement-Deploy) approach, 8
simplifying, 12
implicit locks, 134
International Obfuscated C Code Contest, 5
isolating faults, 26, 148–154, 234
issue identification (postmortems), 125
java.util.logging, 66
JIT (Just In Time) Scalability, D-I-D approach, 218
deployment, 8–9
design, 7
explained, 6–7
implementation, 8
keep-alives, 93
key-value stores, 56
LaPorte, Todd, 124
Last-Modified headers, 98
learning from mistakes
designing for rollback, 120–123, 231
postmortem process, 123–127, 232
QA (quality assurance), 117–120, 231
legal requirements, 75
locks (database), 134–137, 233
log files
aggregating, 66–67
errors in, 68
monitoring, 67
Log4j logs, 66
lookups (DNS), reducing number of, 12–14, 219
MapReduce, 59
Mark Up/Mark Down functionality, 163, 166
Maslow’s hammer, 53
Maslow, Abraham, 53
master-slave relationship, 28
max-age directive, 92
mean time to failure (MTTF), 73
memory caching. See caching
message buses
meta tags, 91
minimum viable product, 10
mistakes, learning from
designing for rollback, 120–123, 231
postmortem process, 123–127, 232
QA (quality assurance), 117–120, 231
mod_alias module, 79
mod_expires module, 93
mod_rewrite module, 80
MogileFS, 55
monetization, 198
Moore’s Law, 39
Moore, Gordon, 39
MTTF (mean time to failure), 73
multiphase commits, 137–139, 233
multiple live sites, 47–48
multiplicity effect, 161
NCache, 108
networks
CDNs (content delivery networks), 88–90, 227
homogenous networks, 19–20, 220
no-cache directive, 92
nodes, 88
Normal Accident Theory, 124
normal forms, 132
normalization, 131
NoSQL, 56
object caches, 107–111, 229–230
objects
reducing number of, 16–19, 220
XMLHttpRequest, 96
OJB, 108
OLTP (On Line Transactional Processing), 26, 54
on-demand enabling/disabling of services, 163, 166
optimizers, 136
overcrowding message buses, 188–191, 239
overengineering, avoiding, 2–5, 218
page locks, 135
Pareto Principle, 10
Perrow, Charles, 124
PNUTS, 57
pools, 148–149
postmortem process, 123–127, 232
PRG (Post/Redirect/Get), 77
private directive, 92
public directive, 92
QA (quality assurance), 117–120, 231
RDBMSs (Relational Database Management Systems), 26
alternatives to, 55–61
recency, frequency, and monetization (RFM) analysis, 197–200
redirects, avoiding, 76–81, 226
reducing
design, 11
implementation, 12
scope, 10–11
regulatory requirements, 75
Ries, Eric, 10
Relational Database Management Systems. See RDBMSs
“A Relational Model of Data for Large Shared Data Banks” (Codd), 26, 54
relationships, 57–58, 130–133, 232
relaxing temporal constraints, 81–84, 227
replication of services/databases, 25–29, 220
reverse proxy servers, 101, 103
RFM (recency, frequency, and monetization) analysis, 197–200
risk management
risk-benefit model, 213–218
rolling back code, 120–123, 231
row locks, 135
runtime variables, 165
SaaS (Software as a Service) solution, 13
scaling out, 222
defined, 36
multiple live sites, 47–48
scaling up, 36
scope
scope creep, 3
simplifying, 10–11
second normal form, 132
Secure Socket Layer (SSL), 173
security
sidejacking, 172
SSL (Secure Socket Layer), 173
SELECT statement
FOR UPDATE clause, 140–142, 234
separating functionality or services, 29–31, 221
servers
edge servers, 88
services
cloning and replication, 25–29, 220
enabling/disabling on demand, 163, 166
session data, maintaining in browser, 171–173, 237
setcookie function, 172
sidejacking, 172
design, 11
implementation, 12
scope, 10–11
SimpleDB, 57
single points of failure (SPOFs), 155–157, 235
singleton antipattern, 155
singletons, 155
sixth normal form, 132
social construction, 115
social contagion, 114
Software as a Service (SaaS) solution, 13
solutions
importance of simple solutions, 2–5, 218
design, 11
implementation, 12
scope, 10–11
spinning up, 49
splits
of similar data sets across storage and application systems, 32–34, 222
X axis splits (AKF Scale Cube), 25–29, 220
Y axis splits (AKF Scale Cube), 29–31, 221
Z axis splits (AKF Scale Cube), 32–34, 222
SPOFs (single points of failure), 155–157, 235
SSL (Secure Socket Layer), 173
stand-in services, 164
state, 167–168
decision flowchart for implementing state, 168
distributed cache, 173–176, 237
session data, maintaining in browser, 171–173, 237
statelessness, 43, 168–171, 236
statements, SELECT
FOR UPDATE clause, 140–142, 234
status codes (HTTP), 77–78
storage
databases. See databases
document stores, 57
extensible record stores, 56
file systems, 55
Hadoop, 59
key-value stores, 56
MapReduce, 59
NoSQL, 56
RFM (recency, frequency, and monetization) analysis, 197–200
scalability versus flexibility, 57–58
swimlanes (fault isolation), 148–154, 234
synchronous markdown commands, 164
SystemErr logs, 66
SystemOut logs, 66
table locks, 135
tags, meta tags, 91
TCSP (Temporal Constraint Satisfaction Problem), 82
temporal constraints, relaxing, 81–84, 227
third-party scaling products, 193, 195–196, 239
Three Mile Island nuclear accident, 124
three-phase commit (3PC), 138
timelines, 125
Tokyo Tyrant, 56
Tomcat log files, 66
traffic redirection, avoiding, 76–81, 226
transactions
multiphase commits, 137–139, 233
removing business intelligence from transaction processing, 201–204, 240
two-phase commit (2PC), 138–139
usefulness, 2
vendor scaling products, 193–196, 239
viral growth, 114
Voldemort, 56
webpagetest.org, 94
WebSphere log files, 66
wildcards, * (asterisk), 142–144, 234
Wire On/Wire Off frameworks, 162–163, 166, 236
work distribution
cloning and replication, 25–29, 220
explained, 23–24
separating functionality or services, 29–31, 221
splitting similar data sets across storage and application systems, 32–34, 222
X axis splits (AKF Scale Cube), 25–29, 220
XMLHttpRequest object, 96