Index
A
- acceptable availability, What’s Reasonable?
- alerts, Focus #5: Respond to Availability Issues in a Predictable and Defined Way, How Many and Which Internal SLAs?
- allocated-capacity resources, Allocated-Capacity Resource Allocation-Reserved Capacity
- Amazon
- Amazon API Gateway, Mobile Backend
- Amazon DynamoDB, Allocated-Capacity Resource Allocation
- Amazon EC2, Raw Resource
- Amazon Elastic Load Balancer (ELB), Changing Allocations
- Amazon Kinesis, Internet of Things Data Intake
- Amazon S3, Managed Resource (Non-server-based)
- Amazon Web Services (see AWS)
- API contracts, The Ownership Benefit
- API Gateway, Mobile Backend
- application availability (see availability)
- applications
- automated change management, Automate Your Manual Processes-Automated Change Sanity Testing
- automation
- AutoScaling, Changing Allocations
- availability, Availability
- acceptable, What’s Reasonable?
- automation of manual processes, Automate Your Manual Processes-Automated Change Sanity Testing
- (see also automated change management)
- basics, What Is Availability?-What Causes Poor Availability?
- before implementing scaling, Examine Your Application Regularly
- building applications with failure in mind, Focus #1: Build with Failure in Mind
- building applications with scaling in mind, Focus #2: Always Think About Scaling
- causes of poor, What Causes Poor Availability?
- defined, Availability Versus Reliability
- expressed as percentage, Measuring Availability
- factoring maintenance windows into, Don’t Be Fooled
- icon failure example, Five Focuses to Improve Application Availability
- improvement after slippage in, Improving Your Availability When It Slips-Keeping on Top of Availability
- improvement techniques, Five Focuses to Improve Application Availability-Being Prepared
- maintaining AWS location diversity for, Maintaining Location Diversity for Availability Reasons
- measuring, Measuring Availability-Availability by the Numbers
- measuring/tracking current percentage, Measure and Track Your Current Availability
- monitoring as feature of application design, Focus #4: Monitor Availability
- predictable/defined response to problems, Focus #4: Monitor Availability
- reasonable number for, What’s Reasonable?
- reliability vs., Availability Versus Reliability
- risk mitigation in design, Focus #3: Mitigate Risk-Focus #3: Mitigate Risk
- SLAs and, What are Service-Level Agreements?
- system improvement, Improve Your Systems
- the nines, The Nines
- availability percentage, Measure and Track Your Current Availability
- availability pool, The “Magic” of Usage-Based Resource Allocation
- Availability Zones (AZs), AWS Availability Zone
- AWS (Amazon Web Services)
- API Gateway, Mobile Backend
- architecture, AWS Architecture-Architecture Overview
- AutoScaling, Changing Allocations
- Availability Zones (see Availability Zones)
- data centers, Data Center, Availability Zones Are Not Data Centers-Availability Zones Are Not Data Centers
- DynamoDB, Allocated-Capacity Resource Allocation
- EC2 (see Amazon EC2)
- ecosystem terms, AWS Architecture-Data Center
- Elastic Load Balancer, Changing Allocations
- Kinesis, Internet of Things Data Intake
- Lambda (see AWS Lambda)
- maintaining location diversity for availability reasons, Maintaining Location Diversity for Availability Reasons
- overview, Architecture Overview-Architecture Overview
- Regions (see AWS Regions)
- S3 (see Amazon S3)
- SLAs, What are Service-Level Agreements?
- AWS Lambda, AWS Lambda-Advantages and Disadvantages of Lambda
- AWS Regions, AWS Region, Architecture Overview
C
- call latency, Performance Measurements for SLAs-Latency Groups
- capabilities, shared, Guideline #4: Shared Capabilities/Data
- capacity units, Allocated-Capacity Resource Allocation
- cascading service failures, Cascading Service Failures
- Chaos Monkey, Concerns with Running Game Days in Production
- circuit breakers, Focus #1: Build with Failure in Mind, Determining Failures
- cloud-based servers, Cloud-Based Servers
- cloud-based services, Cloud
- allocated-capacity resource allocation, Allocated-Capacity Resource Allocation-Reserved Capacity, The Pros and Cons of Resource Allocation Techniques
- application management, Greater Focus on the Application
- AWS architecture, AWS Architecture-Architecture Overview
- AWS Availability Zones, AWS Availability Zone, Availability Zones Are Not Data Centers-Availability Zones Are Not Data Centers
- (see also Availability Zones (AZs))
- AWS Lambda, Microcompute, AWS Lambda-Advantages and Disadvantages of Lambda
- AWS Region, AWS Region
- changes in, Change and the Cloud-Change Continues
- choosing scalable computing options, Now What?
- CloudWatch, Monitoring and CloudWatch
- compute slices, Compute Slices
- data centers, Data Center, Availability Zones Are Not Data Centers-Availability Zones Are Not Data Centers
- distributing applications across, Distributing the Cloud-Maintaining Location Diversity for Availability Reasons
- dynamic containers, Dynamic Containers
- implications of managed service, Implications of Using Managed Resources
- implications of non-managed service, Implications of Using Non-Managed Resources
- maintaining location diversity for availability reasons, Maintaining Location Diversity for Availability Reasons
- managed infrastructure, Managed Infrastructure-Monitoring and CloudWatch
- micro startups, The Micro Startup
- microcomputing, Microcompute
- microservice-based architectures, Acceptance of Microservice-Based Architectures
- monitoring, Monitoring and CloudWatch
- non-server-based managed resource, Managed Resource (Non-server-based)
- raw resource, Raw Resource-Raw Resource
- resource allocation, Cloud Resource Allocation-The Pros and Cons of Resource Allocation Techniques
- scalable computing options, Scalable Computing Options-Now What?
- security improvements, Security and Compliance Has Matured
- server-based managed resource, Managed Resource (Server-Based)
- servers, Cloud-Based Servers
- small/specialized services on, Smaller, More Specialized Services
- structure of, Structure of Cloud-Based Services-Managed Resource (Non-server-based)
- usage-based resource allocation, Usage-Based Resource Allocation-The Pros and Cons of Resource Allocation Techniques
- CloudWatch
- complexity
- compute slices, Compute Slices
- configuration management
- containers, dynamic, Dynamic Containers
- content delivery networks (CDNs), Focus #2: Always Think About Scaling
- content, dynamic vs. static, Focus #2: Always Think About Scaling
- continuous improvement, Continuous Improvement-The Importance of Continuous Improvement
- contracts, What are Service-Level Agreements?
- credit card data, Guideline #1: Specific Business Requirements, Separate team for security reasons
- critical dependency, Critical Dependency
- customers, service failures caused by, Customer-Caused Problems
D
- dashboards, How Many and Which Internal SLAs?
- data
- data centers, Data Center
- data partitioning, Data Partitioning-Data Partitioning
- death spiral, Preface
- Denial of Service attacks, Focus #1: Build with Failure in Mind
- dependencies
- dependency failure, Five Focuses to Improve Application Availability
- deploys, automated, Automated Deploys
- disaster recovery plans, Disaster Recovery Plans
- distributed ownership, Service Ownership
- documentation, Configuration Management
- downtime
- drift, Automate Your Manual Processes
- dynamic containers, Dynamic Containers
- dynamic content, Focus #2: Always Think About Scaling
- DynamoDB, Allocated-Capacity Resource Allocation
M
- maintenance windows, Don’t Be Fooled
- managed infrastructure, Managed Infrastructure-Monitoring and CloudWatch
- managed resource (non-server-based), Managed Resource (Non-server-based)
- managed resource (server-based), Managed Resource (Server-Based)
- management, sharing risk matrix with, Maintaining the Risk Matrix
- micro startups, The Micro Startup
- microcomputing, Microcompute
- microservices, Using Microservices-The Right Balance
- mitigation plans, Focus #3: Mitigate Risk, Mitigation Plan, Risk Mitigation
- (see also risk mitigation)
- mobile backend, AWS Lambda application, Mobile Backend
- monitoring
- monolithic applications, services vs., The Monolith Application
N
- Netflix, Concerns with Running Game Days in Production
- Network Access Control List (ACL), Raw Resource
- New Relic, Why I Wrote This Book, Focus #4: Monitor Availability
- nines, the, The Nines, Self-Repair
- node failures
- non-managed service, Implications of Using Non-Managed Resources
- non-server-based managed resource, Managed Resource (Non-server-based)
- noncritical dependency, Noncritical Dependency
O
- operational processes, automation of, Operational Processes
- ownership (see service ownership; Single Team Owned Service Architecture; team ownership)
P
- parallel systems, Redundancy Improvements That Increase Complexity
- partitioning, Data Partitioning-Data Partitioning
- partitioning key, Data Partitioning-Data Partitioning
- payment processing, Guideline #1: Specific Business Requirements
- planning, risk matrix for, Using the Risk Matrix for Planning
- predictable responses, Focus #4: Monitor Availability, Predictable Response, Determining Failures
- problem diagnosis, SLAs for, SLAs for Problem Diagnosis
- production environment
R
- random failures, Concerns with Running Game Days in Production
- raw cloud resource, Raw Resource-Raw Resource
- RDS (Relational Database Service), Managed Resource (Server-Based)
- reasonable responses, Reasonable Response
- rebooting, Operational Processes
- recovery plans, Recovery Plans
- reduced functionality, Graceful Degradation
- redundancy, Redundancy
- reliability
- repartitioning, Data Partitioning-Data Partitioning
- repeatable tasks, benefits of, Automate Your Manual Processes
- reserved capacity, Reserved Capacity
- resource allocation (cloud resources), Cloud Resource Allocation-The Pros and Cons of Resource Allocation Techniques
- resource exhaustion, What Causes Poor Availability?
- responses
- responsiveness, service tiers and, Responsiveness-Responsiveness
- risk
- high likelihood, high severity, T-Shirt Photos: High Likelihood, High Severity Risk
- high likelihood, low severity, Custom Fonts: High Likelihood, Low Severity Risk
- likelihood component, Likelihood Versus Severity
- likelihood vs. severity, Likelihood Versus Severity-T-Shirt Photos: High Likelihood, High Severity Risk
- low likelihood, high severity, The Order Database: Low Likelihood, High Severity Risk
- low likelihood, low severity, The Top 10 List: Low Likelihood, Low Severity Risk
- severity component, Likelihood Versus Severity
- significance of a, Likelihood Versus Severity
- risk management, What Is Risk Management?-Managing Risk Summary, Risk Management
- addressing worst offenders, Remove Worst Offenders
- before implementing scaling, Examine Your Application Regularly
- decisions involved in, Managing Risk
- high likelihood, high severity risk, T-Shirt Photos: High Likelihood, High Severity Risk
- high likelihood, low severity risk, Custom Fonts: High Likelihood, Low Severity Risk
- identifying risks, Identify Risk
- likelihood vs. severity in, Likelihood Versus Severity-T-Shirt Photos: High Likelihood, High Severity Risk
- low likelihood, high severity risk, The Order Database: Low Likelihood, High Severity Risk
- low likelihood, low severity risk, The Top 10 List: Low Likelihood, Low Severity Risk
- risk matrix for, The Risk Matrix-Maintaining the Risk Matrix
- risk matrix reviews, Review Regularly
- risk mitigation vs., Risk Mitigation
- risk mitigators, Focus #3: Mitigate Risk-Focus #3: Mitigate Risk, Mitigate
- two mistakes high design method, Two Mistakes High-The Space Shuttle
- risk matrix, Identify Risk, The Risk Matrix-Maintaining the Risk Matrix
- brainstorming list of risks, Brainstorming the List
- creating, Creating the Risk Matrix-Triggered Plan
- filling in details on, Risk Item Details
- ideas for input, Identify Risk
- information kept in, The Risk Matrix
- maintaining, Maintaining the Risk Matrix
- mitigation column, Risk Mitigation
- mitigation plan, Mitigation Plan
- reviewing regularly, Improve Your Systems, Review Regularly
- rotating reviewers, Maintaining the Risk Matrix
- scope of, Scope of the Risk Matrix
- setting likelihood and severity fields, Set the Likelihood and Severity Fields
- sharing with management, Maintaining the Risk Matrix
- template for, Creating the Risk Matrix
- triggered plan, Triggered Plan
- using for planning, Using the Risk Matrix for Planning
- risk mitigation, Risk Mitigation-Improving Our Risk Situation
- application design, Focus #3: Mitigate Risk-Focus #3: Mitigate Risk
- building systems with reduced risk, Building Systems with Reduced Risk-Operational Processes
- disaster recovery plans, Disaster Recovery Plans
- game days for testing, Game Days-Game Day Testing
- idempotent interfaces, Redundancy
- improvements that increase complexity, Redundancy Improvements That Increase Complexity
- independence, Independence
- mitigation plan, Mitigation Plan
- operational process automation, Operational Processes
- recovery plans, Recovery Plans
- redundancy, Redundancy
- risk management vs., Risk Mitigation
- security, Security
- self-repair, Self-Repair-Self-Repair
- simplicity, Simplicity
- triggered plan, Triggered Plan
- two mistakes high design method, Two Mistakes High-The Space Shuttle
- web-based T-shirt store example, Focus #3: Mitigate Risk
- risk mitigators, Mitigate
- rollback, Change Experiments and High Frequency Changes
- rolling deploy, Problems During Upgrades
S
- S3 (see Amazon S3)
- sanity test, Automated Change Sanity Testing
- scaling, Scaling
- security
- self-repairing processes, Self-Repair-Self-Repair
- server-based managed resource, Managed Resource (Server-Based)
- servers
- service boundaries, What Should Be a Service?-Mixed Reasons
- service failures, Dealing with Service Failures-Provide service limits
- appropriate action for, Appropriate Action-Provide service limits
- cascading, Cascading Service Failures
- catching responses that never arrive, Determining Failures
- customer-caused problems, Customer-Caused Problems
- determining, Determining Failures-Determining Failures
- failure loops, Failure Loops
- graceful backoff, Graceful Backoff
- graceful degradation, Graceful Degradation
- hidden shared failure types, Hidden Shared Failure Types
- importance of failing early, Fail as Early as Possible
- predictable response, Predictable Response
- reasonable response, Reasonable Response
- reduced functionality, Graceful Degradation
- responding to, Responding to a Service Failure-Reasonable Response
- service limits, Provide service limits
- understandable response to, Understandable Response
- service limits, Provide service limits
- service ownership, Service Ownership-What Does it Mean to Be a Service Owner?
- service tiers
- application complexity and, Application Complexity-Application Complexity
- assigning tier labels to services, Assigning Service Tier Labels to Services-Tier 4
- basics, Service Tiers-Example: Online Store
- critical dependency, Critical Dependency
- defined, Measure and Track Your Current Availability, What Are Service Tiers?
- dependencies and, Dependencies-Noncritical Dependency
- expectations and, Expectations
- noncritical dependencies and, Noncritical Dependency
- online store example, Example: Online Store-Example: Online Store
- responsiveness and, Responsiveness-Responsiveness
- tier 1, Tier 1
- tier 2, Tier 2
- tier 3, Tier 3
- tier 4, Tier 4
- using, Using Service Tiers-Summary
- Service-Level Agreements (see SLAs)
- services, Services
- balance in number of, The Right Balance
- benefits, The Service-Based Application
- creating excessive boundaries, Going Too Far
- criteria for, Using Microservices
- defined, Services and Microservices
- failure of (see service failures)
- guidelines for separating applications into, What Should Be a Service?-Mixed Reasons
- monolithic application vs., The Monolith Application
- preparing for scaling, Microservices
- reasons for using, Why Use Services?-The Scaling Benefit
- scaling benefits, The Scaling Benefit
- service boundaries for, What Should Be a Service?-Mixed Reasons
- service-based application, The Service-Based Application-The Service-Based Application
- stateless, Stateless Services
- using, Using Microservices-The Right Balance
- severity, risk
- shared capabilities
- significance of a risk, Likelihood Versus Severity
- simplicity, risk mitigation and, Simplicity
- Single Team Owned Service Architecture (STOSA), Service Ownership-What Does it Mean to Be a Service Owner?
- SLAs (Service-Level Agreements), The Ownership Benefit, Service-Level Agreements-Additional Comments on SLAs
- basics, What are Service-Level Agreements?-What are Service-Level Agreements?
- building trust with, SLAs as Trust
- defined, What are Service-Level Agreements?
- determining number and type of, How Many and Which Internal SLAs?
- external vs. internal, External Versus Internal SLAs
- latency groups, Latency Groups
- limit SLAs, Limit SLAs
- performance measurements for, Performance Measurements for SLAs-Latency Groups
- problem diagnosis and, SLAs for Problem Diagnosis
- Top Percentile SLAs, Top Percentile SLAs-Top Percentile SLAs
- slippage, improving availability after, Improving Your Availability When It Slips-Keeping on Top of Availability
- slow dependencies
- Software as a Service (SaaS), A Word on Scale Today
- space shuttle program, The Space Shuttle
- staging environment, testing recovery plans in, Staging Versus Production Environments
- startups, cloud and, The Micro Startup
- stateless services, Stateless Services
- static content, Focus #2: Always Think About Scaling
- STOSA-based application, Single Team Owned Service Architecture
- STOSA-based organization, Single Team Owned Service Architecture
- support manuals, Focus #5: Respond to Availability Issues in a Predictable and Defined Way
T
- team ownership
- teams
- technical debt, What Causes Poor Availability?, Brainstorming the List
- test environment, testing recovery plans in, Staging Versus Production Environments
- testing, recovery plan (see game days)
- throughput capacity units, Allocated-Capacity Resource Allocation
- tier 1 services, Tier 1
- tier 2 services, Tier 2
- tier 3 services, Tier 3
- tier 4 services, Tier 4
- Top Percentile SLAs, Top Percentile SLAs-Top Percentile SLAs
- traffic volume, SLAs and, Performance Measurements for SLAs
- triggered plans, Triggered Plan
- trust, SLAs and, SLAs as Trust
- two mistakes high design method, Two Mistakes High-The Space Shuttle
- application management, Managing Your Applications
- data center resiliency, Data Center Resiliency-Then, how many servers do you need?
- defined, What Is “Two Mistakes High”?
- failure loops, Failure Loops
- hidden shared failure types, Hidden Shared Failure Types
- node failure, Losing a Node-Losing a Node
- practice of, “Two Mistakes High” in Practice-Failure Loops
- problems during upgrades, Problems During Upgrades
- space shuttle program, The Space Shuttle