Chapter 3. Istio at a Glance

As organizations mature in their operation of container deployments, many come to utilize a service mesh within their environment. The topic generates significant buzz within the cloud native ecosystem, and many administrators, operators, architects, and developers are seeking understanding of and guidance on how, when, and why to adopt a service mesh. So let’s look at Istio.

As you learned in Chapter 2, Istio, like other service meshes, introduces a new layer into modern infrastructure, creating the potential for robust and scalable applications with granular control over them. If you’re running microservices, the challenges of managing them are exacerbated as you deploy ever more of them. You might not be running microservices, however. Even though the value of a service mesh shines most brightly in microservices deployments, Istio also readily accounts for services running directly on the OS of your VMs and bare-metal servers.

Service Mesh Architecture

At a high level, service mesh architectures, Istio’s included, commonly comprise two planes: a control plane and a data plane. A third (management) plane might reside in incumbent or other infrastructure systems. Figure 3-1 presents these divisions of concern by plane.

For a more thorough explanation of service mesh deployment models and approaches to evolutionary architectures, see The Enterprise Path to Service Mesh Architectures.

Figure 3-1. Istio and other service meshes are composed of two planes. A third plane is commonly deployed to enable additional network intelligence and ease of management in heterogeneous environments.

Planes

Istio’s data plane touches every packet/request in the system and is responsible for service discovery, health checking, routing, load balancing, authentication, authorization, and the generation of observable signals. Operating in band, service proxies are transparently inserted; as applications make service-to-service calls, they are unaware of the data plane’s existence. Data planes are responsible for intracluster communication as well as inbound (ingress) and outbound (egress) cluster network traffic. Whether traffic is entering or leaving the mesh, application service traffic is directed first to the service proxy for handling. With Istio, traffic is transparently intercepted using iptables rules and redirected to the service proxy.
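To make the interception concrete: an init container (or CNI plug-in) programs iptables rules in each pod’s network namespace so that traffic is transparently redirected to the sidecar. The rules Istio actually generates are more involved, but a minimal sketch of the idea, assuming the proxy’s outbound listener is on its default port 15001, looks like this:

# Sketch only, not Istio's literal ruleset: redirect all outbound TCP
# traffic in this network namespace to the sidecar's outbound port.
iptables -t nat -A OUTPUT -p tcp -j REDIRECT --to-ports 15001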

The control plane in Istio provides a single point of administration for service proxies, which need programmatic configuration to be efficiently managed and to have their configuration updated in real time as services are rescheduled across your environment (e.g., a container cluster). Control planes provide policy and configuration for services in the mesh, taking a set of isolated, stateless proxies and turning them into a service mesh. Control planes do not directly touch any network packets in the mesh; they operate out of band. They typically have a CLI and a user interface (UI), each of which provides access to a centralized API for holistically controlling proxy behavior. You can automate changes to control-plane configuration through its APIs (e.g., from a CI/CD pipeline); in practice, configuration is most often version controlled and updated from there.

The Istio control plane does the following:

  • Provides policy and configuration for services in the mesh, with APIs for operators to specify desired routing and resilience behavior (see the example following this list)

  • Combines a set of isolated stateless sidecar proxies into a service mesh:

    • APIs for the data plane to consume localized configuration

    • Service discovery abstraction for the data plane

  • Provides APIs for specifying usage policies via quotas and restrictions

  • Provides security via certificate issuance and rotation

  • Assigns workload identity

  • Handles routing configuration, specifying network boundaries and how to access them

  • Unifies telemetry collection

  • Doesn’t touch any packets/requests in the system (it operates out of band)
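As an example of specifying routing and resilience behavior through the control plane’s APIs, the following sketch applies a request timeout and retry policy to a hypothetical service named reviews; the control plane translates this declarative intent into proxy configuration:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews          # hypothetical in-mesh service
  http:
  - route:
    - destination:
        host: reviews
    timeout: 10s     # fail requests that exceed 10 seconds
    retries:
      attempts: 3    # retry failed requests up to three times
      perTryTimeout: 2s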

Istio Control-Plane Components

In this section, we introduce the functionality of each control-plane component at a high level. In later chapters, we’ll do a deep dive into each component’s behavior, configuration, and troubleshooting capability.

Pilot

Pilot is the head of the ship in an Istio mesh, so to speak. It stays synchronized with the underlying platform (e.g., Kubernetes) by tracking and representing the state and location of running services to the data plane. Pilot interfaces with your environment’s service discovery system and produces configuration for the data-plane service proxies (we’ll examine istio-proxy as a data-plane component later).

As Istio evolves, more of Pilot’s focus will be on scalably serving proxy configuration and less on interfacing with underlying platforms. Pilot serves Envoy-compatible configurations by coalescing configuration and endpoint information from various sources and translating this into xDS objects. Another component, Galley, will eventually take responsibility for interfacing directly with underlying platforms.
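You can observe this synchronization directly. istioctl, Istio’s CLI, includes a proxy-status command that reports whether each proxy in the mesh has acknowledged the latest xDS configuration Pilot pushed:

# List each sidecar and whether its CDS/LDS/EDS/RDS config is in sync with Pilot
istioctl proxy-status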

Galley

Galley is Istio’s configuration aggregation and distribution component. As its role evolves, it will insulate the other Istio components from underlying platform and user-supplied configurations by ingesting and validating configurations. Galley uses the Mesh Configuration Protocol (MCP) as a mechanism to serve and distribute configuration.

Mixer

Capable of standing on its own, Mixer is a control-plane component designed to abstract infrastructure backends (systems such as Stackdriver or New Relic) from the rest of Istio. Mixer bears responsibility for precondition checking, quota management, and telemetry reporting. It does the following:

  • Enables platform and environment mobility

  • Provides granular control over operational policies and telemetry by taking responsibility for policy evaluation and telemetry reporting

  • Has a rich configuration model

  • Abstracts away most infrastructure concerns with intent-based configuration

Service proxies and gateways invoke Mixer to perform precondition checks that determine whether a request should be allowed to proceed (check), to verify that communication between the caller and the service is allowed and has not exceeded quota, and to report telemetry after a request has completed (report). Mixer interfaces with infrastructure backends through a set of native and third-party adapters; adapter configuration determines which telemetry is sent to which backend and when. Because Mixer operates as an attribute-processing and routing engine, service mesh operators can use its adapters as the point of integration and intermediation with their infrastructure backends.
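Mixer’s configuration model revolves around three kinds of resources: handlers (configured adapters), instances (mappings from attributes to adapter inputs), and rules (which instances to dispatch to which handlers, and when). As a sketch, a rule dispatching a request-count metric to a Prometheus adapter might look like the following; requesthandler and requestcount are hypothetical names for a previously defined handler and instance:

apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: prometheus-requests
  namespace: istio-system
spec:
  match: destination.service.name == "reviews"  # dispatch only for this service
  actions:
  - handler: requesthandler.prometheus          # hypothetical handler
    instances:
    - requestcount.metric                       # hypothetical metric instance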

Note

The Mixer v2 design that is currently underway proposes a significantly different architecture. However, its scope and focus are planned to remain much the same as in Mixer v1.

Citadel

Citadel empowers Istio to provide strong service-to-service and end-user authentication using mutual Transport Layer Security (mTLS), with built-in identity and credential management. Citadel’s CA component approves and signs certificate-signing requests (CSRs) sent by Citadel agents, and it performs key and certificate generation, deployment, rotation, and revocation. Citadel has an optional ability to interact with an identity directory during the CA process.
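As an illustration of Citadel-issued identities in use, the following sketch requires mTLS for services in the default namespace by pairing an authentication Policy with a DestinationRule; the resource names here are illustrative:

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
  namespace: default
spec:
  peers:
  - mtls: {}        # require mutually authenticated TLS from peers
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
  namespace: default
spec:
  host: "*.default.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL   # clients present Istio-provisioned certificates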

Citadel has a pluggable architecture in which different CAs can be used so that Istio isn’t limited to its self-generated, self-signed signing key and certificate for signing workload certificates. The CA pluggability of Istio enables and facilitates the following:

  • Integration with your organization’s public key infrastructure (PKI) system

  • Secure communication between Istio and non-Istio legacy services, by sharing the same root of trust

  • Protection of the CA signing key by storing it in a well-protected environment, such as HashiCorp Vault or a hardware security module (HSM)

Service Proxy

You can use service mesh proxies to gate ingress network traffic, traffic between services, and traffic egressing services. Istio uses proxies between services and clients. Service proxies are usually deployed as sidecars in pods. (Examples of other deployment models can be found in the book The Enterprise Path to Service Mesh Architectures.) The proxy-to-proxy communication is what truly forms the mesh; it follows that to onboard an application to the mesh, a proxy must be placed between the application and the network, as illustrated in Figure 3-2.

Figure 3-2. Fully interconnected service proxies form the mesh.

A sidecar adds behavior to a container without changing it. In that sense, the sidecar and the service behave as a single, enhanced unit, hosted together in a pod.
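In Kubernetes terms, this pairing is simply a pod with two containers. A simplified sketch follows; the application image is a placeholder, and a real injected pod spec carries many additional settings:

apiVersion: v1
kind: Pod
metadata:
  name: productpage
spec:
  containers:
  - name: productpage              # the application container, unchanged
    image: example/productpage:v1  # placeholder image
  - name: istio-proxy              # the injected sidecar service proxy
    image: docker.io/istio/proxyv2:1.1.0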

Istio Data-Plane Components

Istio uses an extended version of the Envoy proxy, a high-performance proxy developed in C++, to mediate all inbound and outbound traffic for all services in the service mesh. Istio uses Envoy features such as dynamic service discovery, load balancing, TLS termination, HTTP/2 and gRPC proxying, circuit breakers, health checks, staged rollouts with percentage-based traffic splits, fault injection, and rich metrics.

Envoy is deployed as a sidecar to the relevant service in the same Kubernetes pod. This allows Istio to extract a wealth of signals about traffic behavior as attributes, which Mixer can use to enforce policy decisions and send to monitoring systems to provide information about the behavior of the entire mesh.

Injection

The sidecar proxy model also allows you to add Istio capabilities to an existing deployment with no need to redesign or rewrite code. This is a significant attraction of Istio: the promise of an immediate view of top-level service metrics, of detailed control over traffic, and of automated authentication and encryption between all services, all without changing your application code or deployment manifests.

Istio’s canonical sample application, Bookinfo, makes clear how service proxies come into play and form a mesh. Figure 3-3 shows the Bookinfo application without the service proxies. (We take a closer look at Bookinfo and deploy it in Chapter 4.)

In Kubernetes, automatic proxy injection is implemented as a webhook registered with the Kubernetes API server’s mutating admission controller. The webhook is stateless, depending only on the injection-template and mesh-configuration ConfigMaps and on the to-be-injected pod object. As such, it is easily scaled horizontally, either manually via its Deployment or automatically via a Horizontal Pod Autoscaler (HPA).
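In practice, you enable automatic injection by labeling a namespace, after which the webhook mutates new pod specs at creation time; alternatively, istioctl can inject the sidecar into manifests manually. For example:

# Automatic: new pods in the labeled namespace get the sidecar on creation
kubectl label namespace default istio-injection=enabled

# Manual: inject the sidecar into a deployment manifest yourself
istioctl kube-inject -f deployment.yaml | kubectl apply -f -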

Per microbenchmark, injection of the sidecar proxy into a newly created pod takes an average of 1.5 µs for the webhook itself to execute. Total injection time is higher once network latency and API server processing time are accounted for.

Figure 3-3. Istio’s sample application, Bookinfo, shown without service proxies

Istio addresses the well-known distributed systems challenge of networks that are not homogeneous, reliable, or unchanging. It does so through deployment of lightweight proxies between your application containers and the network; Figure 3-4 shows Bookinfo with these service proxies in place. The full architecture of Istio comprises the control and data planes with each of their internal components, and a full-mesh deployment also includes ingress and egress gateways (see Figure 3-5).

Figure 3-4. Bookinfo shown with service proxies

Gateways

Istio 0.8 introduced the concept of ingress and egress gateways. Roughly symmetric, ingress and egress gateways act as reverse and forward proxies, respectively, for traffic entering and exiting the mesh. Like other Istio components, the behavior of an Istio Gateway is defined and controlled through configuration, giving you control over which traffic to allow into and out of the service mesh, at what rate, and so on.

Ingress

Configuring ingress gateways enables you to define entryways into the service mesh for incoming traffic to flow through. Ingressing traffic into the mesh is a reverse-proxy situation, akin to traditional web server load balancing. Configuring egressing traffic out of the mesh is a forward-proxy situation, in which you identify which traffic to allow out of the mesh and where it should be routed.

As an example, the following Gateway configuration sets up a proxy to act as a load balancer exposing ports 80 and 9080 (HTTP), 443 (HTTPS), and 2379 (TCP, carrying Mongo traffic) for ingress. The gateway will be applied to the proxy running on a pod with the label app: my-gateway-controller. Even though Istio will configure the proxy to listen on these ports, it is the responsibility of the user to ensure that external traffic to these ports is allowed into the mesh (for more details, see Istio’s gateway documentation):

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-gateway
spec:
  selector:
    app: my-gateway-controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - uk.bookinfo.com
    - eu.bookinfo.com
    tls:
      httpsRedirect: true # sends 301 redirect for http requests
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - uk.bookinfo.com
    - eu.bookinfo.com
    tls:
      mode: SIMPLE # enables HTTPS on this port
      serverCertificate: /etc/certs/servercert.pem
      privateKey: /etc/certs/privatekey.pem
  - port:
      number: 9080
      name: http-wildcard
      protocol: HTTP
    hosts:
    - "*"
  - port:
      number: 2379 # to expose internal service via external port 2379
      name: mongo
      protocol: MONGO
    hosts:
    - "*"

Egress

Traffic can exit an Istio service mesh in two ways: directly from the sidecar or funneled through an egress gateway, where you can apply traffic policy.

Note

By default, Istio-enabled applications are unable to access URLs external to the cluster.
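
To permit access to a specific external service, you add it to Istio’s service registry with a ServiceEntry. A minimal sketch, with api.example.com standing in for a real external hostname:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-api
spec:
  hosts:
  - api.example.com    # placeholder external hostname
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  resolution: DNS
  location: MESH_EXTERNAL   # mark the service as outside the mesh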

Direct from a service proxy

If you want traffic destined for external services to bypass the egress gateway, you can provide configuration to the ConfigMap of the istio-sidecar-injector. The following option, supplied when generating Istio’s installation manifests, identifies cluster-local networks so that traffic destined for them stays within the mesh, while traffic for all other destinations is forwarded externally:

--set global.proxy.includeIPRanges="10.0.0.1/24"

After you’ve applied this and Istio proxies are updated with this configuration, external requests bypass the sidecar and route directly to the intended destination. The Istio sidecar will intercept and manage only internal requests within the cluster.

Route through an egress gateway

You might need to route traffic through an egress gateway to provide controlled connectivity from your cluster’s private IP address space, to monitor outbound traffic, or to enable cross-cluster connectivity.

An egress gateway allows Istio monitoring and route rules to be applied to traffic exiting the mesh. It also facilitates communication for applications running in a cluster whose nodes lack public IP addresses and therefore can’t otherwise access the internet. Defining an egress gateway, directing all egress traffic through it, and allocating public IPs to the egress gateway nodes allows applications in the mesh to access external services in a controlled way, as depicted in Figure 3-5.

Figure 3-5. Istio’s architecture and components
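
Configuring an egress gateway resembles the ingress case: you define a Gateway bound to the egress gateway’s proxy and steer traffic to it with routing rules. A hedged sketch of the Gateway half follows (the accompanying VirtualService routing is omitted, and api.example.com is a placeholder):

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
spec:
  selector:
    istio: egressgateway   # label on the default egress gateway deployment
  servers:
  - port:
      number: 443
      name: tls
      protocol: TLS
    hosts:
    - api.example.com      # placeholder external hostname
    tls:
      mode: PASSTHROUGH    # pass TLS through without terminating it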
Tip

Why Use Istio Gateways and Not Kubernetes Ingresses?

In general, the Istio v1alpha3 APIs use gateways because Kubernetes Ingress has proven insufficient for Istio applications. Compared to Kubernetes Ingress, Istio gateways can operate as pure Layer 4 (L4) TCP proxies and support all protocols that Envoy supports.

Another consideration is the separation of trust domains between organizational teams. The Kubernetes Ingress API merges specifications for L4 through Layer 7 (L7), making it difficult for different teams in organizations with separate trust domains (like SecOps, NetOps, ClusterOps, and Developers) to own ingress traffic management.

Extensibility

Although not an explicit goal for some service meshes, Istio is designed to be customized. As an extensible platform, its integrations come in two primary forms: swappable sidecar proxies and telemetry/authorization adapters.

Customizable Sidecars

Within Istio, Envoy is the default service proxy sidecar, but it is possible to use another service proxy in its place. Although there are multiple service proxies in the ecosystem, beyond Envoy only two have currently demonstrated integration with Istio: Linkerd and NGINX. Linkerd2’s proxy is not designed as a general-purpose proxy; rather, it focuses on being lightweight, treating extensibility as a secondary concern addressed through its gRPC plug-in interface.

Though it’s more likely that you’d choose to run a different service mesh altogether, you might want to use Istio with one of the following alternative service proxies:

Linkerd

You might want to use this if you’re already running Linkerd and want to begin adopting Istio control APIs like CheckRequest, which is used to get a thumbs-up/thumbs-down before performing an action.

NGINX

Based on your operational expertise and need for battle-tested proxies, you might select NGINX. You might be looking for caching, web application firewall, or other functionality available in NGINX Plus as well.

Consul Connect

You might choose to deploy Consul Connect based on ease of deployment and simplicity of needs.

The arrival of choice in service proxies for Istio has generated a lot of excitement. Linkerd’s integration was created early, in Istio’s 0.1.6 release. Similarly, the ability to use NGINX as a service proxy through the nginMesh project was provided early in Istio’s release cycle.

Note

Although nginMesh is no longer in active development, you might find this article, “How to Customize an Istio Service Mesh”, and its related webcast helpful in better understanding Istio’s extensibility with respect to swappable service proxies.

Without configuration, proxies lack instructions to perform their tasks. Pilot keeps the mesh synchronized with the underlying platform by tracking and representing its services to istio-proxy. As the default proxy, istio-proxy contains an extended version of Envoy. Typically, the same istio-proxy Docker image is used by the Istio sidecar and by the Istio ingress and egress gateways. istio-proxy contains not only the service proxy but also the Istio Pilot agent, which pulls configuration down from Pilot to the service proxy at frequent intervals so that each proxy knows where to route traffic. When NGINX is the chosen proxy, nginMesh’s translator agent performs the analogous task of configuring NGINX as the istio-proxy. Pilot is responsible for the life cycle of istio-proxy.
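To inspect the configuration a given proxy has received from Pilot, istioctl can dump the proxy’s xDS state; the pod name here is a placeholder:

# Show the routes Pilot has pushed to a particular sidecar
istioctl proxy-config routes productpage-v1-abc123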

Extensible Adapters

Istio’s Mixer control-plane component is responsible for enforcing access control and usage policies across the service mesh and for collecting telemetry data from the service proxies. As Istio’s main point of extensibility, Mixer categorizes adapters based on the type of data they consume, as illustrated in Figure 3-6.

The in-process authoring model for Mixer adapters is now deprecated in Istio. Like other open source projects before it, Istio began by conveniently incorporating adapters in-tree. As Istio has evolved and matured, this model has given way to keeping adapters separate from the main project, both to remove the burden on the core project teams and to encourage ownership by the (typically separate) development teams that created each integration with a backend system.

Future extensibility might come in the form of support for HSM-backed secure key stores and better support for swapping out distributed tracing backends. Additionally, we expect management planes to play a more prominent role as Istio and other service meshes are adopted.

Figure 3-6. Mixer acts as an attribute processing engine, collecting, transforming, and transmitting telemetry.

Scale and Performance

Like many, you might be saying to yourself, “These features are great, but what’s the overhead of running a service mesh?” It’s true that these features come at a cost: running a proxy per service and intercepting each packet imposes a certain amount of continual overhead. Costs can be analyzed separately for the data plane and the control plane, and answers to performance questions always begin with “it depends.” In this case, the resources needed vary with how many of Istio’s features you use. For Istio v1.1, approximate costs are as follows:

  • 1 virtual CPU (vCPU) per peak thousand requests per second for the sidecar(s) with access logging enabled (which is off by default in v1.1), and 0.5 without. Fluentd on the node is a big contributor to that cost because it captures and uploads logs.

  • Assuming a typical cache hit ratio (>80%) for Mixer checks: 0.5 vCPU per peak thousand requests per second for the Mixer pods.

  • Latency of approximately 8 ms is added to the 90th percentile latency.

  • Mutual TLS (mTLS) costs are negligible on AES-NI capable hardware in terms of both CPU and latency.

Overhead for both the data plane and control plane is a common concern in the minds of those adopting a service mesh. As far as the data plane is concerned, maintainers of Envoy understand that this service proxy is in the critical path and have worked to tune its performance. Envoy has first-class support for HTTP/2 and TLS in either direction and minimizes overhead by multiplexing requests and responses across single, long-lived TCP connections. In this way, its support of HTTP/2 achieves lower latency than HTTP/1.1 otherwise would.

Although Envoy project maintainers do not currently publish any official performance benchmarks, they encourage users to benchmark it in their own environments with a configuration similar to what you plan on using in production. To fill this void, tools like Meshery have cropped up within the open source community. Meshery is an open source, multiservice mesh-management plane that provisions different service meshes and sample applications and benchmarks the performance of service mesh deployments. It facilitates benchmarking various configuration scenarios of Istio, as well as comparison of performance of services (applications) on and off the mesh and across meshes. It vets mesh and service configuration against deployment best practices. Some service mesh projects use Meshery as their release benchmark tool. It is complemented by other load-generation tools commonly used for service mesh performance testing.
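One such tool is Fortio, the load generator the Istio project uses for its own performance testing. For example, you can hold a fixed request rate through the mesh and read off latency percentiles; the target URL below is Bookinfo’s product page:

# Drive 1,000 qps for 60 s over 32 connections and report latency percentiles
fortio load -c 32 -qps 1000 -t 60s http://productpage:9080/productpage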

In larger deployments, as the service proxy (Envoy) fleet grows, the central role the control plane plays can become a bottleneck or source of latency. For example, depending on the verbosity of the instrumentation and rate of sampling, trace data can exceed the volume of the actual business traffic sustained by an application. Collection of this telemetry, and sending it to the tracing backend (directly or via the control plane), can have a real effect on the application’s latency and throughput.

Deployment Models

Istio supports different deployment models, some of which deploy only select components of its architecture. Figure 3-7 shows a full Istio deployment in all its glory.

Figure 3-7. Istio deployment on Kubernetes

Service meshes come in various shapes and sizes. To explore other mesh deployment models, see The Enterprise Path to Service Mesh Architectures. You’re learning more about Istio, a service mesh, because you’re ready to layer up, ready for enhanced service management. Istio aims to directly address service management challenges by providing a new layer of visibility, security, and control.

In this chapter, we’ve covered how the Istio control plane provides a single point of visibility and control while the data plane handles traffic routing. We’ve also seen that Istio is an example of a service mesh designed with customizability in mind. Finally, we covered how decoupling control from services and unifying responsibility for them in a new layer of infrastructure, L5, avoids finger-pointing between Dev and Ops teams.
