Chapter 15. Case studies, part 2

This chapter covers

  • Facebook case study
  • Twitter case study

In this chapter we’ll see how Facebook and Twitter, two of the most popular social networks, are using Netty. Each has exploited Netty’s flexible and generic design to build frameworks and services that meet requirements for extreme scalability and extensibility.

The case studies presented here were written by the engineers responsible for the design and implementation of the solutions described.

15.1. Netty at Facebook: Nifty and Swift[1]

1

The views expressed in this section are those of the author and do not necessarily reflect the views of the author’s employer.

Andrew Cox, Software Engineer at Facebook

At Facebook we use Netty in several of our back-end services (for handling messaging traffic from mobile phone apps, for HTTP clients, and so on), but our fastest-growing usage is via two new frameworks we’ve developed for building Thrift services in Java: Nifty and Swift.

15.1.1. What is Thrift?

Thrift is a framework for building services and clients that communicate via remote procedure calls (RPC). It was originally developed at Facebook[2] to meet our requirements for building services that can handle certain types of interface mismatches between client and server. This comes in very handy because services and their clients usually can’t all be upgraded simultaneously.

2

A now-ancient whitepaper from the original Thrift developers can be found at http://thrift.apache.org/static/files/thrift-20070401.pdf.

Another important feature of Thrift is that it’s available for a wide variety of languages. This enables teams at Facebook to choose the right language for the job, without worrying about whether they’ll be able to find client code for interacting with other services. Thrift has grown to become one of the primary means by which our back-end services at Facebook communicate with one another, and it’s also used for non-RPC serialization tasks, because it provides a common, compact storage format that can be read from a wide selection of languages for later processing.

Since its development at Facebook, Thrift has been open sourced as an Apache project (http://thrift.apache.org/), where it continues to grow to fill the needs of service developers, not only at Facebook but also at other companies, including Evernote and last.fm,[3] and on major open source projects such as Apache Cassandra and HBase.

3

Find more examples at http://thrift.apache.org.

These are the major components of Thrift:

  • Thrift Interface Definition Language (IDL) —Used to define your services and compose any custom types that your services will send and receive
  • Protocols —Used to control encoding/decoding elements of data into a common format (such as the Thrift binary protocol or JSON)
  • Transports —Provide a common interface for reading/writing to different media (such as a TCP socket, a pipe, or a memory buffer)
  • Thrift compiler —Parses Thrift IDL files to generate stub code for the server and client interfaces, and serialization/deserialization code for the custom types defined in IDL
  • Server implementation —Handles accepting connections, reading requests from those connections, dispatching calls to an object that implements the interface, and sending the responses back to clients
  • Client implementation —Translates method calls into requests and sends them to the server

15.1.2. Improving the state of Java Thrift using Netty

The Apache distribution of Thrift has been ported to about twenty different languages, and there are also separate frameworks compatible with Thrift built for other languages (Twitter’s Finagle for Scala is a great example). Several of these languages receive at least some use at Facebook, but the most common ones used for writing Thrift services here at Facebook are C++ and Java.

When I arrived at Facebook, we were already well underway with the development of a solid, high-performance, asynchronous Thrift implementation in C++, built around libevent. From libevent, we get cross-platform abstractions over the OS APIs for asynchronous I/O, but libevent isn’t any easier to use than, say, raw Java NIO. So we’ve also built abstractions on top of that, such as asynchronous message channels, and we make use of chained buffers from Folly[4] to avoid copies as much as possible. This framework also has a client implementation that supports asynchronous calls with multiplexing, and a server implementation that supports asynchronous request handling. (The server can start an asynchronous task to handle a request and return immediately, then invoke a callback or set a Future later when the response is ready.)

4

Folly is Facebook’s open source C++ library: https://github.com/facebook/folly.

Meanwhile, our Java Thrift framework received a lot less attention, and our load-testing tools showed that Java performance lagged well behind C++. There were already Java Thrift frameworks built on NIO, and asynchronous NIO-based clients were available as well. But the clients didn’t support pipelining or multiplexing requests, and the servers didn’t support asynchronous request handling. Because of these missing features, Java Thrift service developers here at Facebook were running into problems that had already been solved in C++, and this became a source of frustration.

We could have built a similar custom framework on top of NIO and based our new Java Thrift implementation on that, as we had done for C++. But experience showed us that this was a ton of work to get right, and as it happened, the framework we needed was already out there, just waiting for us to make use of it: Netty.

We quickly put together a server implementation and mashed the names “Netty” and “Thrift” together to come up with “Nifty,” the name for the new server. It was immediately impressive how much less code was needed to get Nifty working, compared to everything we needed to achieve the same results in C++.

Next we put together a simple load-tester Thrift server using Nifty and used our load-testing tools to compare it to existing servers. The results were clear: Nifty outperformed the other NIO servers, and it was in the same ballpark as our newest C++ Thrift server. Using Netty was going to improve performance!

15.1.3. Nifty server design

Nifty (https://github.com/facebook/nifty) is an open source, Apache-licensed Thrift client/server implementation built on top of the Apache Thrift library. It’s designed so that moving from any other Java Thrift server implementation should be painless: you can reuse the same Thrift IDL files, the same Thrift code generator (packaged with the Apache Thrift library), and the same service interface implementation. The only thing that really needs to change is your server startup code (Nifty setup follows a slightly different style from that of the traditional Thrift server implementations in Apache Thrift).

Nifty encoder/decoder

The default Nifty server handles either plain messages or framed messages (with a 4-byte prefix). It does this by using a custom Netty frame decoder that looks at the first few bytes to determine how to decode the rest. Then, when a complete message is found, the decoder wraps the message content along with a field that indicates the type of message. The server later refers to this field to encode the response in the same format.
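
The following Java sketch illustrates that idea using Netty 4’s ByteToMessageDecoder; it isn’t Nifty’s actual decoder (which targets Netty 3 and handles more cases), and the ThriftMessage wrapper type here is hypothetical. It relies on the convention that an unframed TBinaryProtocol message begins with a version word whose high bit is set, whereas a framed message begins with a 4-byte length prefix.

import java.util.List;

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

public class ThriftMessageDecoder extends ByteToMessageDecoder {

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
        if (in.readableBytes() < 4) {
            return;                               // wait for the 4-byte prefix
        }
        int firstWord = in.getInt(in.readerIndex());
        if ((firstWord & 0x80000000) != 0) {
            // High bit set: looks like an unframed TBinaryProtocol message
            // (version word 0x8001xxxx). A real decoder would parse the message
            // to find where it ends; that bookkeeping is omitted here.
            out.add(new ThriftMessage(in.readBytes(in.readableBytes()), false));
        } else {
            // Otherwise treat the first 4 bytes as a frame-length prefix.
            if (in.readableBytes() < 4 + firstWord) {
                return;                           // the full frame hasn't arrived yet
            }
            in.skipBytes(4);
            out.add(new ThriftMessage(in.readBytes(firstWord), true));
        }
    }

    // Hypothetical wrapper recording how the request was framed, so the
    // encoder can write the response back in the same format.
    public static final class ThriftMessage {
        public final ByteBuf content;
        public final boolean framed;

        public ThriftMessage(ByteBuf content, boolean framed) {
            this.content = content;
            this.framed = framed;
        }
    }
}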

Nifty also supports plugging in your own custom codec. For example, some of our services use a custom codec to read extra information from headers that clients insert before each message (containing optional metadata, client capabilities, and so on). The decoder could also easily be extended to handle other types of message transports, such as HTTP.

Ordering responses on the server

Initial versions of Java Thrift used OIO sockets, and servers maintained one thread per active connection. With this setup, each request was read, processed, and answered, all on the same thread, before the next request was read. This guaranteed that responses would always be returned in the order in which the corresponding requests arrived.

Newer asynchronous I/O server implementations were built that didn’t need one thread per connection, and these servers could handle more simultaneous connections, but clients still mainly used synchronous I/O, so the server could count on not receiving the next request until after it had sent the current response. This request/execution flow is shown in figure 15.1.

Figure 15.1. Synchronous request/response flow

Initial pseudo-asynchronous usages of clients started happening when a few Thrift users took advantage of the fact that for a generated client method foo(), methods send_foo() and recv_foo() were also exposed separately. This allows Thrift users to send several requests (whether on several clients, or on the same client) and then call the corresponding receive methods to start waiting for and collecting the results.
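
As a rough illustration, here is what that pattern looks like against a hypothetical generated client (the LookupClient interface below just mimics the shape of the generated send_/recv_ methods; it isn’t real Facebook or Thrift code):

import java.util.ArrayList;
import java.util.List;

public class PipelinedCalls {

    // Shape of the relevant methods on a hypothetical generated client; the
    // Thrift compiler generates send_lookup()/recv_lookup() alongside lookup().
    interface LookupClient {
        void send_lookup(String key) throws Exception;
        String recv_lookup() throws Exception;
    }

    static List<String> lookupAll(LookupClient client, List<String> keys) throws Exception {
        // Write every request without waiting for any responses...
        for (String key : keys) {
            client.send_lookup(key);
        }
        // ...then read the responses, in the order the requests were sent.
        List<String> results = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            results.add(client.recv_lookup());
        }
        return results;
    }
}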

In this new scenario, the server may read multiple requests from a single client before it has finished processing the first. In an ideal world, we could assume that all asynchronous Thrift clients that pipeline requests can handle the responses to those requests in whatever order they arrive. In the world we live in, though, newer clients can handle this, whereas older asynchronous Thrift clients may write multiple requests but must receive the responses in order.

This kind of problem is solved by using Netty 4’s EventExecutor or Netty 3.x’s OrderedMemoryAwareThreadPoolExecutor, which guarantee sequential processing of all incoming messages on a connection without forcing all of those messages to run on the same executor thread.

Figure 15.2 shows how pipelined requests are handled in the correct order, which means the response for the first request will be returned, and then the response for the second, and so on.

Figure 15.2. Request/response flow for sequential processing of pipelined requests

Nifty has special requirements though: we aim to serve each client with the best response ordering that it can handle. We’d like to allow the handlers for multiple pipelined requests from a single connection to be processed in parallel, but then we couldn’t control the order in which these handlers would finish.

Instead we use a solution that involves buffering responses; if the client requires in-order responses, we’ll buffer later responses until all the earlier ones are also available, and then we’ll send them together, in the required order. See figure 15.3.
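
A minimal sketch of that buffering logic, assuming each request on a connection is tagged with a monotonically increasing sequence number, might look like the following (this is illustrative only, not Nifty’s actual code):

import java.util.HashMap;
import java.util.Map;

public class ResponseOrderingBuffer {

    // Completed responses that can't be sent yet, keyed by sequence number.
    private final Map<Long, Object> pending = new HashMap<>();
    // Sequence number of the next response the client is owed.
    private long nextToSend;

    // Called whenever a handler finishes a request, in whatever order that happens.
    public synchronized void responseReady(long sequenceId, Object response,
                                           ResponseWriter writer) {
        pending.put(sequenceId, response);
        // Flush every response that is now unblocked, strictly in request order.
        Object next;
        while ((next = pending.remove(nextToSend)) != null) {
            writer.write(next);
            nextToSend++;
        }
    }

    // Stand-in for writing an encoded response back to the Netty channel.
    public interface ResponseWriter {
        void write(Object response);
    }
}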

Figure 15.3. Request/response flow for parallel processing of pipelined requests

Of course, Nifty includes asynchronous channels (usable through Swift) that do support out-of-order responses. When using a custom transport that allows the client to notify the server of this client capability, the server is relieved of the burden of buffering responses, and it will send them back in whatever order the requests finish.

15.1.4. Nifty asynchronous client design

Nifty client development is mostly focused on asynchronous clients. Nifty actually does provide a Netty implementation of Thrift’s synchronous transport interface, but its use is pretty limited because it doesn’t provide much win over a standard socket transport from Thrift. Because of this, the user should use the asynchronous clients whenever possible.

Pipelining

The Thrift library has its own NIO-based asynchronous client implementation, but one feature we wanted was request pipelining. Pipelining is the ability to send multiple requests on the same connection without waiting for a response. If the server has idle worker threads, it can process these requests in parallel, but even if all worker threads are busy, pipelining can still help in other ways. The server will spend less time waiting for something to read, and the client may be able to send multiple small requests together in a single TCP packet, thus better utilizing network bandwidth.

With Netty, pipelining just works. Netty does all the hard work of managing the state of the various NIO selection keys, and Nifty can focus on encoding requests and decoding responses.

Multiplexing

As our infrastructure has grown, we’ve started to see a lot of connections building up on our servers. Multiplexing—sharing connections for all the Thrift clients connecting from a single source—can help to mitigate this. But multiplexing over a client connection that requires ordered responses presents a problem: one client on the connection may incur extra latency because its response must come after the responses for other requests sharing the connection.

The basic solution is pretty simple: Thrift already sends a sequence identifier with every message, so to support out-of-order responses we just need the client channels to keep a map from sequence ID to response handler, instead of using a queue.
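
Sketched in Java (again illustrative, not Nifty’s actual code), that bookkeeping amounts to replacing the FIFO queue of pending calls with a map keyed by sequence ID:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class MultiplexedCallMap<T> {

    private final AtomicInteger nextSequenceId = new AtomicInteger();
    private final Map<Integer, ResponseHandler<T>> outstanding = new ConcurrentHashMap<>();

    // Register a handler and return the sequence ID to embed in the request.
    public int register(ResponseHandler<T> handler) {
        int seqId = nextSequenceId.incrementAndGet();
        outstanding.put(seqId, handler);
        return seqId;
    }

    // Called when a response arrives; the sequence ID extracted from the response
    // selects the right handler, whatever order the responses arrive in.
    public void complete(int seqId, T response) {
        ResponseHandler<T> handler = outstanding.remove(seqId);
        if (handler != null) {
            handler.onResponse(response);
        }
    }

    public interface ResponseHandler<T> {
        void onResponse(T response);
    }
}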

The catch is that in standard synchronous Thrift clients, the protocol is responsible for extracting the sequence identifier from the message, and the protocol calls the transport, but never the other way around.

That simple flow (shown in figure 15.4) works fine for a synchronous client, where the protocol can wait on the transport to actually receive the response, but for an asynchronous client the control flow gets a bit more complicated. The client call is dispatched to the Swift library, which first asks the protocol to encode the request into a buffer, and then passes that encoded request buffer to the Nifty channel to be written out. When the channel receives a response from the server, it notifies the Swift library, which again uses the protocol to decode the response buffer. This is the flow shown in figure 15.5.

Figure 15.4. Multiplexing/transport layers

Figure 15.5. Dispatching

15.1.5. Swift: a faster way to build Java Thrift services

The other key part of our new Java Thrift framework is called Swift. It uses Nifty as its I/O engine, but the service specifications can be represented directly in Java using annotations, giving Thrift service developers the ability to work purely in Java. When your service starts up, the Swift runtime gathers information about all the services and types via a combination of reflection and interpreting Swift annotations. From that information, it can build the same kind of model that the Thrift compiler builds when parsing Thrift IDL files. It then uses this model to run the server and client directly, without any generated server or client stub code, by generating new classes at the byte-code level for serializing/deserializing the custom types.
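
For example, a struct and a service can be declared along these lines with Swift’s @ThriftStruct, @ThriftField, @ThriftService, and @ThriftMethod annotations (the LogEntry and Scribe types here are just an illustrative sketch, adapted from the kind of example shown on the project’s GitHub page):

// LogEntry.java
import com.facebook.swift.codec.ThriftField;
import com.facebook.swift.codec.ThriftStruct;

// A Thrift struct declared directly in Java; Swift builds its serialization
// model from these annotations at startup instead of from generated code.
@ThriftStruct
public class LogEntry {
    @ThriftField(1)
    public String category;

    @ThriftField(2)
    public String message;
}

// Scribe.java
import java.util.List;

import com.facebook.swift.service.ThriftMethod;
import com.facebook.swift.service.ThriftService;

// The service interface, also declared purely in Java.
@ThriftService
public interface Scribe {
    @ThriftMethod
    void log(List<LogEntry> entries);
}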

Skipping the normal Thrift code generation also makes it easier to add new features without having to change the IDL compiler, so a lot of our new features (such as asynchronous clients) are supported in Swift first. If you’re interested, take a look at the introductory information on Swift’s GitHub page (https://github.com/facebook/swift).

15.1.6. Results

In the following sections we’ll quantify some of the outcomes we’ve seen from our work with Netty.

Performance comparisons

One measurement of Thrift server performance is a benchmark of no-ops. This benchmark uses long-running clients that continuously make Thrift calls to a server that sends back an empty response. Although this measurement isn’t a realistic performance estimation of most actual Thrift services, it’s a good measure of the maximum potential of a Thrift service, and improving this benchmark does generally mean a reduction in the amount of CPU used by the framework itself.

As shown in table 15.1, Nifty outperforms all of the other Java Thrift server implementations we tested on this benchmark, including the NIO-based TNonblockingServer and TThreadedSelectorServer as well as the thread-per-connection TThreadPoolServer. It even easily beats our previous Java server implementation (a pre-Nifty server implementation we used internally, based on plain NIO and direct buffers).

Table 15.1. Benchmark results for different implementations

Thrift server implementation                        No-op requests/second

TNonblockingServer                                  ~68,000
TThreadedSelectorServer                             188,000
TThreadPoolServer                                   867,000
Older Java server (using NIO and direct buffers)    367,000
Nifty                                               963,000
Older libevent-based C++ server                     895,000
Next-gen libevent-based C++ server                  1,150,000

The only Java server we tested that can compete with Nifty is TThreadPoolServer. This server uses raw OIO and runs each connection on a dedicated thread. This gives it an edge when handling a lower number of connections; however, you can easily run into scaling problems with OIO when your server needs to handle a very large number of simultaneous connections.

Nifty even beats the previous C++ server implementation that was most prominent when we started development on Nifty, and although it falls a bit short compared to our next-gen C++ server framework, it’s at least in the same ballpark.

Example stability issues

Before Nifty, many of our major Java services at Facebook used an older, custom NIO-based Thrift server implementation that works similarly to Nifty. That implementation is an older codebase that has had more time to mature, but its asynchronous I/O handling code was built from scratch; Nifty, built on the solid foundation of Netty’s asynchronous I/O framework, has had many fewer problems.

One of our custom message queuing services had been built using the older framework, and it started to suffer from a kind of socket leak. A lot of connections were sitting around in CLOSE_WAIT state, meaning the server had received a notification that the client had closed the socket, but the server never reciprocated by making its own call to close the socket. This left the sockets in a kind of CLOSE_WAIT limbo.

The problem happened very slowly; across the entire pool of machines handling this service, there might be millions of requests per second, but usually only one socket on one server would enter this state in an hour. It wasn’t an urgent issue because it took a long time before a server needed a restart at that rate, but it also complicated tracking down the cause. Extensive digging through the code didn’t help much either: initially several places looked suspicious, but everything ultimately checked out and we didn’t locate the problem.

Eventually we migrated the service onto Nifty. The conversion—including testing in a staging environment—took less than a day and the problem has since disappeared. We haven’t really seen any such problems in Nifty.

This is just one example of the kind of subtle bug that can show up when using NIO directly, and it’s similar to bugs we’ve had to solve in our C++ Thrift framework time and time again to stabilize it. But I think it’s a great example of how using Netty has helped us take advantage of the years of stability fixes it has received.

Improving timeout handling for C++

Netty has also helped us indirectly by lending suggestions for improvements to our C++ framework. An example of this is the hashed wheel timer. Our C++ framework uses timeout events from libevent to drive client and server timeouts, but adding separate timeouts for every request proved to be prohibitively expensive, so we’d been using what we called timeout sets. The idea here was that a client connection to a particular service usually has the same receive timeout for every call made from that client, so we’d maintain only one real timer event for a set of timeouts that share the same duration. Every new timeout was guaranteed to fire after existing timeouts scheduled for that set, so when each timeout expired or was canceled, we’d schedule only the next timeout.

However, our users occasionally wanted to supply per-call timeouts, with different timeout values for different requests on the same connection. In this scenario, the benefits of using a timeout set are lost, so we tried using individual timer events. We started to see performance problems when many timeouts were scheduled at once. We knew that Nifty doesn’t run into this problem, despite the fact that it doesn’t use timeout sets—Netty solves this problem with its HashedWheelTimer.[5] So with inspiration from Netty, we put together a hashed wheel timer for our C++ Thrift framework as well, and it has resolved the performance issue with variable per-request timeouts.

5

For more information about class HashedWheelTimer see http://netty.io/4.0/api/io/netty/util/HashedWheelTimer.html.
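
As a simple illustration of the Netty class (generic usage, not our C++ port or Nifty’s code), a single shared wheel can cheaply schedule and cancel one timeout per outstanding request:

import java.util.concurrent.TimeUnit;

import io.netty.util.HashedWheelTimer;
import io.netty.util.Timeout;
import io.netty.util.Timer;
import io.netty.util.TimerTask;

public final class RequestTimeouts {

    // One wheel shared by all connections; adding and cancelling timeouts on it
    // stays cheap even when many are outstanding at once.
    private static final Timer TIMER = new HashedWheelTimer();

    // Schedule when the request is written; cancel the returned Timeout when
    // the response arrives.
    public static Timeout schedule(long millis, final Runnable onTimeout) {
        return TIMER.newTimeout(new TimerTask() {
            @Override
            public void run(Timeout timeout) {
                if (!timeout.isCancelled()) {
                    onTimeout.run();
                }
            }
        }, millis, TimeUnit.MILLISECONDS);
    }
}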

Future improvements on Netty 4

Nifty is currently running on Netty 3, which has been great for us so far, but we have a Netty 4 port ready that we’ll be moving to very soon, now that v4 has been finalized. We are eagerly looking forward to some of the benefits the Netty 4 API will offer us.

One example of how we plan to make better use of Netty 4 is achieving better control over which thread manages a given connection. We hope to use this feature to allow server handler methods to start asynchronous client calls from the same I/O thread the server call is running on. This is something that specialized C++ servers are already able to take advantage of (for example, a Thrift request router).

Extending from that example, we also look forward to being able to build better client connection pools that are able to migrate existing pooled connections to the desired I/O worker thread, which wasn’t possible in v3.
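
One way to express that with Netty 4 (a sketch of the general technique, not Nifty’s actual code) is to reuse the inbound channel’s EventLoop as the group for an outbound Bootstrap, because in Netty 4 an EventLoop is itself an EventLoopGroup:

import io.netty.bootstrap.Bootstrap;
import io.netty.channel.Channel;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoop;
import io.netty.channel.socket.nio.NioSocketChannel;

public final class SameThreadClientConnector {

    // Open an outbound client connection that is serviced by the same I/O
    // thread (EventLoop) already handling the inbound server call.
    public static ChannelFuture connect(ChannelHandlerContext serverCtx,
                                        String host, int port) {
        EventLoop loop = serverCtx.channel().eventLoop();
        Bootstrap bootstrap = new Bootstrap()
            .group(loop)                          // reuse the server call's I/O thread
            .channel(NioSocketChannel.class)
            .handler(new ChannelInitializer<Channel>() {
                @Override
                protected void initChannel(Channel ch) {
                    // client codec handlers would be added here
                }
            });
        return bootstrap.connect(host, port);
    }
}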

15.1.7. Facebook summary

With the help of Netty, we’ve been able to build a better Java server framework that nearly matches the performance of our fastest C++ Thrift server framework. We’ve migrated several of our existing major Java services onto Nifty already, solving some pesky stability and performance problems, and we’ve even started to feed back some ideas from Netty, and from the development of Nifty and Swift, into improving aspects of C++ Thrift.

On top of that, Netty has been a pleasure to work with and has made a lot of new features, like built-in SOCKS support for Thrift clients, simple to add.

But we’re not done yet. We’ve got plenty of performance tuning work to do, as well as plenty of other improvements planned for the future. If you’re interested in Thrift development using Java, be sure to keep an eye out!

15.2. Netty at Twitter: Finagle

Jeff Smick, Software Engineer at Twitter

Finagle is Twitter’s fault-tolerant, protocol-agnostic RPC framework built atop Netty. All of the core services that make up Twitter’s architecture are built on Finagle, from back ends serving user information, tweets, and timelines to front-end API endpoints handling HTTP requests.

15.2.1. Twitter’s growing pains

Twitter was originally built as a monolithic Ruby on Rails application, semi-affectionately called The Monorail. As Twitter started to experience massive growth, the Ruby runtime and Rails framework started to become a bottleneck. From a compute standpoint, Ruby was relatively inefficient with resources. From a development standpoint, The Monorail was becoming difficult to maintain. Modifications to code in one area would opaquely affect another area. Ownership of different aspects of the code was unclear. Small changes unrelated to core business objects required a full deploy. Core business objects didn’t expose clear APIs, which increased the brittleness of internal structures and the likelihood of incidents.

We decided to split The Monorail into distinct services with clear owners and clear APIs allowing for faster iteration and easier maintenance. Each core business object would be maintained by a specific team and be served by its own service. There was precedent within the company for developing on the JVM—a few core services had already been moved out of The Monorail and had been rebuilt in Scala. Our operations teams had a background in JVM services and knew how to operationalize them. Given that, we decided to build all new services on the JVM using either Java or Scala. Most services decided on Scala as their JVM language of choice.

15.2.2. The birth of Finagle

In order to build out this new architecture, we needed a performant, fault-tolerant, protocol-agnostic, asynchronous RPC framework. Within a service-oriented architecture, services spend most of their time waiting for responses from other upstream services. Using an asynchronous library allows services to concurrently process requests and take full advantage of the hardware. Although Finagle could have been built directly on top of NIO, Netty had already solved many of the problems we would have encountered, and it provided a clean, clear API.

Twitter is built atop several open source protocols, primarily HTTP, Thrift, Memcached, MySQL, and Redis. Our network stack would need to be flexible enough that it could speak any of these protocols and extensible enough that we could easily add more. Netty isn’t tied to any particular protocol. Adding to it is as simple as creating the appropriate ChannelHandlers. This extensibility has led to many community-driven protocol implementations including SPDY,[6] PostgreSQL, WebSockets, IRC, and AWS.

6

Netty’s connection management and protocol agnosticism provided an excellent base from which Finagle could be built. But we had a few other, higher-level requirements that Netty couldn’t satisfy out of the box. Clients needed to connect and load-balance across a cluster of servers. All services needed to export metrics (request rates, latencies, and so on) that provide valuable data for debugging service behavior. And with a service-oriented architecture, a single request may go through dozens of services, making debugging performance issues nearly impossible without a Dapper-inspired tracing framework.[7] Finagle was built to solve these problems.

7

Info on Dapper can be found at http://research.google.com/pubs/pub36356.html. The tracing framework is Zipkin, found at https://github.com/twitter/zipkin.

15.2.3. How Finagle works

Internally Finagle is very modular. Components are written independently and then stacked together. Each component can be swapped in or out, depending on the provided configuration. For instance, tracers all implement the same interface, so a tracer can be created to send tracing data to a local file, hold it in memory and expose a read endpoint, or write it out to the network.

At the bottom of a Finagle stack is a Transport. This class is a representation of a stream of objects that can be asynchronously read from and written to. Transports are implemented as Netty ChannelHandlers and inserted into the end of a ChannelPipeline. Messages come in from the wire, where Netty picks them up, runs them through the ChannelPipeline where they’re interpreted by a codec, and then sends them to the Finagle Transport. From there Finagle reads the message off the Transport and sends it through its own stack.

For client connections, Finagle maintains a pool of transports across which it can load-balance. Depending on the semantics of the provided connection pool, Finagle will either request a new connection from Netty or reuse an existing one. When a new connection is requested, a Netty ChannelPipeline is created based on the client’s codec. Extra ChannelHandlers are added to the ChannelPipeline for stats, logging, and SSL. The connection is then handed to a channel transport that Finagle can write to and read from.

On the server side, a Netty server is created and then given a ChannelPipelineFactory that manages the codec, stats, timeouts, and logging. The last ChannelHandler in a server’s ChannelPipeline is a Finagle bridge. The bridge will watch for new incoming connections and create a new Transport for each one. The Transport wraps the new channel before it’s handed to a server implementation. Messages are then read out of the ChannelPipeline and sent to the implemented server instance.

Figure 15.6 shows the relationship between the Finagle client and server.

Figure 15.6. Netty use

Netty/Finagle bridge

This listing shows a static ChannelFactory with default options.

Listing 15.1. Setting up the ChannelFactory

This ChannelFactory bridges a Netty channel with a Finagle Transport (stats code has been removed here for brevity). When invoked via apply, this will create a new Channel and Transport. A Future is returned that is fulfilled when the Channel has either connected or failed to connect.

The next listing shows the ChannelConnector, which connects a Channel to a remote host.

Listing 15.2. Connecting to a remote host

This factory is provided a ChannelPipelineFactory, which is a channel factory and transport factory. The factory is invoked via the apply method. Once invoked, a new ChannelPipeline is created (newPipeline). That pipeline is used by the ChannelFactory to create a new Channel, which is then configured with the provided options (newConfiguredChannel). The configured channel is passed to a ChannelConnector as an anonymous factory. The connector is invoked and Future[Transport] is returned.

The following listing shows the details.[8]

8

Finagle source code is at https://github.com/twitter/finagle.

Listing 15.3. Netty3-based transport

Finagle servers use Listeners to bind themselves to a given address. In this case the listener is provided a ChannelPipelineFactory, a ChannelFactory, and various options (excluded here for brevity). Listener is invoked with an address to bind to and a Transport to communicate over. A Netty ServerBootstrap is created and configured. Then an anonymous ServerBridge factory is created and passed to a ChannelPipelineFactory, which is given to the bootstrapped server. Finally the server is bound to the given address.

Now let’s look at the Netty-based implementation of the Listener.

Listing 15.4. Netty-based Listener

When a new channel is opened, the bridge creates a new ChannelTransport and hands it back to the Finagle server. This listing shows the code needed.[9]

9

The complete source is at https://github.com/twitter/finagle.

Listing 15.5. Bridging Netty and Finagle

15.2.4. Finagle’s abstraction

Finagle’s core concept is a simple function (functional programming is the key here) from Request to Future of Response.

type Service[Req, Rep] = Req => Future[Rep]

This simplicity allows for very powerful composition. Service is a symmetric API representing both the client and the server. Servers implement the service interface. The server can be used concretely for testing, or Finagle can expose it on a network interface. Clients are provided an implemented service that’s either virtual or a concrete representation of a remote server.

For example, we can create a simple HTTP server by implementing a service that takes an HttpReq and returns a Future[HttpRep] representing an eventual response.

val s: Service[HttpReq, HttpRep] = new Service[HttpReq, HttpRep] {
    def apply(req: HttpReq): Future[HttpRep] =
        Future.value(HttpRep(Status.OK, req.body))
}
Http.serve(":80", s)

A client is then provided a symmetric representation of that service.

val client: Service[HttpReq, HttpRep] =  Http.newService("twitter.com:80")
val f: Future[HttpRep] = client(HttpReq("/"))
f map { rep => processResponse(rep) }

This example exposes the server on port 80 of all interfaces and consumes from twitter.com port 80.

We can also choose not to expose the server and instead use it directly.

server(HttpReq("/")) map { rep => processResponse(rep) }

Here the client code behaves the same way but doesn’t require a network connection. This makes testing clients and servers very simple and straightforward.

Clients and servers provide application-specific functionality. But there’s a need for application-agnostic functionality as well. Timeouts, authentication, and statistics are a few examples. Filters provide an abstraction for implementing application-agnostic functionality.

Filters receive a request and the service with which they are composed:

type Filter[Req, Rep] = (Req, Service[Req, Rep]) => Future[Rep]

Filters can be chained together before being applied to a service:

recordHandletime andThen
traceRequest andThen
collectJvmStats andThen
myService

This allows for clean abstractions of logic and good separation of concerns. Internally, Finagle heavily uses filters, which help to enhance modularity and reusability. They’ve proved valuable for testing as they can be unit-tested in isolation with minimal mocking.

Filters can modify both the data and type of requests and responses. Figure 15.7 shows a request making its way through a filter chain into a service and back out.

Figure 15.7. Request/response flow

We might use type modification for implementing authentication.

val auth: Filter[HttpReq, AuthHttpReq, HttpRes, HttpRes] =
    { (req, svc) => authReq(req) flatMap { authReq => svc(authReq) } }

val authedService: Service[AuthHttpReq, HttpRes] = ...
val service: Service[HttpReq, HttpRes] =
    auth andThen authedService

Here we have a service that requires an AuthHttpReq. To satisfy the requirement, a filter is created that can receive an HttpReq and authenticate it. The filter is then composed with the service yielding a new service that can take an HttpReq and produce an HttpRes. This allows us to test the authenticating filter in isolation from the service.

15.2.5. Failure management

We operate under the assumption of failure: hardware will fail, networks will become congested, network links will fail. Libraries capable of extremely high throughput and extremely low latency are meaningless if the systems they’re running on or communicating with fail. To that end, Finagle is set up to manage failures in a principled way. It trades some throughput and latency for better failure management.

Finagle can balance load across a cluster of hosts implicitly, using latency as a heuristic. A Finagle client locally tracks the load on every host it knows about by counting the number of outstanding requests dispatched to that host. Given that, Finagle will dispatch new requests to the hosts with the lowest load and, implicitly, the lowest latency.

Failed requests will cause Finagle to close the connection to the failing host and remove it from the load balancer. In the background, Finagle will continuously try to reconnect. The host will be re-added to the load balancer only after Finagle can reestablish a connection. Service owners are then free to shut down individual hosts without negatively impacting downstream clients.

15.2.6. Composing services

Finagle’s service-as-a-function philosophy allows for simple but expressive code. For example, a user making a request for their home timeline touches a number of services, the core of which are the authentication service, timeline service, and tweet service. These relationships can be expressed succinctly.

Listing 15.6. Composing services via Finagle

Here we create clients for the timeline service, tweet service, and authentication service. A filter is created for authenticating raw requests. Finally our service is implemented, combined with the auth filter, and exposed on port 80.

When a request is received, the auth filter will attempt to authenticate it. A failure will be returned immediately without ever affecting the core service. Upon successful authentication, the AuthReq will be sent to the API service. The service will use the attached userId to look up the user’s timeline via the timeline service. A list of tweet IDs is returned and then iterated over. Each ID is then used to request the associated tweet. Finally, the list of tweet requests is collected and converted into a JSON response.

As you can see, the flow of data is defined, and we leave the concurrency to Finagle. We don’t have to manage thread pools or worry about race conditions. The code is clear and safe.

15.2.7. The future: Netty

We’ve been working closely with the Netty maintainers to improve on parts of Netty from which both Finagle and the wider community can benefit.[10] Recently, the internal structure of Finagle has been updated to be more modular, paving the way for an upgrade to Netty 4.

10

“Netty 4 at Twitter: Reduced GC Overhead,” https://blog.twitter.com/2013/netty-4-at-twitter-reduced-gc-overhead.

15.2.8. Twitter summary

Finagle has yielded excellent results. We’ve managed to dramatically increase the amount of traffic we can serve while reducing latencies and hardware requirements. For instance, after moving our API endpoints from the Ruby stack onto Finagle, we saw latencies drop from hundreds of milliseconds to tens while reducing the number of machines required from triple to single digits. Our new stack has enabled us to reach new records in throughput. As of this writing, our record tweets per second is 143,199.[11] That number would have been unthinkable on our old architecture.

11

“New Tweets per second record, and how!” https://blog.twitter.com/2013/new-tweets-per-second-record-and-how.

Finagle was born out of a need to set Twitter up to scale out to billions of users across the entire globe at a time when keeping the service up for just a few million was a daunting task. Using Netty as a base, we were able to quickly design and build Finagle to manage our scaling challenges. Finagle and Netty handle every request Twitter sees.

15.3. Summary

This chapter provides insight into how large companies such as Facebook and Twitter build software using Netty to achieve the highest levels of performance and flexibility.

  • Facebook’s Nifty project shows how Netty was used to replace an existing Thrift implementation by providing custom protocol encoders and decoders.
  • Twitter’s Finagle shows how you can build your own high-performance framework on top of Netty and enhance it with features such as load-balancing and failover.

We hope the case studies presented here will serve as sources of information and also inspiration as you build your next-generation masterpiece.
